Application of Data Mining in Medical Decision Support System


Habib Shariff Mahmud
School of Engineering & Computing Sciences, University of East London - FTMS College, Technology Park Malaysia, Bukit Jalil, Kuala Lumpur, Malaysia

Mohamed Ismail Z
Senior Lecturer, School of Engineering & Computing Sciences, FTMS College, Technology Park Malaysia, Bukit Jalil, Kuala Lumpur, Malaysia

Abstract

Medical decision support systems (MDSS) are now used in many health care institutions across the globe. These institutions hold large amounts of medical data stored in different formats, and relevant knowledge may be hidden in that data. Data mining is used to extract such hidden knowledge. The main aim of this paper is to show how data mining methods can be applied in a medical decision support system, and to design a web-based expert system that can predict heart conditions using a neural network. The design of the system is based on the VA Medical Center, Long Beach database, obtained from the University of California, Irvine (UCI) machine learning repository. After analyzing several medical decision support systems in the relevant literature, three algorithms were identified: multilayer perceptron, decision tree and Naïve Bayes. These algorithms were tested under different configurations in order to find the best one on two medical datasets, and their performance was then compared using a set of performance metrics. The analysis was done using the Waikato Environment for Knowledge Analysis (WEKA) software on the two medical datasets, which are the diabetes and heart disease databases. The analysis showed that it is difficult to name one algorithm as more suitable than the others, because the neural network was found to be the best on the heart disease dataset while the decision tree was found to be more suitable on the diabetes dataset.
Keywords: Decision Tree, Medical Decision Support System, Machine Learning, Neural Network, Naïve Bayes.

1. Introduction

In the last few decades, medical disciplines have become increasingly data-intensive. Advances in digital technology have led to an unprecedented growth in the size, complexity, and quantity of collected data, which include medical reports and associated images. The National Electronic Decision Support Taskforce (2002) highlighted that modern health centers comprise not only doctors, patients and medical staff but also various processes, including the patient's treatment. Modern systems and techniques have also been introduced in health care institutions to facilitate their operations.

A huge amount of medical records is stored in databases and data warehouses. Such databases and applications differ from one another. The basic ones store only primary information about patients, such as name, age, address and blood type. The more advanced ones let the medical staff record patients' visits and store detailed information concerning their health condition. Some systems also facilitate patients' registration, units' finances and the scheduling of visits. The National Electronic Decision Support Taskforce (2002) explained that in recent years a new type of medical system has emerged, called the medical decision support system. It originates in business intelligence and is intended to support medical decisions. In their introduction, Turkoglu I., Arslan A., Ilkay E. (2006) explained that this situation is one of the reasons that call for closer collaboration between computer scientists and those in the medical field.

Data mining, according to Witten I. H., Frank E. (2005), is a research area that seeks methods to find knowledge in data. It is also called knowledge discovery. It makes use of different types of algorithms to analyze databases. Data mining is not a single technique; it deploys various machine learning algorithms and any other technique that can help to extract useful information from massive data.
Different algorithms serve different purposes, and each algorithm has its own advantages and disadvantages. The most commonly used methods for data mining are based on neural networks, decision trees, Apriori, regression, k-means, Bayesian networks, and so forth.

Newman D.J., et al. (2000) cite, among other reasons, that the UCI medical data repository is chosen in research because it allows others to conduct similar experiments and compare their results. This is one of the main reasons why the UCI repository databases were considered here. The UCI Repository of Machine Learning Databases and Domain Theories is a free Internet repository of analytical datasets from several areas. All datasets are provided as text files with a short description. For the analyses, two medical datasets were selected. They come from two different medical domains, which allows the performance of the algorithms to be evaluated under varied real medical conditions (attributes and features).

This research work aims to answer only the specific research questions stated below, because the medical field is very vast, with many new discoveries every day. There are many algorithms in data mining but, due to time constraints, only the three selected algorithms are tested on the two datasets. The algorithms are decision trees, multilayer perceptron and Naïve Bayes.

Objectives of the study

The objectives of the research work are as follows:

i. To show that data mining can be applied to medical databases to predict or classify the data with a reasonable degree of accuracy.
ii. To evaluate the performance of three selected data mining algorithms that are commonly implemented in medical decision support systems. The evaluation is performed on two medical datasets obtained from the UCI Repository.

Research questions

The research work aims to answer the following specific questions:
i. Can a web-based expert system be developed to predict heart-related ailments using a data mining technique?
ii. Which of the three data mining algorithms provides the most accurate result in the medical diagnosis of heart disease and diabetes?

Outline

The research work is outlined as follows. Section one introduces the research work and states its aims and objectives, the limitations of the study and the research questions. Section two reviews related work in the field of data mining and medical decision support systems. Section three is the methodology and design section; it explains the sources of data, the system architecture and how the design was carried out. Section four discusses the analysis of the experiment and the results obtained. Finally, the paper is summarized and concluded.

2. Literature Review

Application of data mining methods for medical decision support system

According to Cios et al. (2007), the aim of data mining is to make sense of large amounts of mostly unsupervised data in some domain. To make sense here means that the discovered knowledge should be understandable, novel and useful to the user. Probably the most important of these properties is that the new knowledge can be understood, so that the user can turn it to some advantage.
Data mining deals with large amounts of data, not with small quantities that can be supervised manually. For example, NASA's Earth Observing System generates tens of gigabytes per hour, and organizations such as Walmart, the NSA and many others generate terabytes or petabytes of data. The main aim of data mining, as suggested by Hardin J., Chhieng D. (2008), is to discover new patterns for the future. Discovering new patterns serves two purposes: prediction and description. Description aims at finding new patterns that the user can understand, while prediction identifies variables in the database that can be used to predict future events or the behavior of some entities. Because of the large volume of data generated in medical settings, it is imperative for medical organizations to use data mining techniques in order to improve the quality of health care in general.

Another application of data mining in medical decision support systems, as suggested by Durairaj M., Ranjani V. (2013), is in health care management, where data mining tools can be used

to identify and track chronic diseases. They can also be used to develop and design appropriate interventions that reduce the number of hospital admissions and to support health care management in general. Durairaj M., Ranjani V. (2013) also suggested that data mining tools can be used to detect a bioterrorist attack, by analyzing massive amounts of data in search of patterns that might suggest something is wrong.

Medical decision support system

The complexity of diagnosing is what makes diagnosis hard to understand. Symptoms are the primary input in medical diagnosis, as suggested by Witten I. H., Frank E. (2005). When these symptoms are processed, they produce an output that indicates whether a patient is sick or at risk of certain health-related issues. The process of medical diagnosis is given in Fig. 1.

Fig. 1 The process of medical diagnosis (symptoms, knowledge base, diagnoses)

After diagnosing the patient, the next step is for the physician to make a decision. The decision-making process was analyzed by Mora M., Forgionne G.A., Jupta J. (2012), who described it as continuous and cyclical, involving the following phases:
i. Intelligence
ii. Design
iii. Choice
iv. Implementation

According to Marakas G. M. (2002), a typical decision support system consists of five components:
i. the data management
ii. the model management
iii. the knowledge engine
iv. the user interface
v. the user(s)

3. Research Design and Methodology

Methodology

According to Creswell J. W. (2002), research can be categorized into three types: quantitative, qualitative or mixed. In view of this categorization, this research can be viewed as a qualitative one, because the analyses are based on the qualitative aspects of medical data mining.
This means that the performance of the data mining algorithms is the driver of the evaluation.

There are also other types of research: for instance, Dawson C. W. (2010) describes evaluation research, i.e. a study which involves evaluation. This research can therefore also be classified as such.

Sources of data

The data was collected from two sources. A questionnaire was used to collect the data for building the expert system, and data from the UCI repository was used for the analysis of the two datasets. Medical datasets from the UCI repository were chosen to allow other researchers to conduct similar experiments and compare their findings. The UCI repository is a free Internet repository of analytical datasets from different areas, and almost all of the datasets are provided as text files. The UCI datasets have gained recognition across the world and are regarded as a reliable and valuable source of data. In this research two medical datasets were selected and analyzed.

Data preprocessing

Data preprocessing involves removing duplicate records, normalizing the database and removing unwanted fields. The data is preprocessed in order to make the data mining more efficient. The preprocessed data is then clustered using the k-means clustering algorithm with k = 2. This produces two clusters: one that is relevant to heart disease and one that holds the remaining data. Frequent patterns are chosen when their significant weightage is greater than a predefined threshold, and the frequent patterns are then mined according to the relevant heart diseases. S. Oyyathevan and A. Askarunisa (2014) noted that it might be appropriate to combine data in order to minimize the number of datasets and to reduce the storage and processing time required by the data mining algorithms.
For the missing data, the substitution method was used: each missing value was replaced with the mean value computed from the same data. Erkki et al. (1998) suggested that this method is very accurate when compared with an artificial neural network based approach.

Determining the significant frequency pattern

Determining the significant frequency pattern can aid in designing a heart disease prediction system. According to S. Oyyathevan and A. Askarunisa (2014), the significant weightage has to be found before the significant frequency pattern is defined. It is computed from w_i, the weight of each attribute, and f_i, the frequency of each rule. The patterns whose significant weightage is greater than the predefined threshold are chosen to help in predicting heart diseases.
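The preprocessing and weighting steps described here can be illustrated with a minimal sketch. It assumes that a record is a list of numeric values with `None` marking a missing entry, and it assumes that the significant weightage is the sum of w_i * f_i over the attributes of a pattern. That form is consistent with the definitions of w_i and f_i but is not confirmed by the text. All function names and sample values are hypothetical, and the k-means clustering step is not included.

```python
def remove_duplicates(rows):
    """Drop repeated records while keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


def impute_mean(rows):
    """Replace each missing value (None) with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        known = [row[c] for row in rows if row[c] is not None]
        means.append(sum(known) / len(known) if known else 0.0)
    return [[means[c] if row[c] is None else row[c] for c in range(n_cols)]
            for row in rows]


def significant_weightage(weights, frequencies):
    """Assumed form: sum of w_i * f_i over the attributes of one pattern."""
    return sum(w * f for w, f in zip(weights, frequencies))


def select_significant(patterns, threshold):
    """Keep the patterns whose significant weightage exceeds the threshold."""
    return [name for name, (weights, freqs) in patterns.items()
            if significant_weightage(weights, freqs) > threshold]


# Hypothetical records: [age, cholesterol], one duplicate, one missing value.
records = [[63, 233.0], [63, 233.0], [41, None], [56, 250.0]]
clean = impute_mean(remove_duplicates(records))
```

In this toy run the duplicate record is dropped first, so the missing cholesterol value is filled with the mean of the two remaining known values rather than being biased by the repeated row.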

The significant frequent pattern (SFP) is then defined as the set of patterns whose significant weightage is greater than the predefined threshold.

The feed-forward network is determined based on the weight and frequency of each attribute and pattern respectively. Consider a cell as in Fig. 2, which may belong to an output layer or a hidden layer. Each input is given a weight. Suppose there are N+1 inputs, with the last input having a fixed value of 1. The inputs In_0, In_1, ..., In_(N-1) carry the weights W_0, W_1, ..., W_(N-1), and the constant last input carries the weight W_N. The weighted sum of the inputs determines the output:

$$\text{sum} = \sum_{i=0}^{N-1} W_i \, In_i + W_N, \qquad \text{output} = f(\text{sum})$$

Fig. 2 A single-cell neuron (weighted inputs, sum, f(sum), output)

Building the expert system

One of the aims of this research paper is to build a web-based expert system using a data mining technique, namely a neural network, that can aid in predicting heart disease for a given patient. As stated in the previous sections, data mining is part of machine learning, where rules are extracted from a given dataset in order to obtain more meaningful data. To implement the rules, the back propagation algorithm is adopted as shown below.

Back propagation algorithm

Input: D, a dataset; l, the learning rate; ntwk, a feed-forward network
Output: a trained neural network

begin
1. Initialize all weights and biases in ntwk
2. while the terminating condition is not satisfied {
3.   for each tuple t in D {
       // propagate the inputs forward
       for each input layer unit a { O_a = I_a }
4.     for each hidden or output layer unit a {
         I_a = sum over i of (W_ia * O_i) + theta_a   // net input of unit a with respect to the previous layer i
5.       O_a = 1 / (1 + e^(-I_a))   // output of each unit
       }
       // propagate the errors backward
       for each unit j in the output layer
         Err_j = O_j (1 - O_j)(t_j - O_j)   // compute the error
       for each unit j in the hidden layers
         Err_j = O_j (1 - O_j) * sum over k of (Err_k * W_jk)   // error with respect to the next higher layer k
       for each weight W_ij in the network
         W_ij = W_ij + l * Err_j * O_i   // increment weight
       for each bias theta_j in the network
         theta_j = theta_j + l * Err_j   // increment bias
     }
   }
end

Initial Input Screen

After a successful login, the user is taken to the next page, which guides the prediction of heart disease. The attributes are chest pain, blood pressure, maximum heart rate, blood sugar, cholesterol level and old peak. It is assumed that the user has knowledge about his or her condition, which is why he or she is using the system. Fig. 3 shows the initial input screen where the user is expected to enter the relevant information.

Fig. 3 Default Screen

After entering the values, the user submits them to the system and instantly receives a response as to whether he or she is at risk or not.

4. Results and Discussion

In this section the results of the experiment performed with the three data mining algorithms are presented. The three algorithms are decision tree, Naïve Bayes and multilayer perceptron. The algorithms were applied to the medical datasets on heart disease and diabetes, and the experiment was conducted with WEKA (Waikato Environment for Knowledge Analysis). The analyses proceed as follows. Several parameters are used to calibrate the algorithms. The parameters of all of the algorithms are used on each of the two datasets. The results are presented in the form of tables.

The Diabetes Database

The diabetes database consists of five attributes and 768 different cases. Of these, 66% are used for the training set while the remainder are used for the testing set, as it is good practice to have a larger training set than testing set. The decisional attribute takes a binary value of 0 or 1. The figure below shows the WEKA software with the dataset loaded.

Fig 3 The WEKA software showing the attributes
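The 66% percentage split described for the diabetes database can be sketched as follows. This is a minimal illustration, not WEKA's own implementation: the shuffling seed, the rounding of the training-set size and the function name `percentage_split` are assumptions, and the list of integers merely stands in for the 768 cases.

```python
import random


def percentage_split(rows, train_percent=66, seed=1):
    """Shuffle a copy of the rows, then cut it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Assumption: the training-set size is rounded to the nearest whole case.
    n_train = round(len(shuffled) * train_percent / 100)
    return shuffled[:n_train], shuffled[n_train:]


cases = list(range(768))  # placeholder for the 768 diabetes cases
train, test = percentage_split(cases)
print(len(train), len(test))
```

Under these assumptions the split yields 507 training cases and 261 test cases. A test set of 261 cases is consistent with the accuracies reported for the 66% split, for example 205 correct out of 261 gives 78.54%.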

The distribution of the attributes is shown above in figure 3, where all five attributes are presented. The five attributes are pedigree, mass, age, plasma and class, which is the decisional attribute. The outputs obtained from WEKA after running the experiments are displayed in figure 4.1 together with their accuracy in percent. A comparison is now made to find which of the algorithms gives the most accurate classification for the diabetic dataset.

Figure 4.1 Analysis of the algorithms on the diabetic dataset (accuracy in percent for the training set, 66% split and 50% split)

From figure 4.1 it can be seen that on the training set the decision tree algorithm has the highest classification accuracy with 82.03%, followed by Naïve Bayes with 77.60% and then multilayer perceptron with 76.82%. When the percentage split is 66%, however, Naïve Bayes outperforms the others with an accuracy of almost 80%, followed by multilayer perceptron with 78.54% and then decision tree with 75.86%. The graph also shows that the 66% split consistently performs better than the smaller split. Next, the confusion matrices are presented together with their graphical representation and an analysis of the algorithms run in WEKA with different parameter settings.

Fig. 4.2 Graphical view of the confusion matrix on the training set (panels: decision tree, Naïve Bayes, multilayer perceptron)

Legend:
A - represents the number of cases that tested negative
B - represents the number of cases that tested positive

From the graphical representation of the confusion matrix above, we can conclude that decision tree gives the best classification on the training set.

Fig. 4.3 Graphical view of the confusion matrix based on 66% split (panels: decision tree, Naïve Bayes, multilayer perceptron)

From the confusion matrix based on the 66% split it can be deduced that Naïve Bayes performs best, followed by multilayer perceptron and then decision tree.
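To make the reading of these confusion matrices concrete, the sketch below builds a two-class matrix from actual and predicted labels and derives the classification accuracy from it. The labels A and B follow the legend (negative and positive cases). The function names and the toy label lists are hypothetical and are not the paper's data.

```python
def confusion_matrix(actual, predicted, labels=("A", "B")):
    """Rows are the actual classes, columns are the predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of cases on the diagonal, i.e. correctly classified cases."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical toy labels, used only to show the mechanics.
actual = ["A", "A", "A", "B", "B", "B"]
predicted = ["A", "A", "B", "B", "B", "A"]
matrix = confusion_matrix(actual, predicted)
print(matrix, accuracy(matrix))
```

The off-diagonal cells are the incorrect predictions, so an algorithm with smaller off-diagonal counts on the same test set has the higher accuracy.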

Fig 4.4 Graphical view of the confusion matrix based on 50% split (panels: decision tree, Naïve Bayes, multilayer perceptron)

Finally, based on the experiments conducted with the settings provided, it can be seen that Naïve Bayes gives the better prediction, with 33 and 49 incorrect predictions, when the percentage split is 66%, followed by multilayer perceptron and then decision tree. When the percentage split is 50%, both Naïve Bayes and multilayer perceptron give a better classification than decision tree.

Findings from the Heart Disease Database

After running the experiment for the three algorithms with different parameters on the heart disease database, the findings are summarized in the following table.

Configuration    Decision Tree    Naïve Bayes    Multilayer Perceptron
Training set     87.46%           84.49%         95.38%
66% split        82.52%           84.47%         80.58%
50% split        74.83%           82.78%         78.80%

Table 4.1 Accuracy of the algorithms on the heart disease database

From Table 4.1 it can be seen that multilayer perceptron performed best on the training set with an accuracy of 95.38%, followed by decision tree and then Naïve Bayes. On the percentage splits of 66% and 50%, however, Naïve Bayes performed better than the other two. The high accuracy of multilayer perceptron in predicting heart disease motivated its use in developing the expert system in this research. The confusion matrix for each algorithm and the corresponding graphical presentation are summarized in the following tables and figures.
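The comparison drawn from Table 4.1 can be checked with a short sketch that ranks the algorithms for each configuration. The accuracy values are copied from Table 4.1; the dictionary layout and the function names are illustrative assumptions.

```python
# Accuracy in percent, copied from Table 4.1 (heart disease database).
ACCURACY = {
    "Training set": {"Decision Tree": 87.46, "Naive Bayes": 84.49,
                     "Multilayer Perceptron": 95.38},
    "66% split": {"Decision Tree": 82.52, "Naive Bayes": 84.47,
                  "Multilayer Perceptron": 80.58},
    "50% split": {"Decision Tree": 74.83, "Naive Bayes": 82.78,
                  "Multilayer Perceptron": 78.80},
}


def ranking(configuration):
    """Algorithms ordered from most to least accurate for one configuration."""
    scores = ACCURACY[configuration]
    return sorted(scores, key=scores.get, reverse=True)


def best_algorithm(configuration):
    """The single most accurate algorithm for one configuration."""
    return ranking(configuration)[0]


for configuration in ACCURACY:
    print(configuration, "->", ranking(configuration))
```

The ranking changes with the configuration: multilayer perceptron leads on the training set, while Naïve Bayes leads on both percentage splits.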

Table 4.2 Confusion matrix based on the training set (decision tree, Naïve Bayes, multilayer perceptron; classes A and B)

From the analysis above it can be seen that the multilayer perceptron has the most accurate result, followed by Naïve Bayes and then the decision tree.

Table 4.3 Confusion matrix based on 50% split (decision tree, Naïve Bayes, multilayer perceptron; classes A and B)

With the 50% split configuration, Naïve Bayes has the most accurate classification, followed by the multilayer perceptron and then the decision tree.

Table 4.4 Confusion matrix based on 66% split (decision tree, Naïve Bayes, multilayer perceptron; classes A and B)

With the 66% split configuration, Naïve Bayes still outperforms the rest, followed by multilayer perceptron and then the decision tree.

5. Conclusion

The conclusions that can be drawn from developing the expert system using neural networks are as follows:

(1) A heart disease dataset from the UCI machine learning repository, obtained from previous research, was used. The dataset consists of 303 patients with varying symptoms. The data was preprocessed and then clustered using the k-means clustering algorithm with k = 2. A questionnaire was also designed and administered online in order to develop the web-based expert system that can predict or classify heart-related risk.

(2) Developing the expert system using JavaServer Pages (JSP) consisted of several steps, among which are feasibility studies, design, and knowledge acquisition and representation. The result is a web-based expert system to predict heart disease. The system has been tested using validation and prototyping.

(3) The implementation of the system is an application that presents the user with a set of attributes to fill in and gives an instant response as to whether the user is at risk or not. The application has been tested by a doctor, who recommended that some improvements be made in future work.
The experiments conducted on the two datasets using different configurations produced some very interesting results. It is very difficult to say which configuration is the best, although, because of time constraints, only very few configurations were used. It was found, however, that the 50% split was the worst-performing configuration on both datasets used in the experiments. The analysis also showed that, across the two datasets (the diabetes and heart disease databases), Naïve Bayes performed best, followed by multilayer perceptron and then decision tree, whereas on the heart disease dataset the multilayer perceptron had the best performance, followed by Naïve Bayes and then decision tree. It can therefore be concluded that, because of the nature and complexity of medical data, it is very difficult to say which method has the best overall performance on medical datasets; different methods work better on different specific datasets. The results show that data mining algorithms are applicable to medical datasets, but care should be taken in choosing the algorithm for a particular dataset, and the results should not be generalized.

References

[1] Chae Y. M., Kim H. S., Tark K. C., Park H. J., Ho S. H. (2003) Analysis of healthcare quality indicator using data mining and decision support system. Expert Systems with Applications.
[2] Creswell J. W. (2012) Research Design: Qualitative, Quantitative and Mixed Method Approaches. 2nd edn. Sage Publications, Thousand Oaks, CA.
[3] Dawson C. W. (2010) The Essence of Computing Projects: A Student's Guide. Prentice Hall, Harlow, UK.
[4] Detmer W., Barnett G., Hersh W. (2002) MedWeaver: Integrating Decision Support, Literature Searching and Web Exploration using the UMLS Metathesaurus.
[5] Durairaj M., Ranjani V. (2013) Data Mining Applications in Healthcare Sector: A Study, 2(10).
[6] DXp HST Lec 05.pdf (2009) Computer Science and Artificial Intelligence Laboratory (CSAIL), MIT.
[7] Electronic Decision Support for Australia's Health Sector, National Electronic Decision Support Taskforce.
[8] (retrieved on )
[9] Hardin J., Chhieng D. (2008) support system pp
[10] Mitchell T. M. (2007) Machine Learning, Redmond, McGraw-Hill.
[11] Mora M., Forgionne G. A., Jupta J. (2012) Decision Making Support Systems: Achievements, Trends and Challenges for the Next Decade. Idea Group, Hershey, PA.
[12] Newman D. J., Hettich S., Blake C. L., Merz C. J., UCI Repository of machine learning databases. Irvine, CA: University of California, Department of Information and Computer Science (retrieved on )
[13] Nong Y. (2003) The Handbook of Data Mining. Lawrence Erlbaum Associates.
[14] Oyyathevan S., Askarunisa A. (2014) An Expert System for Heart Disease Prediction Using Data Mining Technique, Research Paper, 1(4), pp. 1-6.
[15] Shu-Mei W., Yu C., Yu-Mei, Cheng-Fang Y., Hui-Lian C. (2005) Decision-making tree for women considering hysterectomy, Journal of Advanced Nursing, Blackwell Publishing, pp
[16] Tang Z., MacLennan J. (2005) Data Mining with SQL Server. Indianapolis, Indiana, USA, Wiley Publishing Inc.
[17] Teach R., Shortliffe E. (2001) An analysis of physician attitudes regarding computer-based clinical consultation systems. Computers and Biomedical Research, 14, pp
[18] Turkoglu I., Arslan A., Ilkay E. (2006) An expert system for diagnosis of the heart valve diseases. Expert Systems with Applications, 23(3), pp
[19] WHO, Fact sheet No. 297: Cancer, 2006, Retrieved on
[20] Witten I. H., Frank E. (2005) Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Elsevier.

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network General Article International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347-5161 2014 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Impelling

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Evaluation of selected data mining algorithms implemented in Medical Decision Support Systems Kamila Aftarczuk

Evaluation of selected data mining algorithms implemented in Medical Decision Support Systems Kamila Aftarczuk Master Thesis Software Engineering Thesis no: MSE-2007-21 September 2007 Evaluation of selected data mining algorithms implemented in Medical Decision Support Systems Kamila Aftarczuk This thesis is submitted

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR Bharti S. Takey 1, Ankita N. Nandurkar 2,Ashwini A. Khobragade 3,Pooja G. Jaiswal 4,Swapnil R.

More information

REVIEW OF HEART DISEASE PREDICTION SYSTEM USING DATA MINING AND HYBRID INTELLIGENT TECHNIQUES

REVIEW OF HEART DISEASE PREDICTION SYSTEM USING DATA MINING AND HYBRID INTELLIGENT TECHNIQUES REVIEW OF HEART DISEASE PREDICTION SYSTEM USING DATA MINING AND HYBRID INTELLIGENT TECHNIQUES R. Chitra 1 and V. Seenivasagam 2 1 Department of Computer Science and Engineering, Noorul Islam Centre for

More information
