EFFECT OF DISCRETIZATION METHOD ON THE DIAGNOSIS OF PARKINSON'S DISEASE
International Journal of Innovative Computing, Information and Control, ICIC International © 2011, Volume 7, Number 8, August 2011

Ersin Kaya, Oğuz Findik, İsmail Babaoğlu and Ahmet Arslan
Department of Computer Engineering, Faculty of Engineering and Architecture, Selçuk University, Selçuklu, Konya 42075, Turkey
{ ersinkaya; oguzf; ibabaoglu; ahmetarslan }@selcuk.edu.tr

Received April 2010; revised January 2011

Abstract. This study analyzes the effect of discretization on the diagnosis of Parkinson's disease using several classification methods. Entropy-based discretization is used as the discretization method, and support vector machines, C4.5, k-nearest neighbors and Naïve Bayes are used as the classification methods. First, the diagnosis of Parkinson's disease is performed without any preprocessing. Afterwards, the Parkinson's disease dataset is classified again after entropy-based discretization has been applied to it. The two sets of results are compared, and it is observed that discretization increases classification success on the diagnosis of Parkinson's disease by 4.1% to 12.8%.

Keywords: Parkinson's disease, Entropy-based discretization method, Classification methods

1. Introduction. Parkinson's disease is a nervous system disorder that arises mostly in men in their 50s. The disease was first described by James Parkinson, after whom it is named [1]. Symptoms such as poverty of movement, slowness of movement, rigidity and rest tremor are commonly observed in patients with Parkinson's disease [2]. At present, no cure for Parkinson's disease is available. However, if the disease is diagnosed early, drug treatments that mitigate the symptoms can be administered in clinical settings [3]. Research into this disease shows that voice distortion occurs in 90% of patients with Parkinson's disease [4,5].
Much research has used voice disorders for the diagnosis of Parkinson's disease [6]. Little et al. used linear discriminant analysis (LDA) to identify the characteristics of voice data to be used in diagnosing the disease. For the diagnosis of Parkinson's disease, they built a model from the selected features with a support vector machine (SVM) classifier [7]. Preprocessing the data before classification increases classification performance [8,9]. Discretization is an important type of preprocessing in data mining: it transforms the continuous-valued features in a dataset into discrete values. Research shows that discretizing continuous-valued features increases classification performance. Polat et al., studying the diagnosis of optic nerve disease, showed that discretization increases classification performance when used with traditional methods such as artificial neural networks (ANN), least squares support vector machines and C4.5 [9]. Abraham et al., working with 28 publicly available medical datasets, pointed out the effect of discretization on the success of Naïve Bayes classification [10]. Demsar et al. created a predictive model on data consisting of 69 examples and 174 features belonging to trauma patients. They used decision tree and Naïve Bayes
classification methods in this model. The positive effect of discretization methods on classification success was shown in this study [11]. Acid et al. introduced a model that evaluates the performance of the emergency service of a Spanish hospital by using a Bayesian network. In their study, some continuous-valued features were transformed into interval-valued features [12].

This study uses the Parkinson's disease dataset obtained from the University of California Irvine (UCI) machine learning repository. The continuous-valued features in the data are transformed into interval-valued features by an entropy-based discretization method. The original data and the discretized data are classified by using the Naïve Bayes, C4.5, k-nearest neighbor (k-nn) and SVM classifiers. The results are compared with each other, and the effect of discretization on classification accuracy is shown.

2. Materials and Methods.

The Parkinson dataset. In this study, the dataset obtained from the UCI machine learning repository is used. This dataset comprises 32 people of both sexes, a subset of whom are Parkinson's patients. Seven biomedical voice measurements were obtained from each of S21, S27 and S35, and six biomedical voice measurements from each of the others. The dataset is composed of 195 measurements and 22 features. A detailed analysis of the dataset is shown in Table 1.

Discretization. Discretization is an important pre-processing method in data analysis. Discretization methods transform continuous-valued features into interval-valued features. Because the data are transformed into a more compact and meaningful representation, classification performance can improve. There are many discretization methods in the literature, such as entropy-based, equal frequency and equal width discretization [13,14]. The common steps of discretization methods are shown in Figure 1 and can be summarized as follows.
First, the values of the continuous-valued feature in the dataset are sorted. Then, the candidate cut points are determined for this feature. The fitness value of each candidate cut point is computed, and the values of the feature are split at the candidate cut point with the best fitness value. These steps are applied recursively until the stopping criterion is met. A discretization method is characterized by how it determines the candidate cut points, how it computes their fitness values, and its stopping criterion.

Entropy-based discretization. Entropy-based discretization is a commonly used discretization method proposed by Fayyad and Irani [15]. In this method, candidate cut points are determined for the continuous-valued feature, and the cut point is selected according to the entropy of the candidates. The entropies of the candidate cut points are defined by the following expressions:

$$ E(A, T; S) = \frac{|S_1|}{|S|}\,\mathrm{Ent}(S_1) + \frac{|S_2|}{|S|}\,\mathrm{Ent}(S_2) \qquad (1) $$

$$ \mathrm{Ent}(S) = -\sum_{i=1}^{Z} p(C_i, S)\,\log_2\big(p(C_i, S)\big) \qquad (2) $$

where $A$ is the feature to be discretized, $T$ is the candidate cut point, $S$ is the set of samples, $S_1$ and $S_2$ are the subsets of samples to the left and right of the cut point, respectively, $Z$ is the number of classes in the dataset, $C_i$ is the decision value of the $i$th class, and $p(C_i, S)$ is the proportion of samples in $S$ that belong to class $C_i$.
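The recursive procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`entropy`, `best_cut`, `mdl_accepts`, `discretize`) are hypothetical, candidate cut points are assumed to be the midpoints between consecutive distinct sorted values, and the stopping rule is the minimum-description-length test of Fayyad and Irani [15], which this paper states as Equations (3)-(5).

```python
import math
from collections import Counter


def entropy(labels):
    """Ent(S): class entropy of a list of labels, in bits (Equation (2))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(pairs):
    """Return (split index, cut point, E) minimising E(A, T; S) of Equation (1).

    `pairs` is a list of (value, label) tuples sorted by value.
    Assumption: candidates are midpoints between distinct neighbouring values.
    """
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal values cannot be separated by a cut point
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        e = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        if best is None or e < best[2]:
            best = (i, (pairs[i - 1][0] + pairs[i][0]) / 2, e)
    return best


def mdl_accepts(labels, left, right, e_split):
    """Acceptance test of Equations (3)-(5): keep the cut only if the gain is large enough."""
    n = len(labels)
    z, z1, z2 = len(set(labels)), len(set(left)), len(set(right))
    gain = entropy(labels) - e_split
    delta = math.log2(3 ** z - 2) - (
        z * entropy(labels) - z1 * entropy(left) - z2 * entropy(right)
    )
    return gain > (math.log2(n - 1) + delta) / n


def discretize(values, labels):
    """Return the sorted list of accepted cut points for one continuous feature."""
    pairs = sorted(zip(values, labels))
    found = best_cut(pairs)
    if found is None:  # fewer than two distinct values: nothing to split
        return []
    i, cut, e = found
    left = [y for _, y in pairs[:i]]
    right = [y for _, y in pairs[i:]]
    if not mdl_accepts([y for _, y in pairs], left, right, e):
        return []  # stopping criterion reached for this part
    lower = discretize([x for x, _ in pairs[:i]], left)
    upper = discretize([x for x, _ in pairs[i:]], right)
    return lower + [cut] + upper
```

For a toy feature whose low values all belong to one class and whose high values all belong to the other, for example `discretize([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], [0] * 5 + [1] * 5)`, the sketch yields the single cut point 8.0; when the labels alternate along the sorted values, no cut passes the acceptance test and the feature is left as one interval.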
Table 1. Detailed analysis of the dataset
Columns: Features, Max, Min, Median, Mean, SD
Features: MDVP:Fo (Hz), MDVP:Fhi (Hz), MDVP:Flo (Hz), MDVP:Jitter (%), MDVP:Jitter (Abs), MDVP:RAP, MDVP:PPQ, Jitter:DDP, MDVP:Shimmer, MDVP:Shimmer (dB), Shimmer:APQ3, Shimmer:APQ5, MDVP:APQ, Shimmer:DDA, NHR, HNR, RPDE, DFA, spread1, spread2, D2, PPE
Features, names of the features obtained from biomedical voice measurements; Max, maximum value of the features; Min, minimum value of the features; Median, median value of the features; Mean, mean value of the features; SD, standard deviation of the features; NoC, number of the cut points obtained after discretization; CP, value of the cut points obtained after discretization.

After the cut point with the minimum entropy is selected, the values of the continuous-valued feature are split into two parts. This procedure is then repeated for each part until the stopping criterion is reached. In the entropy-based discretization method, a cut point is accepted only if the following condition holds; otherwise, partitioning of that part stops:

$$ \mathrm{Gain}(A, T; S) > \frac{\log_2(N - 1)}{N} + \frac{\Delta(A, T; S)}{N} \qquad (3) $$

$$ \mathrm{Gain}(A, T; S) = \mathrm{Ent}(S) - E(A, T; S) \qquad (4) $$

$$ \Delta(A, T; S) = \log_2\!\left(3^{Z} - 2\right) - \left[ Z\,\mathrm{Ent}(S) - Z_1\,\mathrm{Ent}(S_1) - Z_2\,\mathrm{Ent}(S_2) \right] \qquad (5) $$

where $A$ is the feature to be discretized, $T$ is the candidate cut point, $S$ is the set of samples, $S_1$ and $S_2$ are the subsets of samples to the left and right of the cut point, respectively, $N$ is the number of samples in $S$, $Z$ is the number of classes in the dataset, and $Z_1$ and $Z_2$ are the numbers of classes present in $S_1$ and $S_2$, respectively.

Naïve Bayes classifier. Naïve Bayes is a probabilistic classification method [16]. For a new sample, a score $v_{NB}$ is calculated for each class in the training data. The new sample is assigned to the class for which this score is highest.
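As a minimal sketch of this decision rule for discretized (categorical) features, the following Python code multiplies the class prior by the per-feature conditional probabilities and returns the class with the highest score. The function names `train_nb` and `predict_nb` and the `n_values` argument are hypothetical, and the add-one (Laplace) smoothing is an assumption added here to avoid zero probabilities; the paper does not say whether smoothing was used.

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and, per class, how often each feature value occurs."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> Counter of values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def predict_nb(model, row, n_values):
    """Return the class with the highest p(v_j) * prod_i p(a_i | v_j).

    n_values[i] is the number of distinct intervals of feature i; it is only
    needed for the add-one smoothing (an assumption, not from the paper).
    """
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior p(v_j)
        for i, value in enumerate(row):
            seen = value_counts[(i, label)][value]
            score *= (seen + 1) / (count + n_values[i])  # smoothed p(a_i | v_j)
        if score > best_score:
            best_class, best_score = label, score
    return best_class


# Toy data: two discretized features; the first one separates the classes.
rows = [(0, 0), (0, 1), (0, 0), (1, 1), (1, 0), (1, 1)]
labels = ["healthy", "healthy", "healthy", "patient", "patient", "patient"]
model = train_nb(rows, labels)
```

With this toy model, `predict_nb(model, (0, 0), [2, 2])` gives "healthy" and `predict_nb(model, (1, 1), [2, 2])` gives "patient". For datasets with many features, summing log-probabilities instead of multiplying probabilities would avoid numerical underflow.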
Figure 1. General steps of discretization method [17].

$v_{NB}$ is defined by the following expression:

$$ v_{NB} = \arg\max_{v_j} \; p(v_j) \prod_{i} p(a_i \mid v_j) \qquad (6) $$

where $j$ indexes the classes in the dataset, $i$ indexes the condition features in the dataset, $a_i$ is the value of the $i$th feature, and $v_j$ is the class value of the $j$th class.

C4.5 decision tree classifier. A decision tree is a simple classification method. Decision trees are composed of nodes, branches and leaves, which correspond to the features, the values of the features and the values of the decision feature, respectively. Each path from the root node to a leaf denotes a rule of the form "if condition1 and condition2 and ... then decision". Nodes and branches correspond to the condition terms of the rule, and leaves correspond to its decision term. In this study, the C4.5 method is used to create the decision tree. In this method, the feature with the maximum gain is chosen as the root node. The gains belonging to
the subsets of branches of the root node are recalculated. The nodes having the maximum gain within each subset are chosen as sub-nodes [18,19]. The creation of the tree continues until each branch denotes a class. Gain is defined by the following expressions:

$$ \mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v) \qquad (7) $$

$$ \mathrm{Entropy}(S) = -\sum_{i=1}^{Z} p(C_i, S)\,\log_2\big(p(C_i, S)\big) \qquad (8) $$

where $S$ is the set of samples, $A$ is the feature whose gain is calculated, $S_v$ is the set of samples in which feature $A$ takes the value $v$, $Z$ is the number of classes in the dataset, and $p(C_i, S)$ is the proportion of samples in $S$ that belong to class $C_i$.

k-nearest neighbor classifier. k-nn is a supervised learning algorithm. The neighborhood parameter k is set in the initialization stage of k-nn. The k samples closest to the new sample are found among the training data. The class of the new sample is determined from these k closest samples by majority voting [20]. Distance measures such as Euclidean, Hamming and Manhattan are used to calculate the distances between samples.

Support vector machine classifier. SVM, which is based on statistical learning theory, is one of the most commonly used classification techniques. The technique was first proposed by Vapnik [21]. In its basic linear form, SVM separates two classes from each other optimally: it seeks the separating hyperplane that maximizes the margin between the classes. As a learning method, SVM is often used to train and design radial basis function (RBF) networks, and it is generally more successful than comparable artificial neural networks. The formulations and the detailed concept of this commonly used classifier can be found in [22-28].

3. Experimental Results.
Using different classification methods, the researchers analyzed the effect of discretization on the diagnosis of Parkinson's disease. The Parkinson dataset used in the study is available online in the UCI machine learning repository. Entropy-based discretization is used as the discretization method. It was selected because it is a supervised discretization method, that is, it uses the class information when choosing cut points. The dataset and its discretized form are classified with the Naïve Bayes, C4.5, k-nn and SVM classification methods, and the two sets of classification results are compared. To make the results more consistent, k-fold cross validation is used: each classification is carried out with 5-fold cross validation in this study. The dataset is classified in both discretized and non-discretized forms using RBF, linear and polynomial kernels with the SVM classifier. The search ranges for the SVM parameters c and σ are [0.1, 30000] and [0.001, 10], respectively. The RBF kernel is determined to be the optimum kernel for SVM. The parameters of the optimum RBF kernel are G and c, with c = 2. The k parameter is taken as 5 in k-nn. Euclidean distance is used as the distance measure between samples in k-nn and is given as follows:

$$ D(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (9) $$
where $n$ is the number of features in the dataset, and $x$ and $y$ are samples in the dataset. Classification accuracy, sensitivity, specificity and area under the ROC curve (AUC) are used to compare the results. The measurements are as follows:

$$ CA = \frac{TP + TN}{TP + TN + FP + FN} \qquad (10) $$

$$ SEN = \frac{TP}{TP + FN} \qquad (11) $$

$$ SPE = \frac{TN}{TN + FP} \qquad (12) $$

$$ AUC = \text{Area under the ROC curve} \qquad (13) $$

where $CA$, $SEN$ and $SPE$ denote classification accuracy, sensitivity and specificity, respectively. $TP$ is the number of healthy predictions among healthy samples. $TN$ is the number of patient predictions among patient samples. $FP$ is the number of healthy predictions among patient samples. $FN$ is the number of patient predictions among healthy samples.

The twenty-two continuous-valued features in the Parkinson's disease dataset are discretized by using the entropy-based discretization method. The numbers and values of the cut points of the features are given in Table 2.

Table 2. Cut-points of features
Columns: No, Features, NoC, CP
1 MDVP:Fo (Hz)
2 MDVP:Fhi (Hz)
3 MDVP:Flo (Hz)
4 MDVP:Jitter (%)
5 MDVP:Jitter (Abs)
6 MDVP:RAP
7 MDVP:PPQ
8 Jitter:DDP
9 MDVP:Shimmer
10 MDVP:Shimmer (dB)
11 Shimmer:APQ3
12 Shimmer:APQ5
13 MDVP:APQ
14 Shimmer:DDA
15 NHR
16 HNR
17 RPDE
18 DFA
19 spread1
20 spread2
21 D2
22 PPE
Features, names of the features obtained from biomedical voice measurements; NoC, the number of the cut points obtained after discretization; CP, the value of the cut points obtained after discretization.

The classification accuracy, sensitivity, specificity and AUC obtained from the discretized and non-discretized forms of the dataset using the Naïve Bayes, C4.5, k-nn and SVM classifiers are given in Table 3. With entropy-based discretization, the classification accuracies of the Naïve Bayes, C4.5, k-nn and SVM classifiers increased by 8.2%, 4.1%, 9.2% and 12.8%, and their AUC values increased by 0.94%, 7.24%, 8.42% and 8.82%, respectively.
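The accuracy, sensitivity and specificity measures of Equations (10)-(12) can be computed from the four confusion counts, as in the short Python sketch below. The function name `confusion_metrics` and the toy labels are hypothetical; the `positive` argument names the class treated as positive (the healthy class in this paper's convention).

```python
def confusion_metrics(y_true, y_pred, positive):
    """Return CA, SEN and SPE for binary labels, treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Note: SEN or SPE is undefined (division by zero) if a class is absent from y_true.
    return {
        "CA": (tp + tn) / (tp + tn + fp + fn),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }


# Toy example: 1 = healthy (positive class), 0 = patient.
metrics = confusion_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1], positive=1)
```

In this toy example two of the three healthy samples and one of the two patient samples are predicted correctly, so CA is 3/5, SEN is 2/3 and SPE is 1/2. In a 5-fold cross validation such as the one used here, the counts would typically be accumulated over the five held-out folds before the ratios are taken.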
Table 3. Classification results
Columns: CA (%), Sen, Spe, AUC
Rows: Naïve Bayes (non-discretized, discretized); C4.5 (non-discretized, discretized); k-nn (non-discretized, discretized); SVM (non-discretized, discretized)
CA, Sen, Spe and AUC denote classification accuracy, sensitivity, specificity and area under the ROC curve, respectively.

The ROC curves for the healthy and unhealthy samples obtained using Naïve Bayes, C4.5, k-nn and SVM are shown in Figures 2-5. As the ROC curves show, classification accuracy increased after discretization of the dataset. The results also indicate that discretization is very promising for the diagnosis of Parkinson's disease. The best model for diagnosing Parkinson's disease was SVM with the discretized dataset. Consequently, discretization can be used as a pre-processing step on medical datasets. With discretization, the diagnosis of diseases can be performed more accurately.

4. Conclusion. In this study, the Parkinson's disease dataset obtained from the UCI machine learning repository is used. The Naïve Bayes, C4.5, k-nn and SVM classifiers are used to classify the dataset. The dataset is classified with both discretized and non-discretized features in order to show the effectiveness of discretization on the diagnosis of Parkinson's disease. The results show that discretization increases the classification accuracy of the diagnosis of Parkinson's disease.

Figure 2. ROC curves obtained by classifying the discretized and non-discretized datasets using Naïve Bayes: (a) healthy class; (b) unhealthy class.
Figure 3. ROC curves obtained by classifying the discretized and non-discretized datasets using C4.5: (a) healthy class; (b) unhealthy class.

Figure 4. ROC curves obtained by classifying the discretized and non-discretized datasets using k-nn: (a) healthy class; (b) unhealthy class.

Figure 5. ROC curves obtained by classifying the discretized and non-discretized datasets using SVM: (a) healthy class; (b) unhealthy class.
REFERENCES

[1] A. E. Lang and A. M. Lozano, Parkinson's disease. First of two parts, The New England Journal of Medicine, vol.339.
[2] N. Singh, V. Pillay and Y. E. Choonara, Advances in the treatment of Parkinson's disease, Progr. Neurobiol, vol.81, pp.29-44.
[3] National Collaborating Centre for Chronic Conditions, Parkinson's disease: National clinical guideline for diagnosis and management in primary and secondary care, Royal College of Physicians.
[4] A. K. Ho, R. Iansek, C. Marigliani, J. L. Bradshaw and S. Gates, Speech impairment in a large sample of patients with Parkinson's disease, Behavioural Neurology, vol.11.
[5] J. A. Logemann, H. B. Fisher, B. Boshes and E. R. Blonsky, Frequency and co-occurrence of vocal tract dysfunctions in speech of a large sample of Parkinson patients, Journal of Speech and Hearing Disorders, vol.43, pp.47-57.
[6] M. A. Little, P. E. McSharry, S. J. Roberts, D. A. Costello and I. M. Moroz, Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection, Biomedical Engineering Online, vol.6, pp.23-58.
[7] M. A. Little, E. McSharry, E. J. Hunter, J. Spielman and L. O. Ramig, Suitability of dysphonia measurements for telemonitoring of Parkinson's disease, IEEE Transactions on Biomedical Engineering, vol.56.
[8] A. Kumar and D. Zhang, Hand-geometry recognition using entropy-based discretization, IEEE Transactions on Information Forensics and Security, vol.2.
[9] K. Polat, S. Kara, A. Güven and S. Güneş, Utilization of discretization method on the diagnosis of optic nerve disease, Computer Methods and Programs in Biomedicine, vol.91.
[10] R. Abraham, J. Simha and S. Iyengar, A comparative analysis of discretization methods for medical datamining with Naïve Bayesian classifier, The 9th International Conference on Information Technology.
[11] J. Demsar, B. Zupan, N. Aoki, M. J. Wall, T. H. Granchi and J. R. Beck, Feature mining and predictive model construction from severe trauma patient's data, International Journal of Medical Informatics, vol.63, pp.41-50.
[12] S. Acid, L. M. Campos, J. M. Fernandez-Luna, S. Rodriguez, J. M. Rodriguez and J. L. Salcedo, A comparison of learning algorithms for Bayesian networks: A case study based on data from an emergency medical service, Artificial Intelligence in Medicine, vol.30.
[13] M. K. Ismail and V. Ciesielski, An empirical investigation of the impact of discretization on common data distributions, Design and Application of Hybrid Intelligent Systems.
[14] H. Kodaz, S. Özşen, A. Arslan and S. Güneş, Medical application of information gain based artificial immune recognition system (AIRS): Diagnosis of thyroid disease, Expert Systems with Applications, vol.36, no.2.
[15] U. M. Fayyad and K. B. Irani, Multi-interval discretization of continuous-valued attributes for classification learning, The 13th International Joint Conference on Artificial Intelligence.
[16] H. Kim and S. Chen, Associative Naïve Bayes classifier: Automated linking of gene ontology to medline documents, Pattern Recognition, vol.42.
[17] C. Hsu, H. Huang and T. Wong, On why discretization works for Naïve Bayesian, Lecture Notes in Computer Science.
[18] M. Hill and M. T. Mitchell, Machine Learning, Singapore.
[19] J. R. Quinlan, Induction of C4.5 decision trees, Machine Learning, vol.1.
[20] G. Shakhnarovich, T. Darrell and P. Indyk, Nearest-Neighbor Methods in Learning and Vision, MIT Press.
[21] V. Vapnik, The Nature of Statistical Learning Theory, Springer, New York.
[22] K. Y. Chen and C. H. Wang, A hybrid SARIMA and support vector machines in forecasting the production values of the machinery industry in Taiwan, Expert Systems with Applications, vol.32.
[23] E. Çomak, A. Arslan and İ. Türkoğlu, A decision support system based on support vector machines for diagnosis of the heart valve diseases, Computers in Biology and Medicine, vol.37, pp.21-27.
[24] K. Takeuchi and N. Collier, Bio-medical entity extraction using support vector machines, Artificial Intelligence in Medicine, vol.33, no.2, 2003.
[25] J. Chen and F. Pan, A new online support vector machine algorithm, ICIC Express Letters, vol.4, no.1.
[26] Z. Chen, W. Hong and C. Wang, RNA secondary structure prediction with plane pseudoknots based on support vector machine, ICIC Express Letters, vol.3, no.4(b).
[27] B. R. Chang and H. F. Tsai, Training support vector regression by quantum-neuron-based hopfield neural net with nested local adiabatic evolution, International Journal of Innovative Computing, Information and Control, vol.5, no.4.
[28] N. Begum, M. A. Fattah and F. Ren, Automatic text summarization using support vector machine, International Journal of Innovative Computing, Information and Control, vol.5, no.7, 2009.
www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,
More informationComparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
More informationClassifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang
Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental
More informationDATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
More informationEmail Spam Detection A Machine Learning Approach
Email Spam Detection A Machine Learning Approach Ge Song, Lauren Steimle ABSTRACT Machine learning is a branch of artificial intelligence concerned with the creation and study of systems that can learn
More informationHealthcare Data Mining: Prediction Inpatient Length of Stay
3rd International IEEE Conference Intelligent Systems, September 2006 Healthcare Data Mining: Prediction Inpatient Length of Peng Liu, Lei Lei, Junjie Yin, Wei Zhang, Wu Naijun, Elia El-Darzi 1 Abstract
More informationMachine Learning in Spam Filtering
Machine Learning in Spam Filtering A Crash Course in ML Konstantin Tretyakov kt@ut.ee Institute of Computer Science, University of Tartu Overview Spam is Evil ML for Spam Filtering: General Idea, Problems.
More informationData quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
More informationData Mining Analysis (breast-cancer data)
Data Mining Analysis (breast-cancer data) Jung-Ying Wang Register number: D9115007, May, 2003 Abstract In this AI term project, we compare some world renowned machine learning tools. Including WEKA data
More informationIDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION
http:// IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION Harinder Kaur 1, Raveen Bajwa 2 1 PG Student., CSE., Baba Banda Singh Bahadur Engg. College, Fatehgarh Sahib, (India) 2 Asstt. Prof.,
More informationApplied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.
Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.38457 Accuracy Rate of Predictive Models in Credit Screening Anirut Suebsing
More informationDiscretization and grouping: preprocessing steps for Data Mining
Discretization and grouping: preprocessing steps for Data Mining PetrBerka 1 andivanbruha 2 1 LaboratoryofIntelligentSystems Prague University of Economic W. Churchill Sq. 4, Prague CZ 13067, Czech Republic
More informationCategorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION
ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical
More informationPredicting Student Performance by Using Data Mining Methods for Classification
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance
More informationARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
More informationFirst Semester Computer Science Students Academic Performances Analysis by Using Data Mining Classification Algorithms
First Semester Computer Science Students Academic Performances Analysis by Using Data Mining Classification Algorithms Azwa Abdul Aziz, Nor Hafieza IsmailandFadhilah Ahmad Faculty Informatics & Computing
More informationData Mining Classification: Decision Trees
Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous
More informationCOMBINED METHODOLOGY of the CLASSIFICATION RULES for MEDICAL DATA-SETS
COMBINED METHODOLOGY of the CLASSIFICATION RULES for MEDICAL DATA-SETS V.Sneha Latha#, P.Y.L.Swetha#, M.Bhavya#, G. Geetha#, D. K.Suhasini# # Dept. of Computer Science& Engineering K.L.C.E, GreenFields-522502,
More informationDATA MINING USING INTEGRATION OF CLUSTERING AND DECISION TREE
DATA MINING USING INTEGRATION OF CLUSTERING AND DECISION TREE 1 K.Murugan, 2 P.Varalakshmi, 3 R.Nandha Kumar, 4 S.Boobalan 1 Teaching Fellow, Department of Computer Technology, Anna University 2 Assistant
More informationCOMPARING NEURAL NETWORK ALGORITHM PERFORMANCE USING SPSS AND NEUROSOLUTIONS
COMPARING NEURAL NETWORK ALGORITHM PERFORMANCE USING SPSS AND NEUROSOLUTIONS AMJAD HARB and RASHID JAYOUSI Faculty of Computer Science, Al-Quds University, Jerusalem, Palestine Abstract This study exploits
More informationDATA MINING-BASED PREDICTIVE MODEL TO DETERMINE PROJECT FINANCIAL SUCCESS USING PROJECT DEFINITION PARAMETERS
DATA MINING-BASED PREDICTIVE MODEL TO DETERMINE PROJECT FINANCIAL SUCCESS USING PROJECT DEFINITION PARAMETERS Seungtaek Lee, Changmin Kim, Yoora Park, Hyojoo Son, and Changwan Kim* Department of Architecture
More informationDecision Support System on Prediction of Heart Disease Using Data Mining Techniques
International Journal of Engineering Research and General Science Volume 3, Issue, March-April, 015 ISSN 091-730 Decision Support System on Prediction of Heart Disease Using Data Mining Techniques Ms.
More informationData Mining Techniques for Prognosis in Pancreatic Cancer
Data Mining Techniques for Prognosis in Pancreatic Cancer by Stuart Floyd A Thesis Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUE In partial fulfillment of the requirements for the Degree
More informationAnalysisofData MiningClassificationwithDecisiontreeTechnique
Global Journal of omputer Science and Technology Software & Data Engineering Volume 13 Issue 13 Version 1.0 Year 2013 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
More informationBagged Ensemble Classifiers for Sentiment Classification of Movie Reviews
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3 Issue 2 February, 2014 Page No. 3951-3961 Bagged Ensemble Classifiers for Sentiment Classification of Movie
More informationIntroduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
More informationE-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce
More informationReference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
More informationImproving spam mail filtering using classification algorithms with discretization Filter
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational
More informationElectroencephalography Analysis Using Neural Network and Support Vector Machine during Sleep
Engineering, 23, 5, 88-92 doi:.4236/eng.23.55b8 Published Online May 23 (http://www.scirp.org/journal/eng) Electroencephalography Analysis Using Neural Network and Support Vector Machine during Sleep JeeEun
More informationThe treatment of missing values and its effect in the classifier accuracy
The treatment of missing values and its effect in the classifier accuracy Edgar Acuña 1 and Caroline Rodriguez 2 1 Department of Mathematics, University of Puerto Rico at Mayaguez, Mayaguez, PR 00680 edgar@cs.uprm.edu
More informationEFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate
More informationEMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One
More informationPredictive Data modeling for health care: Comparative performance study of different prediction models
Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar
More informationAn Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset
P P P Health An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang
More informationTowards better accuracy for Spam predictions
Towards better accuracy for Spam predictions Chengyan Zhao Department of Computer Science University of Toronto Toronto, Ontario, Canada M5S 2E4 czhao@cs.toronto.edu Abstract Spam identification is crucial
More informationPrediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
More informationGender Identification using MFCC for Telephone Applications A Comparative Study
Gender Identification using MFCC for Telephone Applications A Comparative Study Jamil Ahmad, Mustansar Fiaz, Soon-il Kwon, Maleerat Sodanil, Bay Vo, and * Sung Wook Baik Abstract Gender recognition is
More informationImplementation of Data Mining Techniques to Perform Market Analysis
Implementation of Data Mining Techniques to Perform Market Analysis B.Sabitha 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, P.Balasubramanian 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
More informationData Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
More informationAn Overview of Data Mining Techniques Applied for Heart Disease Diagnosis and Prediction
Lecture Notes on Information Theory Vol. 2, No. 4, December 2014 An Overview of Data Mining Techniques Applied for Heart Disease Diagnosis and Prediction Salha M. Alzahani, Afnan Althopity, Ashwag Alghamdi,
More informationRule based Classification of BSE Stock Data with Data Mining
International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification
More informationNeural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm michele.scarpiniti@uniroma1.it Rome, 29
More informationApplication of Event Based Decision Tree and Ensemble of Data Driven Methods for Maintenance Action Recommendation
Application of Event Based Decision Tree and Ensemble of Data Driven Methods for Maintenance Action Recommendation James K. Kimotho, Christoph Sondermann-Woelke, Tobias Meyer, and Walter Sextro Department
More informationEquity forecast: Predicting long term stock price movement using machine learning
Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK Nikola.milosevic@manchester.ac.uk Abstract Long
More informationCustomer Data Mining and Visualization by Generative Topographic Mapping Methods
Customer Data Mining and Visualization by Generative Topographic Mapping Methods Jinsan Yang and Byoung-Tak Zhang Artificial Intelligence Lab (SCAI) School of Computer Science and Engineering Seoul National
More informationTweaking Naïve Bayes classifier for intelligent spam detection
682 Tweaking Naïve Bayes classifier for intelligent spam detection Ankita Raturi 1 and Sunil Pranit Lal 2 1 University of California, Irvine, CA 92697, USA. araturi@uci.edu 2 School of Computing, Information
More informationAdvanced Ensemble Strategies for Polynomial Models
Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer
More informationHow To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
More informationPredicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)
260 IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case
More informationNetwork Intrusion Detection Using a HNB Binary Classifier
2015 17th UKSIM-AMSS International Conference on Modelling and Simulation Network Intrusion Detection Using a HNB Binary Classifier Levent Koc and Alan D. Carswell Center for Security Studies, University
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationData Mining for Knowledge Management. Classification
1 Data Mining for Knowledge Management Classification Themis Palpanas University of Trento http://disi.unitn.eu/~themis Data Mining for Knowledge Management 1 Thanks for slides to: Jiawei Han Eamonn Keogh
More informationA Survey on classification & feature selection technique based ensemble models in health care domain
A Survey on classification & feature selection technique based ensemble models in health care domain GarimaSahu M.Tech (CSE) Raipur Institute of Technology,(R.I.T.) Raipur, Chattishgarh, India garima.sahu03@gmail.com
More informationBIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
More informationBIG DATA IN HEALTHCARE THE NEXT FRONTIER
BIG DATA IN HEALTHCARE THE NEXT FRONTIER Divyaa Krishna Sonnad 1, Dr. Jharna Majumdar 2 2 Dean R&D, Prof. and Head, 1,2 Dept of CSE (PG), Nitte Meenakshi Institute of Technology Abstract: The world of
More informationData Mining Essentials
This chapter is from Social Media Mining: An Introduction. By Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu. Cambridge University Press, 2014. Draft version: April 20, 2014. Complete Draft and Slides
More informationBiomarker Discovery and Data Visualization Tool for Ovarian Cancer Screening
, pp.169-178 http://dx.doi.org/10.14257/ijbsbt.2014.6.2.17 Biomarker Discovery and Data Visualization Tool for Ovarian Cancer Screening Ki-Seok Cheong 2,3, Hye-Jeong Song 1,3, Chan-Young Park 1,3, Jong-Dae
More informationDocument Image Retrieval using Signatures as Queries
Document Image Retrieval using Signatures as Queries Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Harish Srinivasan, Chen Huang CEDAR, University at Buffalo(SUNY) Amherst, New York 14228 Gady Agam and
More informationMachine Learning in FX Carry Basket Prediction
Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines
More information