Chapter 7. Diagnosis and Prognosis of Breast Cancer using Histopathological Data

In the previous chapter, a method for classifying mammograms using wavelet analysis and an adaptive neuro-fuzzy inference system (ANFIS) was analyzed. In this chapter, cytologically proven tumors are evaluated using a support vector machine (SVM), a radial basis function neural network (RBFNN) and an autoassociative neural network (AANN), based on analysis of the histopathological data obtained from the fine needle aspirate (FNA) procedure. Diagnosis of breast cancer is carried out using an SVM with a polynomial kernel and an RBFNN. Accurate prognosis prediction is critical to cancer treatment. Prognosis is a medical term denoting the doctor's prediction of how a patient will progress and whether there is a chance of recovery. In this chapter, prognosis of breast cancer is also carried out using a different set of histopathological data, with SVM and AANN used as classifiers to predict the long-term behavior of the disease.

7.1 Introduction

A pathologist is a physician who analyzes cells and tissues under a microscope. The pathologist's report helps to characterize specimens taken during biopsy or other surgical procedures and also helps to determine the treatment. Histology is the study of tissues, including cellular structure and function. To determine a tumor's histologic grade, pathologists examine the tissue for cellular patterns under a microscope. A sample of breast cells may be taken from a breast biopsy; the findings of the pathologist are recorded to form a database, which serves as input to the classifier.

In this chapter, histopathological data obtained from the Wisconsin breast cancer database are used in the diagnosis and prognosis of breast cancer.

7.2 Dataset used for Diagnosis of Breast Cancer

In this section, histopathological data are used to demonstrate the applicability of SVM and RBFNN to medical diagnosis and decision making. A database containing 699 instances of breast cancer cases, obtained from the Wisconsin diagnosis breast cancer database [29], is used for this purpose. The feature vector has nine attributes related to the frequency of cell mitosis (rate of cell division), nuclear pleomorphism (change in cell size, shape and uniformity), etc. The nine features used for classification are clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli and mitosis. These nine cytological characteristics of breast FNA are reported to differ significantly between benign and malignant samples, and each is graded from 1 to 10 at the time of sample collection. Of the total data, 65.5% belong to the benign class and the remaining 34.5% belong to the malignant class.

7.3 Techniques for Diagnosis of Breast Cancer

Radial Basis Function Neural Network

The RBFNN has a feed-forward architecture, as shown in Fig. 7.1. The construction of a radial basis function network in its most basic form involves three different layers. The input layer is made up of $N_I$ units for an $N_I$-dimensional input vector. The input layer is fully connected to the second layer, which is a hidden layer of $N_H$ units. The hidden-layer units are fully connected to the $N_C$ output-layer units, where $N_C$ is the number of output classes. The output layer supplies the response of the network to the activation patterns applied to the input layer. The transformation from the input space to the hidden-unit space is nonlinear, whereas the transformation from the hidden-unit space to the output space is linear [180].

Fig. 7.1: Architecture of a radial basis function neural network.

The activation functions (AFs) of the hidden-layer units are chosen to be Gaussian and are characterized by their mean vectors (centers) $\mu_i$ and covariance matrices $\Sigma_i$, $i = 1, 2, \ldots, N_H$. For simplicity, it is assumed that the covariance matrices are of the form $\Sigma_i = \sigma_i^2 I$, $i = 1, 2, \ldots, N_H$. The activation function of the $i$-th hidden unit for an input vector $x_j$ is then given by (7.1):

$$g_i(x_j) = \exp\left(-\frac{\lVert x_j - \mu_i \rVert^2}{2\sigma_i^2}\right) \tag{7.1}$$

The $\mu_i$ and $\sigma_i^2$ are estimated using a suitable clustering algorithm. The number of AFs in the network and their spread influence the smoothness of the mapping. The number of hidden units is determined empirically, and it is assumed that $\sigma_i^2 = \sigma^2$, where $\sigma^2$ is given in (7.2):

$$\sigma^2 = \frac{\eta\, l^2}{2} \tag{7.2}$$

In (7.2), $l$ is the maximum distance between the chosen centers and $\eta$ is an empirical factor that controls the smoothness of the mapping function. Therefore (7.1) is rewritten as

$$g_i(x_j) = \exp\left(-\frac{\lVert x_j - \mu_i \rVert^2}{\eta\, l^2}\right) \tag{7.3}$$

The hidden-layer units are fully connected to the $N_C$ output-layer units through weights $\lambda_{ik}$. The output units are linear, and the response of the $k$-th output for an input $x_j$ is given by

$$y_k(x_j) = \sum_{i=0}^{N_H} \lambda_{ik}\, g_i(x_j), \quad k = 1, 2, \ldots, N_C \tag{7.4}$$

where $g_0(x_j) = 1$. Given $N_T$ cytology feature vectors from $N_C$ classes (here benign and malignant), training the RBFNN involves estimating $\mu_i$, $i = 1, 2, \ldots, N_H$, $\eta$, $l^2$ and $\lambda_{ik}$, $i = 1, 2, \ldots, N_H$, $k = 1, 2, \ldots, N_C$.

Training the RBFNN involves two stages [181]. First, the basis functions must be established using an algorithm that clusters the data in the training set. Typical ways to do this include Kohonen self-organizing maps [182], k-means clustering, decision trees, genetic algorithms and orthogonal least squares algorithms [183]. In this study, k-means clustering is used. k-means clustering sorts all objects into a predefined number of groups by minimizing the total squared Euclidean distance of each object from its nearest cluster center. Next, the weights linking the hidden and output layers must be fixed. If the neurons in the output layer have linear activation functions, these weights can be calculated directly using matrix inversion (via singular value decomposition) and matrix multiplication. Because the weights are calculated directly, an RBFNN is usually much quicker to train than an equivalent multi-layer perceptron (MLP).

Experimental Results and Discussion

Nine cytological features of breast fine-needle aspirates from 699 patients, reported to differ between benign and malignant samples, are used to train and test the models. All the features are first normalized between -1 and +1 so that the classifier has a common range to work with. A program has been written in the C language for that purpose.
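The normalization step and Equations (7.3) and (7.4) can be illustrated with a minimal sketch. It is written in Python with NumPy purely for illustration (the programs used in this work were written in C), and the function names `normalize_pm1`, `hidden_activations`, `fit_output_weights` and `predict` are hypothetical. The sketch assumes that at least two distinct centers are supplied, so that $l > 0$, and that the targets are one-hot class indicators.

```python
import numpy as np

def normalize_pm1(X):
    """Min-max scale each feature column to the range [-1, +1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return 2.0 * (X - lo) / span - 1.0

def hidden_activations(X, centers, eta):
    """Equation (7.3), with a leading column g_0 = 1 for the bias in (7.4)."""
    # l^2 is the squared maximum distance between the chosen centers.
    diffs = centers[:, None, :] - centers[None, :, :]
    l2 = np.max(np.sum(diffs ** 2, axis=2))
    # Squared distance of every input vector from every center.
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    G = np.exp(-d2 / (eta * l2))
    return np.hstack([np.ones((X.shape[0], 1)), G])

def fit_output_weights(G, T):
    """Least-squares output weights using the SVD-based pseudo-inverse."""
    return np.linalg.pinv(G) @ T

def predict(X, centers, eta, W):
    """Equation (7.4): linear combination of the hidden-unit responses."""
    return hidden_activations(X, centers, eta) @ W
```

Given centers produced by any clustering step, the class of a test vector would be taken as the index of the largest output, for example `predict(X_test, centers, eta, W).argmax(axis=1)`.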

Training and Testing RBFNN

In this implementation, the unsupervised k-means algorithm is used to estimate the hidden-layer weights from a set of training data. After this initial training and the estimation of the hidden-layer weights, the weights in the output layer are computed.

The training phase consists of two steps. In the first step, appropriate centers are generated from the training patterns using the k-means algorithm. Initially, the dataset containing 699 patterns is stored as two data files, one containing the data of the benign class (458 instances) and the other the data of the malignant class (241 instances). A program has been written to generate the required number of centroids for each of the class datasets. All the generated means are then combined into a single file. The computed centers are copied into the corresponding links: evenly distributed centers from the training patterns are selected and assigned to the links between the input and hidden layers. The second step is the computation of the weights between the hidden layer and the output layer.

Another program has been written to test the data using the weights so generated. The performance of the classifier is evaluated by varying the number of centroids in each run. The classifier output for the test data is compared with the original class attribute to identify true positives, true negatives, false positives and false negatives. Table 7.1 gives these values in the form of a confusion matrix, and Table 7.2 shows the performance metrics calculated from this confusion matrix. The overall performance of RBFNN is obtained by averaging the performance values over the different numbers of clusters and is shown in Fig. 7.2.

Support vector machine

A support vector machine performs classification by constructing an N-dimensional hyperplane that optimally separates the data into two categories. A brief overview of SVM is given in Chapter 4.
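Returning to the first step of RBFNN training described above, the generation of centroids for each class and their combination into a single set can be sketched as follows. This is a minimal illustration in Python with NumPy, not the C program used in this work; the names `kmeans` and `class_centers`, the random initialization from data points and the stopping rule are assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means: assign each point to its nearest center, then update."""
    rng = np.random.default_rng(seed)
    # Assumption: initial centers are k distinct points drawn from the data.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for c in range(k):
            members = X[labels == c]
            if len(members):  # keep the old center if a cluster becomes empty
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def class_centers(X, y, k_per_class, seed=0):
    """Run k-means separately on each class and stack the centroids."""
    return np.vstack([kmeans(X[y == c], k_per_class, seed=seed)
                      for c in np.unique(y)])
```

Varying `k_per_class` between runs corresponds to varying the number of centroids, as done when evaluating the classifier.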

Table 7.1: RBFNN: Confusion matrix for diagnosis of breast cancer using histopathological data. Columns: No. of clusters (k), tp, tn, fp, fn.

Table 7.2: RBFNN: Performance measures (in %) for diagnosis of breast cancer using histopathological data. Columns: No. of clusters, Accuracy, Specificity, Sensitivity, F-Score.

Fig. 7.2: Overall performance of RBFNN for diagnosis of breast cancer using histopathological data.

Training and Testing SVM

SVMTorch is used for training and testing the model [173]. Three-fold cross validation is used to evaluate the result. A program has been written in C to divide the data randomly into three different sets for training and testing the classifier. The training data include the nine feature attributes and a class attribute, while the test data have only the nine feature attributes, without the class attribute. The polynomial-kernel SVM is trained using two-thirds of the data (466 instances), randomly chosen, and tested with the remaining one-third (233 instances) to evaluate the classifier's effectiveness. Training and testing are done using all three randomly divided sets (three-fold cross validation) to ensure fair and unbiased classification. The classifier output for the test data is compared with the original class attribute to identify true positives, true negatives, false positives and false negatives. Table 7.3 gives these values in the form of a confusion matrix, and Table 7.4 shows the performance metrics calculated from this confusion matrix. The overall performance of SVM is obtained by averaging the performance values of the different cross runs and is shown in Fig. 7.3.

Table 7.3: SVM: Confusion matrix for diagnosis of breast cancer using histopathological data. Columns: Cross run, tp, tn, fp, fn.

Table 7.4: SVM: Performance measures (in %) for diagnosis of breast cancer using histopathological data. Columns: Cross run, Accuracy, Specificity, Sensitivity, F-Score.

Accuracy approximates how effective the algorithm is overall by estimating the probability that the predicted class label is correct. Table 7.5 shows that the accuracy of RBFNN in classifying benign and malignant masses using cytological data (96.31%) is better than that of SVM (92.11%).

Sensitivity and specificity approximate the probability of the positive and the negative label, respectively, being true; each assesses the effectiveness of the algorithm on a single class. Here positive refers to a benign mass and negative refers to a malignant mass. Referring to Table 7.5, it can be observed that the specificity of both RBFNN and SVM is around 96%, indicating that they are equally good at identifying malignant masses correctly. However, RBFNN is far better than SVM at identifying benign masses, with a sensitivity of 96.73% compared with 86.73% for SVM, indicating that SVM fails to identify many of the true positives.

F-Score is a composite measure that favors algorithms with higher sensitivity and challenges those with higher specificity. RBFNN, having the higher sensitivity, has a higher F-Score than SVM, as seen in Fig. 7.4.
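The performance measures discussed above follow directly from the four entries of a confusion matrix. The sketch below is a minimal Python illustration; the function name `performance_measures` is hypothetical, and the F-Score is assumed to be the usual harmonic mean of precision and sensitivity, since its formula is not stated in this section. The counts in the usage example are invented for illustration and are not taken from Tables 7.1 to 7.5.

```python
def performance_measures(tp, tn, fp, fn):
    """Return accuracy, sensitivity, specificity and F-score in percent."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of positives correctly identified
    specificity = tn / (tn + fp)   # fraction of negatives correctly identified
    precision = tp / (tp + fp)
    # Assumption: F-score is the harmonic mean of precision and sensitivity.
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": 100 * accuracy,
        "sensitivity": 100 * sensitivity,
        "specificity": 100 * specificity,
        "f_score": 100 * f_score,
    }

# Hypothetical counts, for illustration only.
measures = performance_measures(tp=90, tn=80, fp=20, fn=10)
```

With positive taken as the benign class, as in this section, a low sensitivity means that benign masses are being missed, while a low specificity means that malignant masses are being missed.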

Fig. 7.3: Overall performance of SVM for breast cancer diagnosis using histopathological data.

Fig. 7.4: Graph comparing performance of SVM and RBFNN for diagnosis of breast cancer using histopathological data.

7.4 Dataset used for Prognosis of Breast Cancer

The word prognosis is often used in medical reports to convey a physician's view on a case. Prognosis is a medical term denoting the doctor's prediction of how a patient will progress and whether there is a chance of recovery. In other words, prognosis is the prediction of the long-term behavior of the disease. In this work, prognosis is done using cytological features, with a support vector machine and an autoassociative neural network used to classify the disease as either recurrent or non-recurrent. Three-fold cross validation is done to avoid bias in classification, and performance metrics such as accuracy, sensitivity, specificity and F-score are computed for SVM and AANN and compared.

A dataset of 198 samples from the Wisconsin prognosis breast cancer database [29] is taken as input. Two-thirds of the dataset is used for training the classifier, and one-third is used for testing it. The first attribute of each sample, which is an ID number, is discarded. The remaining 34 attributes are considered for training and testing the classifier. The attributes include time, radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, fractal dimension, etc. These features characterize the texture of the cell, which gives the physician a clue for arriving at the prognosis. In this work, an attempt has been made to automate prognosis using these features and the pattern classification models SVM and AANN.

7.5 Techniques for Prognosis of Breast Cancer

SVM and AANN are used to classify the cancer as recurrent or non-recurrent based on cytological data. SVM has been dealt with in detail in Chapter 4. This section discusses AANN.
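The three-fold cross validation mentioned in Section 7.4, in which two-thirds of the samples train the classifier and the remaining third tests it, can be sketched as follows. This is a minimal Python illustration under the assumption that the samples are shuffled once and dealt into three equal folds; the function name `three_fold_splits` is hypothetical and the random partitioning used in this work may differ.

```python
import random

def three_fold_splits(n_samples, seed=0):
    """Yield (train_indices, test_indices) for each of three cross runs."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # random partition, fixed by seed
    folds = [idx[i::3] for i in range(3)]     # three disjoint folds
    for held_out in range(3):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
        yield train, test

# For the 198 prognosis samples, each run trains on 132 and tests on 66.
for train, test in three_fold_splits(198):
    pass  # train and evaluate a classifier on these index sets
```

Each sample appears in exactly one test set, so averaging the metrics over the three cross runs uses every sample once for testing.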

Table 7.5: Performance comparison of SVM and RBFNN for diagnosis of breast cancer using histopathological data. Columns: Measures (%), SVM, RBF. Rows: Accuracy, Sensitivity, Specificity, F-Score.

Autoassociative Neural Network

The autoassociative neural network is a special class of feedforward neural network architecture with some interesting properties that can be exploited for pattern recognition tasks [184]. Separate AANN models are used to capture the distribution of the feature vectors of each class, namely recurrent and non-recurrent. A five-layer autoassociative neural network model is shown in Fig. 7.5. An autoassociative neural network has the same number of neurons in the input and output layers, and fewer in the hidden layers. The network is trained using the input vector itself as the desired output. This training organizes a compression (encoding) network between the input layer and the hidden layer, and a decoding network between the hidden layer and the output layer, as shown in Fig. 7.5. Each autoassociative network is trained independently for one class using the feature vectors of that class. As a result, the squared error between an input and the output is generally minimized by the network of the class to which the input pattern belongs. This property makes it possible to classify an unknown input pattern: the unknown pattern is fed to all the networks and is assigned to the class whose network gives the minimum squared error.

The processing units of the input layer and the output layer are linear, whereas the units in the hidden layers are nonlinear [185]. During training of the network, the target vectors are the same as the input vectors. To realize the input vectors at the output layer, the network projects an $M$-dimensional vector in the input space $R^M$ onto a vector in the subspace $R^N$, and then maps it back onto the $M$-dimensional space, where $N < M$. The network performs nonlinear principal component analysis by projecting the input vectors onto the subspace $R^N$. The subspace $R^N$ is the space spanned by the first $N$ principal components derived from the training data. The value of $N$ is determined by the number of units in the dimension compression layer. The mapping of the subspace $R^N$ back to the $M$-dimensional space $R^M$ determines the way in which the subspace $R^N$ is embedded in the original space $R^M$. It has been shown that an AANN trained with a dataset captures the subspace and the hypersurface along the maximum variance of the data [186], [169]. In other words, an AANN can be used to capture the distribution of the given dataset. In this work, this feature of AANN is used to classify recurrent and non-recurrent cancer cases.

Fig. 7.5: An autoassociative neural network.
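The decision rule described above, in which an unknown pattern is fed to the model of every class and assigned to the class with the minimum squared reconstruction error, can be sketched as follows. To keep the illustration short and deterministic, each class model here is a linear stand-in that projects onto the first $N$ principal directions and maps back; this is not the nonlinear AANN used in this work. The names `LinearAutoassociator` and `classify` are hypothetical.

```python
import numpy as np

class LinearAutoassociator:
    """Linear stand-in for an AANN: compress to R^N, then map back to R^M."""

    def __init__(self, X, n_components):
        self.mean = X.mean(axis=0)
        # Rows of vt are the principal directions of the centered class data.
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.basis = vt[:n_components]            # shape (N, M)

    def reconstruct(self, x):
        z = (x - self.mean) @ self.basis.T        # encode: R^M -> R^N
        return self.mean + z @ self.basis         # decode: R^N -> R^M

def classify(x, models):
    """Assign x to the class whose model reconstructs it with least error."""
    errors = {label: float(np.sum((x - model.reconstruct(x)) ** 2))
              for label, model in models.items()}
    return min(errors, key=errors.get), errors
```

One model would be fitted per class, for example `models = {"recurrent": LinearAutoassociator(X_rec, 4), "non-recurrent": LinearAutoassociator(X_non, 4)}`, where `X_rec` and `X_non` are placeholder arrays holding the training vectors of each class.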

7.5.1 Experimental Results and Discussion

Thirty-three cytological features of breast fine-needle aspirates from 198 patients, reported to differ between recurrent and non-recurrent cases, are used to train and test the models.

Training and testing the SVM

Three-fold cross validation is used to evaluate the result. The training data include thirty-three feature attributes and a class attribute, while the test data have only the thirty-three feature attributes, without the class attribute. The data are normalized between -1 and +1 so that the classifier has a common range to work with. SVMTorch is used for training and testing the model [173]. The polynomial-kernel SVM is trained using two-thirds of the data, randomly chosen, and tested with the remaining one-third to evaluate the classifier's effectiveness. Table 7.6 shows the confusion matrix and Table 7.7 gives the performance measures for the three different cross runs. The overall performance of SVM is obtained by averaging the performance values of the different cross runs and is shown in Fig. 7.6.

Table 7.6: SVM: Confusion matrix for prognosis of breast cancer using histopathological data. Columns: Cross run, tp, tn, fp, fn.

Table 7.7: SVM: Performance measures (in %) for prognosis of breast cancer using histopathological data. Columns: Cross run, Accuracy, Specificity, Sensitivity, F-Score.

Fig. 7.6: Overall performance of SVM for prognosis of breast cancer using histopathological data.

Training and Testing AANN

The structure of the AANN model used is 12L 38N 4N 38N 12L, where L denotes linear units and N denotes non-linear units. The activation function of the non-linear units is the hyperbolic tangent function. The network is trained using the error backpropagation learning algorithm for 100, 500 and 1000 epochs. Table 7.8 gives the results in the form of a confusion matrix, and the performance metrics calculated for the three different cross runs are shown in Table 7.9.

Table 7.8: AANN: Confusion matrix for prognosis of breast cancer using histopathological data. Columns: Cross run, Epochs, tp, tn, fp, fn.

The AANN gives its best performance for 500 epochs; the final overall performance is calculated by averaging all three cross runs at 500 epochs and is shown in Fig. 7.7. The comparison of SVM and AANN, obtained by averaging the performance values of the different cross runs, is shown in Fig. 7.8. It can be seen that the accuracy of AANN (86.66%) is far better than that of SVM (71.81%). It is also seen from Fig. 7.8 that the specificity of SVM is very poor compared with that of AANN. This implies that SVM does not perform well in identifying non-recurring cases correctly, resulting in more false positives.
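The AANN training described above (linear input and output units, hyperbolic tangent hidden units, error backpropagation with the input as the target) can be sketched in a few lines. This is a minimal full-batch gradient-descent illustration in Python with NumPy, not the implementation used in this work; the learning rate, the weight initialization and the names `init_aann`, `forward` and `train_aann` are assumptions.

```python
import numpy as np

def init_aann(sizes, seed=0):
    """Create (weights, bias) pairs for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.3, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Return activations of all layers: tanh hidden units, linear output."""
    acts = [X]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def train_aann(X, sizes, epochs=500, lr=0.02, seed=0):
    """Train the network to reproduce its own input; return params, losses."""
    params = init_aann(sizes, seed)
    losses = []
    for _ in range(epochs):
        acts = forward(params, X)
        err = acts[-1] - X                     # target is the input itself
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        delta = 2.0 * err / len(X)             # gradient at the linear output
        for i in reversed(range(len(params))):
            W, b = params[i]
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:
                # Backpropagate through the tanh units of the previous layer.
                delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
            params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params, losses
```

The structure reported above would correspond to `sizes=[12, 38, 4, 38, 12]`, with one network trained per class and `epochs` set to 100, 500 or 1000.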

Table 7.9: AANN: Performance measures (in %) for prognosis of breast cancer using histopathological data. Columns: Cross run, Epochs, Accuracy, Specificity, Sensitivity, F-Score.

Summary

In this chapter, the use of support vector machines and radial basis function neural networks in actual clinical diagnosis was examined. Known sets of cytologically proven tumor data obtained from the Wisconsin breast cancer database were used to train the models to categorize cancer patients according to their diagnosis. Experimental results show that RBFNN gives better performance than SVM for breast cancer classification. Methods were also proposed to arrive at a prognosis using AANN and SVM. Cytological features from the Wisconsin breast cancer database were used for this purpose. The experimental results reveal that AANN is better than SVM for prognosis of breast cancer. This work indicates that RBFNN and AANN can be used effectively for breast cancer diagnosis and prognosis to help oncologists.

Fig. 7.7: Overall performance of AANN for prognosis of breast cancer using histopathological data.

Fig. 7.8: Performance comparison of SVM and AANN for prognosis of breast cancer using histopathological data.

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

More information

Application of Data Mining Techniques in Improving Breast Cancer Diagnosis

Application of Data Mining Techniques in Improving Breast Cancer Diagnosis Application of Data Mining Techniques in Improving Breast Cancer Diagnosis ABSTRACT Breast cancer is the second leading cause of cancer deaths among women in the United States. Although mortality rates

More information

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,

More information

Data Mining Analysis (breast-cancer data)

Data Mining Analysis (breast-cancer data) Data Mining Analysis (breast-cancer data) Jung-Ying Wang Register number: D9115007, May, 2003 Abstract In this AI term project, we compare some world renowned machine learning tools. Including WEKA data

More information

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

More information

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet

More information

Notes on Support Vector Machines

Notes on Support Vector Machines Notes on Support Vector Machines Fernando Mira da Silva Fernando.Silva@inesc.pt Neural Network Group I N E S C November 1998 Abstract This report describes an empirical study of Support Vector Machines

More information

Neural Networks. Neural network is a network or circuit of neurons. Neurons can be. Biological neurons Artificial neurons

Neural Networks. Neural network is a network or circuit of neurons. Neurons can be. Biological neurons Artificial neurons Neural Networks Neural network is a network or circuit of neurons Neurons can be Biological neurons Artificial neurons Biological neurons Building block of the brain Human brain contains over 10 billion

More information

Neural Pattern Recognition Model for Breast Cancer Diagnosis

Neural Pattern Recognition Model for Breast Cancer Diagnosis Cyber Journals: Multidisciplinary Journals in Science and Technology, Journal of Selected Areas in Bioinformatics (JBIO), August Edition, 2012 Neural Pattern Recognition Model for Breast Cancer Diagnosis

More information

1. Classification problems

1. Classification problems Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification

More information

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

More information

Feature Extraction by Neural Network Nonlinear Mapping for Pattern Classification

Feature Extraction by Neural Network Nonlinear Mapping for Pattern Classification Lerner et al.:feature Extraction by NN Nonlinear Mapping 1 Feature Extraction by Neural Network Nonlinear Mapping for Pattern Classification B. Lerner, H. Guterman, M. Aladjem, and I. Dinstein Department

More information

SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH

SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH 330 SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH T. M. D.Saumya 1, T. Rupasinghe 2 and P. Abeysinghe 3 1 Department of Industrial Management, University of Kelaniya,

More information

Data Mining Techniques for Prognosis in Pancreatic Cancer

Data Mining Techniques for Prognosis in Pancreatic Cancer Data Mining Techniques for Prognosis in Pancreatic Cancer by Stuart Floyd A Thesis Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUE In partial fulfillment of the requirements for the Degree

More information

203.4770: Introduction to Machine Learning Dr. Rita Osadchy

203.4770: Introduction to Machine Learning Dr. Rita Osadchy 203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:

More information

Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning

More information

Predictive Dynamix Inc

Predictive Dynamix Inc Predictive Modeling Technology Predictive modeling is concerned with analyzing patterns and trends in historical and operational data in order to transform data into actionable decisions. This is accomplished

More information

Neural Networks and Support Vector Machines

Neural Networks and Support Vector Machines INF5390 - Kunstig intelligens Neural Networks and Support Vector Machines Roar Fjellheim INF5390-13 Neural Networks and SVM 1 Outline Neural networks Perceptrons Neural networks Support vector machines

More information

MACHINE LEARNING. Introduction. Alessandro Moschitti

MACHINE LEARNING. Introduction. Alessandro Moschitti MACHINE LEARNING Introduction Alessandro Moschitti Department of Computer Science and Information Engineering University of Trento Email: moschitti@disi.unitn.it Course Schedule Lectures Tuesday, 14:00-16:00

More information

SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK

SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK N M Allinson and D Merritt 1 Introduction This contribution has two main sections. The first discusses some aspects of multilayer perceptrons,

More information

Neural Nets. General Model Building

Neural Nets. General Model Building Neural Nets To give you an idea of how new this material is, let s do a little history lesson. The origins are typically dated back to the early 1940 s and work by two physiologists, McCulloch and Pitts.

More information

Classifiers & Classification

Classifiers & Classification Classifiers & Classification Forsyth & Ponce Computer Vision A Modern Approach chapter 22 Pattern Classification Duda, Hart and Stork School of Computer Science & Statistics Trinity College Dublin Dublin

More information

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a

More information

Data Mining. Supervised Methods. Ciro Donalek donalek@astro.caltech.edu. Ay/Bi 199ab: Methods of Computa@onal Sciences hcp://esci101.blogspot.

Data Mining. Supervised Methods. Ciro Donalek donalek@astro.caltech.edu. Ay/Bi 199ab: Methods of Computa@onal Sciences hcp://esci101.blogspot. Data Mining Supervised Methods Ciro Donalek donalek@astro.caltech.edu Supervised Methods Summary Ar@ficial Neural Networks Mul@layer Perceptron Support Vector Machines SoLwares Supervised Models: Supervised

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

PCA, Clustering and Classification. By H. Bjørn Nielsen strongly inspired by Agnieszka S. Juncker
