FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

Size: px
Start display at page:

Download "FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS"

Transcription

1 FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte, Brazil ABSTRACT Nowadays the electric utilities have to handle problems with the non-technical losses caused by frauds and thefts committed by some of their consumers. In order to minimize this, some methodologies have been created to perform the detection of consumers that might be fraudsters. In this context, the use of classification techniques can improve the hit rate of the fraud detection and increase the financial income. This paper proposes the use of the knowledge-discovery in databases process based on artificial neural networks applied to the classifying process of consumers to be inspected. An experiment performed in a Brazilian electric power distribution company indicated an improvement of over 50% of the proposed approach if compared to the previous methods used by that company. KEYWORDS Fraud Detection, Electric Power Distribution, Low-Voltage, KDD, Artificial Neural Networks 1. INTRODUCTION The losses in electric power distribution networks are a common reality for the electric utilities and can be classified as technical or non-technical losses, where the technical loss corresponds to the electrical energy dissipated between the energy supply and the delivery points of the consumer. On the other hand, a non-technical loss is the difference between the total losses and the technical losses, i.e. all other losses associated to the electric power distribution, such as energy theft, metering errors, errors in the billing process, etc. This loss type is directly related to the distributor s commercial management [1]. One of the usual ways to decrease non-technical losses rates is to perform local inspections to check if there are any thefts or hoaxes being committed by the consumers. For that purpose, the field staff is responsible for creating a methodology to generate an inspection schedule with suspected consumers. However, this task can be a too complex on due to the large amount of existing consumers in an electric power distribution network. Whenever a field staff performs an inspection, it returns only two possible outcomes: consumer is fraudster (there is irregularity) or non-fraudster (there is no irregularity). Therefore, whenever a fraud is found, appropriate measures are applied and the situation returns to normal. Nevertheless, if a fraud is not found some costs such the inspector s hours and vehicles costs are unnecessarily spent, instead of being applied to inspect others consumers that are committing energy frauds. Increasing the hit rate of identifying irregularities is needed to reduce DOI : /ijaia

2 costs in the fraud detection, saving operational costs, even more importantly, identifying the fraudsters of the distribution network. Some studies address this kind of problem applying well known knowledge-discovery techniques. Cabral et al. [2] [4] proposed a fraud detector for low and high voltage consumers. At that study, a knowledge-discovery methodology with artificial intelligence for data preprocessing and mining was successfully applied in different scenarios. Monedero et al. [3] developed an approach to identify non-technical losses in electric power distribution networks using artificial neural networks applied to the city of Seville (Spain), outperforming by over 50% precision compared to the previous methodologies. Muniz et al. [5] presented an approach combining artificial neural networks and neuron-fuzzy systems to identify irregularities of the low voltage consumers. J. Nagi et al. [1] proposed a framework to detect non-technical losses in power distribution companies using pattern recognition by means of Support Vector Machines (SVM). E. W. S. dos Angelos [8] presented a cluster-based classification strategy with an unsupervised algorithm of two steps to identify suspected profiles of power consumption, providing a good assertiveness in real life systems. Over the past years, other studies in this field have been addressed applying different computational techniques to improve the detection of non-technical losses [9-13]. Although some studies have been conducted in the recent years in this field, the fraud rate in Brazilian low-voltage consumers is still very high, especially in metropolitan areas. There are still many electric power companies using only human expert knowledge to treat this problem and the results has been unsatisfactory. This paper addresses the fraud detection problem in electric power distribution networks (low-voltage consumers). Our methodology is based on the knowledge-discovery process using artificial neural networks to classify consumers as fraudsters or non-fraudsters. Thereafter a list of consumers classified as fraudsters is generated to help perform the inspections with the main objective being the improvement of the hit rate of inspections to reduce unnecessary operational costs. This paper is structured as follows: the Section 2 presents the methodology used in the study. In the Section 3, the experiment and results are shown, and Section 4 presents the final discussions and conclusions. 2. METHODOLOGY Our methodology is based on knowledge-discovery in databases (KDD) process using data mining, commonly used to extract previously unknown, non-trivial, and useful patterns from different data sources. The steps are presented in the Figure 1. 18

3 Figure 1: KDD process [6] This process contains three main steps: cleaning and integration, selection and transformation, and data mining. In this case, they were applied to select consumers for inspection against consumption irregularities in electric power distribution networks. The development is presented in the next subsections Cleaning and Integration In this step, different data sources obtained from databases and text files were processed to generate a single and consistent database. Initially, seven data sources were considered: (1) database of all consumers and their socio-economic characteristics; (2) history of inspections; (3) historical consumption; (4) history of services requested by clients; (5) history of ownership exchanges; (6) history of queries debits; and (7) history of meter reading. All data sources were integrated using a common key: consumer installation code. Thus, it was possible to remove incorrect or duplicate records, generating a single database from the seven sources, used as input for the data mining algorithm (training and classification). In this context, only the records with valid inspection results (fraudsters or non-fraudsters) were considered Selection and Transformation In this step, statistical techniques were applied for data selection and transformation. The attribute selection was performed using a multivariate correlation analysis and Information Gain method, generating some key attributes of the model as shown in the Table 1. 19

4 Table 1: Key attributes of the model # Name Type Description 1 Location nominal Consumer location (city + neighborhood) 2 Business class nominal Business class (e.g. residential, industrial, commercial, among others) 3 Activity type nominal Activity type (e.g. residence, drugstore, bakery, public administration, among others) 4 Voltage nominal Consumer voltage (110v, 220v) 5 Number of nominal Number of phases (1, 2, 3) phases 6 Situation nominal Is consumer connected? (yes, no) 7 Direct debit nominal Type of direct debit 8 Metering type nominal Type of electricity metering 9 Mean consumption numeric Mean consumption during the previous 12 months 10 Service notes nominal Are there service notes requested by consumers during the previous 12 months? (Yes or no) 11 Ownership exchange nominal Are there ownership exchanges during the previous 12 months? (Yes or no) 12 Query debits nominal Are there query debits during the previous 12 months? (Yes or no) 13 Meter reading nominal Are there meter reading notes during the previous 12 months? (Yes or no) 14 Inspection (output) nominal The inspection result (Fraudster or Nonfraudster) The data transformation was performed by normalization methods. The Min-Max method was used to transform the single numerical attribute Mean Consumption, wherein each value is converted to a range between 0 and 1. For the nominal attributes, two distinct methods were applied: if the number of distinct values was less than five, the Binary method was applied; otherwise the One-of-N method was used Data Mining The problem addressed here is a supervised classification one because there are real inspections records to be used for the training step. An artificial neural network/multilayer perceptron (ANN- MLP) was used for the dataset training and classification. According to Haykin [7], the MLPs using the Backpropagation algorithm have been successfully applied to solve many similar complex problems, such as pattern recognition, classification, data preprocessing, etc. Therefore, we applied it to solve our problem with the configurations that are presented here. The ANN-MLP s architecture used three layers: input, hidden and output layers. The input layer contains n neurons, wherein n is equal to number of bits of the normalized values. In the hidden layer, the number of neurons is equal to the geometrical mean between the numbers of neurons in the input and output layers [14]. The output layer contains a single neuron to classify the consumers as fraudsters or non-fraudsters. 20

5 After defining the architecture, the data mining algorithm was coded using the training/test/validation schema and the stratified k-fold Cross Validation method. In this case, k is equal to 10, i.e., at the each iteration the dataset is randomly partitioned into 10 equal size subsets. Of the 10 subsets, 9 are used as training data and a single is used for testing the model. This process is then repeated k times, and the average error is calculated. This implementation requires parameter settings to perform the experiments successfully. 3. RESULTS AND DISCUSSION The evaluation of the results was performed using a confusion matrix that compares real inspections to classified inspections by the ANN-MLP, indicating four result types: (1) TP is a fraudster consumer correctly classified as fraudster; (2) FN is a fraudster consumer incorrectly classified as non-fraudster; (3) FP is a consumer non-fraudster incorrectly classified as fraudster; (4) TN is a consumer non-fraudster correctly classified as non-fraudster. Some measures are used to check the classifier efficiency: hit rate is the percentage of records correctly classified; recall: TP / (TP+FN); precision = TP / (TP+FP); True Positive rate = TP / (TP+FN); and False Positive rate = FP / (FP+TN). This approach was evaluated using real data from a Brazilian electric power distribution company with more than seven million consumers. A lot of inspections conducted in the metropolitan region of Minas Gerais (Brazil) during the year 2011 were used to supply the KDD process. The first step of our methodology was applied to clean and integrate databases and text files, generating a single data source with 22,848 records. After that, the second step was applied to select key attributes of the model and transform data to feed the ANN-MLP. Each record consisted of thirteen input attributes and one output attribute indicating fraud or not, as seen in the Section 2.2. After that, the ANN-MLP was trained with Backpropagation learning algorithm using the Logistic activation function for all neurons. The training step was performed for 500 epochs, with learning rate equals to 0.01, momentum factor equals to 0.95, activation threshold equals to 0.5, and maximum error equals to The parameter settings were empirically defined. The algorithm was performed in 903 seconds on a personal computer and the Table 2 shows the classification results. Table 2: Confusion matrix with the classification results Fraudster (a) Non-fraudster (b) Classified as (a) Classified as (b) The classification results were obtained with an error rate of after all epochs, and the performance measures were calculated from the confusion matrix. The accuracy is equal to 87.17%, precision is equal to 65.03% and recall is equal to 29.47%. 21

6 After obtaining the classification results, a list of inspections is generated to check if consumers are fraudsters. The precision indicates the percentage of consumers correctly inspected by the staff team. In this context, increasing of hit rate is essential to improve the number of fraudsters found, and reducing the number of non-fraudsters inspected. This directly implies two consequences: recovery of financial income and reducing of unnecessary operational costs. According to the field measurement, currently the same electric power distribution company has obtained a precision around 40% using different methods. Our methodology improved this number to 65.03%, outperforming in 50% the current scenario. In other words, as previously seen, we obtained 22,848 records of inspections from the year 2011, i.e. only 9,139 consumers were correctly inspected by the company. At the same time, our methodology could obtain a precision of 14,858 consumers a difference of 5719 per year. That difference of fraudsters consumers found represents the recovery of financial income in three senses: (1) the recovery of value defrauded applying fines according to the fraud period, (2) normalization of the consumers' status and the regularization of their consumption, and (3) unnecessary expenses in incorrect inspections were avoided by reducing expenses with employees, transportation costs and fuel consumption. Another important point is the use of the k-fold Cross Validation method, whose goal is to prevent the ANN-MLP of over-fitting the training data. This allows the learning to be used for the classification of consumers outside the training set (the real world case). Although the algorithm used the stratified validation method, the unbalanced class problem is a weakness that must be better handled, since the learning systems can find difficulties in the classification of records belonging to minority class when the class distribution is unequal. This is a need for the next steps of development of our methodology. 4. CONCLUSION This paper proposed the use of the KDD process for fraud detection in the power distribution network (low-voltage consumers). For this purpose, a methodology with data preprocessing and mining using artificial neural networks was applied. The results of the experiment indicated that is possible to improve the hit rate obtained by the methods used currently for the classification of fraudsters. The experiment was performed using only historical data and a deeper evaluation in field is needed to ensure the efficiency of our approach. Although this has not been done, we used the k- Fold Cross Validation method for assessing how the results will generalize to an independent data set. Furthermore, we intend on testing different supervised classification techniques and compare theirs results. 22

7 REFERENCES [1] J. Nagi et al., (2010) Nontechnical Loss Detection for Metered Customers in Power Utility Using Support Vector Machines, IEEE Transactions on Power Delivery, Vol. 25, N. 2, Apr [2] Cabral et al., (2004) Fraud Detection in Electrical Energy Consumers Using Rough Sets, 2004 IEEE International Conference on Systems, Man and Cybernetics: The Hague, Netherlands, Oct [3] Monedero et al., (2006) MIDAS Detection of Non-technical Losses in Electrical Consumption Using Neural Networks and Statistical Techniques, In proceeding of: Computational Science and Its Applications - ICCSA 2006, International Conference, Glasgow, UK, May [4] Cabral et al., (2009) Fraud Detection System for High and Low Voltage Electricity Consumers Based on Data Mining, Power and Energy Society General Meeting, Calgary, Jul [5] Muniz et al., (2009) A Neuron-fuzzy System for Fraud Detection in Electricity Distribution, Proceedings of the IFSA-EUSFLAT 2009, Lisbon, Portugal, Jul [6] J. Han, M. Kamber, J. Pei, (2011) Data mining: concepts and techniques, Morgan Kaufmann, 3rd ed. [7] S. Haykin, (1999) Neural Networks: A Comprehensive Foundation, Prentice Hall International, 2nd ed. [8] E. W. S. dos Angelos et al., (2011) Detection and Identification of Abnormalities in Customer Consumptions in Power Distribution Systems, IEEE Transactions on Power Delivery, Vol. 26, N. 4, Oct [9] C. C. O. Ramos et al., (2012) New Insights on Nontechnical Losses Characterization Through Evolutionary-Based Feature Selection, IEEE Transactions on Power Delivery, Vol. 27, N. 1, Jan [10] A. H. Nizar, Z. Y. Dong, Y. Wang, (2008) "Power Utility Nontechnical Loss Analysis With Extreme Learning Machine Method", IEEE Transactions on Power Systems, Vol. 23, N. 3, pp , Aug [11] J. Nagi et al., (2008) "Detection of abnormalities and electricity theft using genetic Support Vector Machines" TENCON IEEE Region 10 Conference, pp.1-6, Nov [12] J. Nagi et al., (2008) "Non-Technical Loss analysis for detection of electricity theft using support vector machines", IEEE 2nd International Power and Energy Conference, pp , 1-3 Dec [13] A. C. Ramos et al., (2011) "A New Approach for Nontechnical Losses Detection Based on Optimum- Path Forest", IEEE Transactions on Power Systems, Vol. 26, N. 1, pp , Feb [14] M. Negnevitsky, Artificial Intelligence: A Guide to Intelligent Systems, Pearson Education Canada, 3rd ed. 23

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department

Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department DOI: 10.5769/C2012010 or http://dx.doi.org/10.5769/c2012010 Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department Antonio Manuel Rubio Serrano (1,2), João Paulo

More information

Feature Subset Selection in E-mail Spam Detection

Feature Subset Selection in E-mail Spam Detection Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

A Web-based Interactive Data Visualization System for Outlier Subspace Analysis

A Web-based Interactive Data Visualization System for Outlier Subspace Analysis A Web-based Interactive Data Visualization System for Outlier Subspace Analysis Dong Liu, Qigang Gao Computer Science Dalhousie University Halifax, NS, B3H 1W5 Canada dongl@cs.dal.ca qggao@cs.dal.ca Hai

More information

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

More information

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing

More information

A Neuro-fuzzy System for Fraud Detection in Electricity Distribution

A Neuro-fuzzy System for Fraud Detection in Electricity Distribution A Neuro-fuzzy System for Fraud Detection in Electricity Distribution C. Muniz, M. Vellasco, R. Tanscheit, K. Figueiredo Computational Intelligence Lab., Department of Electrical Engineering, Pontifical

More information

Modeling to improve the customer unit target selection for inspections of Commercial Losses in Brazilian Electric Sector - The case CEMIG

Modeling to improve the customer unit target selection for inspections of Commercial Losses in Brazilian Electric Sector - The case CEMIG Paper 3406-2015 Modeling to improve the customer unit target selection for inspections of Commercial Losses in Brazilian Electric Sector - The case CEMIG Sérgio Henrique Rodrigues Ribeiro, CEMIG; Iguatinan

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

Neural Networks in Data Mining

Neural Networks in Data Mining IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department

More information

Performance Evaluation of Intrusion Detection Systems using ANN

Performance Evaluation of Intrusion Detection Systems using ANN Performance Evaluation of Intrusion Detection Systems using ANN Khaled Ahmed Abood Omer 1, Fadwa Abdulbari Awn 2 1 Computer Science and Engineering Department, Faculty of Engineering, University of Aden,

More information

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems

More information

Analecta Vol. 8, No. 2 ISSN 2064-7964

Analecta Vol. 8, No. 2 ISSN 2064-7964 EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

More information

W6.B.1. FAQs CS535 BIG DATA W6.B.3. 4. If the distance of the point is additionally less than the tight distance T 2, remove it from the original set

W6.B.1. FAQs CS535 BIG DATA W6.B.3. 4. If the distance of the point is additionally less than the tight distance T 2, remove it from the original set http://wwwcscolostateedu/~cs535 W6B W6B2 CS535 BIG DAA FAQs Please prepare for the last minute rush Store your output files safely Partial score will be given for the output from less than 50GB input Computer

More information

Evaluation & Validation: Credibility: Evaluating what has been learned

Evaluation & Validation: Credibility: Evaluating what has been learned Evaluation & Validation: Credibility: Evaluating what has been learned How predictive is a learned model? How can we evaluate a model Test the model Statistical tests Considerations in evaluating a Model

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Addressing the Class Imbalance Problem in Medical Datasets

Addressing the Class Imbalance Problem in Medical Datasets Addressing the Class Imbalance Problem in Medical Datasets M. Mostafizur Rahman and D. N. Davis the size of the training set is significantly increased [5]. If the time taken to resample is not considered,

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set Overview Evaluation Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes training set, validation set, test set holdout, stratification

More information

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

More information

ClusterOSS: a new undersampling method for imbalanced learning

ClusterOSS: a new undersampling method for imbalanced learning 1 ClusterOSS: a new undersampling method for imbalanced learning Victor H Barella, Eduardo P Costa, and André C P L F Carvalho, Abstract A dataset is said to be imbalanced when its classes are disproportionately

More information

Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

More information

Chapter 6. The stacking ensemble approach

Chapter 6. The stacking ensemble approach 82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Artificial Neural Network Approach for Classification of Heart Disease Dataset

Artificial Neural Network Approach for Classification of Heart Disease Dataset Artificial Neural Network Approach for Classification of Heart Disease Dataset Manjusha B. Wadhonkar 1, Prof. P.A. Tijare 2 and Prof. S.N.Sawalkar 3 1 M.E Computer Engineering (Second Year)., Computer

More information

A MINING FRAMEWORK TO DETECT NON-TECHNICAL LOSSES IN POWER UTILITIES

A MINING FRAMEWORK TO DETECT NON-TECHNICAL LOSSES IN POWER UTILITIES A MINING FRAMEWORK TO DETECT NON-TECHNICAL LOSSES IN POWER UTILITIES Félix Biscarri, Iñigo Monedero, Carlos León, Juan I. Guerrero Department of Electronic Technology, University of Seville, C/ Virgen

More information

1. Classification problems

1. Classification problems Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification

More information

6.2.8 Neural networks for data mining

6.2.8 Neural networks for data mining 6.2.8 Neural networks for data mining Walter Kosters 1 In many application areas neural networks are known to be valuable tools. This also holds for data mining. In this chapter we discuss the use of neural

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo 71251911@mackenzie.br,nizam.omar@mackenzie.br

More information

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques. International Journal of Emerging Research in Management &Technology Research Article October 2015 Comparative Study of Various Decision Tree Classification Algorithm Using WEKA Purva Sewaiwar, Kamal Kant

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

To improve the problems mentioned above, Chen et al. [2-5] proposed and employed a novel type of approach, i.e., PA, to prevent fraud.

To improve the problems mentioned above, Chen et al. [2-5] proposed and employed a novel type of approach, i.e., PA, to prevent fraud. Proceedings of the 5th WSEAS Int. Conference on Information Security and Privacy, Venice, Italy, November 20-22, 2006 46 Back Propagation Networks for Credit Card Fraud Prediction Using Stratified Personalized

More information

Credit Card Fraud Detection Using Self Organised Map

Credit Card Fraud Detection Using Self Organised Map International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1343-1348 International Research Publications House http://www. irphouse.com Credit Card Fraud

More information

Neural Networks and Back Propagation Algorithm

Neural Networks and Back Propagation Algorithm Neural Networks and Back Propagation Algorithm Mirza Cilimkovic Institute of Technology Blanchardstown Blanchardstown Road North Dublin 15 Ireland mirzac@gmail.com Abstract Neural Networks (NN) are important

More information

Enhancing Quality of Data using Data Mining Method

Enhancing Quality of Data using Data Mining Method JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2, ISSN 25-967 WWW.JOURNALOFCOMPUTING.ORG 9 Enhancing Quality of Data using Data Mining Method Fatemeh Ghorbanpour A., Mir M. Pedram, Kambiz Badie, Mohammad

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

More information

Healthcare Measurement Analysis Using Data mining Techniques

Healthcare Measurement Analysis Using Data mining Techniques www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

More information

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška A. Imbalanced Data Classification

More information

EVALUATING THE EFFECT OF DATASET SIZE ON PREDICTIVE MODEL USING SUPERVISED LEARNING TECHNIQUE

EVALUATING THE EFFECT OF DATASET SIZE ON PREDICTIVE MODEL USING SUPERVISED LEARNING TECHNIQUE International Journal of Software Engineering & Computer Sciences (IJSECS) ISSN: 2289-8522,Volume 1, pp. 75-84, February 2015 Universiti Malaysia Pahang DOI: http://dx.doi.org/10.15282/ijsecs.1.2015.6.0006

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework Jakrarin Therdphapiyanak Dept. of Computer Engineering Chulalongkorn University

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Lecture 6. Artificial Neural Networks

Lecture 6. Artificial Neural Networks Lecture 6 Artificial Neural Networks 1 1 Artificial Neural Networks In this note we provide an overview of the key concepts that have led to the emergence of Artificial Neural Networks as a major paradigm

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations

Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations Volume 3, No. 8, August 2012 Journal of Global Research in Computer Science REVIEW ARTICLE Available Online at www.jgrcs.info Comparison of Supervised and Unsupervised Learning Classifiers for Travel Recommendations

More information

A survey on Data Mining based Intrusion Detection Systems

A survey on Data Mining based Intrusion Detection Systems International Journal of Computer Networks and Communications Security VOL. 2, NO. 12, DECEMBER 2014, 485 490 Available online at: www.ijcncs.org ISSN 2308-9830 A survey on Data Mining based Intrusion

More information

Towards applying Data Mining Techniques for Talent Mangement

Towards applying Data Mining Techniques for Talent Mangement 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,

More information

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

Machine Learning Approach for Estimating Sensor Deployment Regions on Satellite Images

Machine Learning Approach for Estimating Sensor Deployment Regions on Satellite Images Machine Learning Approach for Estimating Sensor Deployment Regions on Satellite Images 1 Enes ATEŞ, 2 *Tahir Emre KALAYCI, 1 Aybars UĞUR 1 Faculty of Engineering, Department of Computer Engineering Ege

More information

Denial of Service Attack Detection Using Multivariate Correlation Information and Support Vector Machine Classification

Denial of Service Attack Detection Using Multivariate Correlation Information and Support Vector Machine Classification International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Issue-3 E-ISSN: 2347-2693 Denial of Service Attack Detection Using Multivariate Correlation Information and

More information

Index Terms: Intrusion Detection System (IDS), Training, Neural Network, anomaly detection, misuse detection.

Index Terms: Intrusion Detection System (IDS), Training, Neural Network, anomaly detection, misuse detection. Survey: Learning Techniques for Intrusion Detection System (IDS) Roshani Gaidhane, Student*, Prof. C. Vaidya, Dr. M. Raghuwanshi RGCER, Computer Science and Engineering Department, RTMNU University Nagpur,

More information

Use of Artificial Neural Network in Data Mining For Weather Forecasting

Use of Artificial Neural Network in Data Mining For Weather Forecasting Use of Artificial Neural Network in Data Mining For Weather Forecasting Gaurav J. Sawale #, Dr. Sunil R. Gupta * # Department Computer Science & Engineering, P.R.M.I.T& R, Badnera. 1 gaurav.sawale@yahoo.co.in

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University) 260 IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case

More information

SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH

SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH 330 SURVIVABILITY ANALYSIS OF PEDIATRIC LEUKAEMIC PATIENTS USING NEURAL NETWORK APPROACH T. M. D.Saumya 1, T. Rupasinghe 2 and P. Abeysinghe 3 1 Department of Industrial Management, University of Kelaniya,

More information

A Neural Network Based System for Intrusion Detection and Classification of Attacks

A Neural Network Based System for Intrusion Detection and Classification of Attacks A Neural Network Based System for Intrusion Detection and Classification of Attacks Mehdi MORADI and Mohammad ZULKERNINE Abstract-- With the rapid expansion of computer networks during the past decade,

More information

Intrusion Detection via Machine Learning for SCADA System Protection

Intrusion Detection via Machine Learning for SCADA System Protection Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. s.l.yasakethu@surrey.ac.uk J. Jiang Department

More information

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES

IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES Bruno Carneiro da Rocha 1,2 and Rafael Timóteo de Sousa Júnior 2 1 Bank of Brazil, Brasília-DF, Brazil brunorocha_33@hotmail.com 2 Network Engineering

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

AT&T Global Network Client for Windows Product Support Matrix January 29, 2015

AT&T Global Network Client for Windows Product Support Matrix January 29, 2015 AT&T Global Network Client for Windows Product Support Matrix January 29, 2015 Product Support Matrix Following is the Product Support Matrix for the AT&T Global Network Client. See the AT&T Global Network

More information

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One

More information

Comparative Analysis of Classification Algorithms on Different Datasets using WEKA

Comparative Analysis of Classification Algorithms on Different Datasets using WEKA Volume 54 No13, September 2012 Comparative Analysis of Classification Algorithms on Different Datasets using WEKA Rohit Arora MTech CSE Deptt Hindu College of Engineering Sonepat, Haryana, India Suman

More information

Towards better accuracy for Spam predictions

Towards better accuracy for Spam predictions Towards better accuracy for Spam predictions Chengyan Zhao Department of Computer Science University of Toronto Toronto, Ontario, Canada M5S 2E4 czhao@cs.toronto.edu Abstract Spam identification is crucial

More information

Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.

Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013. Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.38457 Accuracy Rate of Predictive Models in Credit Screening Anirut Suebsing

More information

NEURAL NETWORKS IN DATA MINING

NEURAL NETWORKS IN DATA MINING NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,

More information

Lecture 13: Validation

Lecture 13: Validation Lecture 3: Validation g Motivation g The Holdout g Re-sampling techniques g Three-way data splits Motivation g Validation techniques are motivated by two fundamental problems in pattern recognition: model

More information

Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing

Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University

More information

Datamining. Gabriel Bacq CNAMTS

Datamining. Gabriel Bacq CNAMTS Datamining Gabriel Bacq CNAMTS In a few words DCCRF uses two ways to detect fraud cases: one which is fully implemented and another one which is experimented: 1. Database queries (fully implemented) Example:

More information

Revenue Recovering with Insolvency Prevention on a Brazilian Telecom Operator

Revenue Recovering with Insolvency Prevention on a Brazilian Telecom Operator Revenue Recovering with Insolvency Prevention on a Brazilian Telecom Operator Carlos André R. Pinheiro Brasil Telecom SIA Sul ASP Lote D Bloco F 71.215-000 Brasília, DF, Brazil andrep@brasiltelecom.com.br

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

Artificial Neural Network-based Electricity Price Forecasting for Smart Grid Deployment

Artificial Neural Network-based Electricity Price Forecasting for Smart Grid Deployment Artificial Neural Network-based Electricity Price Forecasting for Smart Grid Deployment Bijay Neupane, Kasun S. Perera, Zeyar Aung, and Wei Lee Woon Masdar Institute of Science and Technology Abu Dhabi,

More information

Neural network software tool development: exploring programming language options

Neural network software tool development: exploring programming language options INEB- PSI Technical Report 2006-1 Neural network software tool development: exploring programming language options Alexandra Oliveira aao@fe.up.pt Supervisor: Professor Joaquim Marques de Sá June 2006

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Intrusion Detection using Artificial Neural Networks with Best Set of Features

Intrusion Detection using Artificial Neural Networks with Best Set of Features 728 The International Arab Journal of Information Technology, Vol. 12, No. 6A, 2015 Intrusion Detection using Artificial Neural Networks with Best Set of Features Kaliappan Jayakumar 1, Thiagarajan Revathi

More information

Prediction of Stock Performance Using Analytical Techniques

Prediction of Stock Performance Using Analytical Techniques 136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

More information

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network

Impelling Heart Attack Prediction System using Data Mining and Artificial Neural Network General Article International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347-5161 2014 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Impelling

More information

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor

More information

Data Mining using Artificial Neural Network Rules

Data Mining using Artificial Neural Network Rules Data Mining using Artificial Neural Network Rules Pushkar Shinde MCOERC, Nasik Abstract - Diabetes patients are increasing in number so it is necessary to predict, treat and diagnose the disease. Data

More information

Pattern-Aided Regression Modelling and Prediction Model Analysis

Pattern-Aided Regression Modelling and Prediction Model Analysis San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Fall 2015 Pattern-Aided Regression Modelling and Prediction Model Analysis Naresh Avva Follow this and

More information

FRAUD DETECTION IN MOBILE TELECOMMUNICATION

FRAUD DETECTION IN MOBILE TELECOMMUNICATION FRAUD DETECTION IN MOBILE TELECOMMUNICATION Fayemiwo Michael Adebisi 1* and Olasoji Babatunde O 1. 1 Department of Mathematical Sciences, College of Natural and Applied Science, Oduduwa University, Ipetumodu,

More information

LVQ Plug-In Algorithm for SQL Server

LVQ Plug-In Algorithm for SQL Server LVQ Plug-In Algorithm for SQL Server Licínia Pedro Monteiro Instituto Superior Técnico licinia.monteiro@tagus.ist.utl.pt I. Executive Summary In this Resume we describe a new functionality implemented

More information

Price Prediction of Share Market using Artificial Neural Network (ANN)

Price Prediction of Share Market using Artificial Neural Network (ANN) Prediction of Share Market using Artificial Neural Network (ANN) Zabir Haider Khan Department of CSE, SUST, Sylhet, Bangladesh Tasnim Sharmin Alin Department of CSE, SUST, Sylhet, Bangladesh Md. Akter

More information

Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network

Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network Time Series Data Mining in Rainfall Forecasting Using Artificial Neural Network Prince Gupta 1, Satanand Mishra 2, S.K.Pandey 3 1,3 VNS Group, RGPV, Bhopal, 2 CSIR-AMPRI, BHOPAL prince2010.gupta@gmail.com

More information

Mining the Software Change Repository of a Legacy Telephony System

Mining the Software Change Repository of a Legacy Telephony System Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,

More information

Data Mining. Nonlinear Classification

Data Mining. Nonlinear Classification Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

More information

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data

More information

Predictive Data modeling for health care: Comparative performance study of different prediction models

Predictive Data modeling for health care: Comparative performance study of different prediction models Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar

More information

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation Shanofer. S Master of Engineering, Department of Computer Science and Engineering, Veerammal Engineering College,

More information

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski trakovski@nyus.edu.mk Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems

More information