Recognizing The Theft of Identity Using Data Mining


Aniruddha Kshirsagar 1, Lalit Dole 2
1,2 CSE Department, GHRCE, Nagpur, Maharashtra, India

Abstract: Identity fraud is a major concern in e-commerce. It is more than a security issue: it is also a financial burden, because in both the transaction and application domains a perpetrator can misuse a victim's private information for economic gain. Application fraud is a prominent example of identity fraud, in which a thief uses a victim's personal information to obtain a credit card account or a loan. To counter this problem, a two-step recognition system based on data mining is proposed. The system contains two algorithms: Communal Detection (CD), which checks for multi-attribute links, and Spike Detection (SD), which checks for single-attribute links. CD targets the communal relationships in the data set, while SD finds spikes in duplicates in the data set. Together, the two algorithms can be used to detect identity theft in application fraud, and they also detect several types of attack.

Keywords: Anomaly Detection, Application Domain, Data Stream, Identity Theft.

I. INTRODUCTION

Identity crime occurs when someone steals a victim's personal information to open credit card accounts or take out loans in the victim's name without authorization, and then obtains products through those accounts. Identity crime is a form of unlawful identity change. It refers to unauthorized activities that use the identity of another person, or of a non-existent person, as the primary tool for procuring products. Identity crime can be committed with forged documents in two ways. The first is the issuance of a genuine identity document under a synthetic identity. A synthetic identity is built from several altered or forged identity documents, which are used to prove one's identity at the enrollment step.
Synthetic identity fraud is the act of creating a virtual identity in order to perpetrate criminal activities. The second way is the illegitimate use of a genuine identity document. Genuine identity details can be harder to obtain but are easier to apply successfully. In reality, identity crime can be committed with a mix of synthetic and real identity details. The manufacture and use of forged identity documents is a financial burden for governments, social welfare institutions, and financial institutions.

In the financial field, identity crime is a major concern in two domains: transactions and credit applications. The transaction domain covers identity crime in online financial transactions made through credit cards or online banking, where a fraudster disguised as the victim transacts online using the victim's credit card or bank details. In the credit application domain, identity fraud occurs when someone applies for a credit card, mortgage loan, or home loan with false information.

Fraud involving e-commerce activities such as credit card transactions poses significant problems for governments and businesses, yet detecting and preventing it is not easy. Fraud is an adaptive crime, so detecting and preventing it requires special methods of intelligent data analysis. Such methods exist in the fields of Knowledge Discovery in Databases (KDD), data mining, machine learning, and statistics, and they offer applicable and successful solutions for different areas of fraud.

II. DATA MINING: OVERVIEW

Data mining is the search for insights in data that are statistically reliable, previously unknown, and actionable. The data must be accessible, relevant, sufficient, and clean. In addition, the data mining problem must be precisely defined, must not be solvable by query and reporting tools, and must be guided by a data mining process model.
Data mining is used to classify, cluster, and segment data and to automatically find associations and rules that may signify interesting patterns, including those related to fraud. When data mining discovers meaningful patterns, data turns into information, and this information is used to detect anomalies that may indicate fraud. The purpose of data mining tools here is to gain a well-informed understanding of people's private data and of the activity logs of the document issuance system. Data mining enables an all-inclusive view of the data related to one citizen, from the enrolment step to each transaction made with the identity documents. Data mining tools take data and construct a representation of reality in the form of a model. The resulting model describes patterns and relationships present in the data. From a process perspective, data mining activities fall into three general categories:
Discovery: the process of finding hidden patterns in a database without a predetermined idea or hypothesis about what the patterns may be.

Analytical Modelling: the process of taking patterns found in the database and using them to predict the future.
Criminal Analysis: the process of applying the mined patterns to find anomalous or unusual data elements.

Data mining techniques can help companies in various fields by mining their growing databases for useful, detailed transaction information. Data mining technology reduces analysts' workload and lets them focus on investigating activities or individuals that have been tagged as suspicious.

III. LITERATURE SURVEY

Logistic regression, neural networks, and Support Vector Machines (SVM) cannot achieve scalability or handle the extreme class imbalance [1] in credit application data streams. Because fraudulent and legitimate behavior change frequently, such classifiers degrade rapidly, and supervised classification algorithms need to be retrained on new data. Training time for real-time credit application fraud detection is very high because the new training data have too many derived numerical attributes and too few known frauds. Many other data mining algorithms have also been used in fraud detection. Case-based reasoning (CBR) [6] is the known prior publication on the screening of credit applications. CBR examines the hardest cases, those misclassified by existing methods and techniques. For retrieval it uses thresholded nearest-neighbor matching; for analysis, multiple selection criteria and resolution strategies are applied to the retrieved cases. Peer group analysis [5] monitors inter-account behavior over time. It compares the cumulative mean weekly amount between a target account and other similar accounts at subsequent time points. The suspicion score is based on the standardized distance from the center of the peer group. Break point analysis [5] monitors intra-account behavior over time.
It detects sudden increases in weekly spending within a single account. Accounts are ranked based on the t-test. Bayesian networks [3] are used to detect simulated anthrax attacks in real emergency department data. Wong [2] surveys algorithms for the timely detection of suspicious activity in disease outbreaks. Goldenberg et al. [4] use time series analysis to track early symptoms of synthetic anthrax outbreaks from daily sales of retail medication and some grocery items. Existing methods use supervised learning algorithms such as neural networks and SVM, but these cannot achieve scalability and need training on new data.

IV. PROPOSED METHODOLOGY

Previous methods use supervised learning algorithms, which have scalability problems and require new training data. In the application fraud domain, where new data arrive continuously, unsupervised algorithms are therefore considered more useful for identity theft detection, because they do not need retraining on new data. Two unsupervised algorithms are used, Communal Detection and Spike Detection; their operation is explained with the help of the system architecture figure. As the figure shows, the input is a data set of credit application records. The records are processed in timestamp order, and the Communal Detection (CD) and Spike Detection (SD) algorithms then produce a combined suspicion score. The CD and SD algorithms are explained below.

Fig. System Architecture

CD Algorithm: This algorithm helps a bank that receives applications from users; specifically, it is used to detect duplicate applicants who apply again after changing their name or mobile number.

The concern is that when an applicant's name sounds nearly the same as a previously used name, and the application comes from the same home contact number, identical address, and identical area, the user may be attempting a scam through the card. The mechanism produces a white list of the users whose data is not at all similar to the data of other users. If any data is identical, the user is blacklisted and referred for manual verification, which is a step beyond normal communal detection.

The need for communal detection arises as follows. When two applications contain similar records with only minute changes, they may be related, or the same person may be applying twice. Communal detection looks for such cases. It works on a fixed set of attributes and uses a white-list oriented approach. Communal relationships are records that have near-identical values on the chosen attributes. A white list is constructed from entities that show a higher probability of communal relationships. The algorithm takes as input the exponential smoothing factor, input size threshold, state of alert, string similarity threshold, attribute threshold, exact duplicate filter, link types in the existing white list, moving window, and current application. It outputs the suspicion score, the parameter change, and the new white list.

The steps of the CD algorithm are as follows. Let $V_i$ be the current unscored application, described by the attributes $(a_1, a_2, \ldots, a_n)$; it is compared with a previously scored application $V_j$ $(a_1, a_2, \ldots, a_n)$.

1. Attribute Vector: Find the attributes that exceed the string similarity threshold, and generate multi-attribute links against the link types in the current white list when the similarity of the duplicates exceeds the attribute threshold.
The first step of the CD algorithm matches every value of the current application against a moving window of previous applications' values to find links.

$$ s(e_{i,j}) = \begin{cases} 1 & \text{if } \mathrm{sim}(a_i, a_j) > t_{sim} \\ 0 & \text{otherwise} \end{cases} \qquad S(E) = \sum s(e_{i,j}) $$

Here $S(E)$ is the attribute weight score and $e_{i,j}$ is the single-attribute match between the current value and a previous value. The first case uses Jaro-Winkler(.), a case-sensitive string similarity measure, which can also cross-match the current value against previous values from another, similar attribute. The second case is a non-match, because the values are not alike.

2. Current Weight: Using the multi-attribute links from the first step, determine the single link score. This step of the CD algorithm accounts for the attribute weights. It also matches every link of the current application against the white list to discover communal relationships and reduces their link score. Here $W_k$ is the overall current weight between $V_i$ and $V_j$, and $R_{x,\text{link-type}}$ is the link type in the current white list. The formula for $W_k$ has three cases: the first uses the attribute weights, the second gives the link score of the grey list, and the third checks whether there are any multi-attribute links.

3. Average Previous Score: Using the previous applications linked in Step 1, determine the average of their scores. This step of the CD algorithm calculates the score of every linked previous application for inclusion in the current application's score. The previous applications' scores act as the established baseline level. $S(V_j)$ is the average previous score of the previous application, and $E_O(V_j)$ is the number of outlinks from the previous application. In this equation, the first case computes each previous application's average score, while the second case applies if there is no multi-attribute link.

4. Suspicion Score: The suspicion score is computed from the results of the preceding steps.
The fourth step of the CD algorithm calculates the current application's score from every link together with the previous applications' scores.

$$ S(V_i) = S(E) + w_{i,j} + S(V_j) $$

Here the score of the current application is calculated from the prior scores of the linked applications and every link that is present.
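Steps 1 to 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: `difflib.SequenceMatcher` stands in for Jaro-Winkler, the thresholds `T_SIM` and `T_ATT`, the equal attribute weights, and the summation over all linked previous applications are assumptions, and the white-list lookup of Step 2 is omitted.

```python
from difflib import SequenceMatcher

T_SIM = 0.8  # assumed string-similarity threshold t_sim
T_ATT = 3    # assumed attribute threshold: matches needed to form a link


def sim(a, b):
    """Stand-in similarity in [0, 1]; the paper uses Jaro-Winkler instead."""
    return SequenceMatcher(None, a, b).ratio()


def attribute_vector(current, previous):
    """Step 1: s(e_ij) = 1 if sim(a_i, a_j) > t_sim, else 0, per attribute."""
    return [1 if sim(x, y) > T_SIM else 0 for x, y in zip(current, previous)]


def cd_suspicion_score(current, window, weights):
    """Steps 2-4 (simplified): add S(E) + w_ij + S(V_j) for each linked record.

    window  -- list of (attribute_values, previous_score) pairs
    weights -- one assumed weight per attribute
    """
    score = 0.0
    for prev_values, prev_score in window:
        matches = attribute_vector(current, prev_values)
        s_e = sum(matches)                    # S(E): number of matching attributes
        if s_e < T_ATT:                       # too few matches: no multi-attribute link
            continue
        w_ij = sum(m * w for m, w in zip(matches, weights))  # weighted link score
        score += s_e + w_ij + prev_score
    return score


# Hypothetical records: name, phone, address, city.
current = ("John Smith", "0712345678", "12 High Street", "Nagpur")
window = [
    (("Jon Smith", "0712345678", "12 High Street", "Nagpur"), 1.0),
    (("Maria Garcia", "0998877665", "7 Lake Road", "Pune"), 0.0),
]
weights = [0.25, 0.25, 0.25, 0.25]
score = cd_suspicion_score(current, window, weights)
```

In this toy window only the first record links to the current application (all four attributes match after the small name change), so it alone contributes to the score.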

5. Data Quality Improvement: The adaptive CD algorithm changes the value of one randomly chosen parameter, based on the suspicion score, trading effectiveness against efficiency.

SD Algorithm: Spike detection is needed to improve the adaptivity and resilience of the proposed solution for credit crime detection. SD complements CD by providing attribute weights. The algorithm takes as input the current application, the current step, the time difference filter, the similarity threshold, and the exponential smoothing factor. It outputs the suspicion score and the attribute weights.

The steps of the SD algorithm are as follows. $V_i$ is the current unscored application.

1. Attribute Vector: Each value of the current application is matched against previous applications' values to find links, using the following equation.

$$ s(e_{i,j}) = \begin{cases} 1 & \text{if } \mathrm{sim}(a_i, a_j) > t_{sim} \\ 0 & \text{otherwise} \end{cases} \qquad S(E) = \sum s(e_{i,j}) $$

Here $e_{i,j}$ is the single-attribute match between the current value and a previous value. The first case uses Jaro-Winkler(.) (Gordon et al, 2007), a case-sensitive string similarity measure, which can also cross-match the current value against previous values from another, similar attribute. Time(.) is the time difference in minutes. The second case is a non-match, because the values are dissimilar or recur too quickly.

2. Single Value Spike Detection: Based on the matches from the first step, the score of each current value is calculated. This step integrates all steps to find spikes; the previous steps act as the established baseline level.

$$ S(a_{i,k}) = (1-\alpha)\, s_t(a_{i,k}) + \alpha \, \frac{\sum_{\tau=1}^{t-1} s_\tau(a_{i,k})}{t-1} $$

Here $S(a_{i,k})$ is the current value score, $s_t(a_{i,k})$ is the score at the current step $t$, and $\alpha$ is the exponential smoothing factor.

3. Multiple Value Score: In this step the score of every current application is calculated from the scores of all its values together with the attribute weights.

$$ S(V_i) = \sum_{k} S(a_{i,k}) \, w_k $$

Here $S(V_i)$ is the SD suspicion score of the current application.

4. CD Attribute Weights Change: At the end of every current mini-discrete data stream, this step of the SD algorithm updates the attribute weights used for CD. Here $w_k$ is the SD attribute weight applied to the CD attributes.

V. RESULT

The experiments use the communal fraud scoring data set; the file contains records with 21 attributes. The records are filtered for redundant or duplicate records. This data set does not contain any such records, so all of the records are stored in the database.
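Before the walk-through of the results, the SD scoring from Section IV (single-value score in step 2, weighted sum in step 3) can be illustrated with a small Python sketch. It assumes that the single-value score blends the current-step score with the average of the earlier steps through the smoothing factor alpha; the function names, the fallback for an empty history, and the sample numbers are hypothetical.

```python
def single_value_score(history, current_step_score, alpha):
    """Step 2 (assumed form): (1 - alpha) * s_t + alpha * mean of earlier steps.

    history            -- scores of this value in the previous t-1 steps
    current_step_score -- s_t, the score of this value in the current step
    alpha              -- exponential smoothing factor in [0, 1]
    """
    if not history:
        # Assumption: with no earlier steps there is no baseline to blend in.
        return current_step_score
    baseline = sum(history) / len(history)
    return (1 - alpha) * current_step_score + alpha * baseline


def sd_suspicion_score(value_scores, weights):
    """Step 3: weighted sum of the single-value scores, sum_k S(a_ik) * w_k."""
    return sum(s * w for s, w in zip(value_scores, weights))


# Hypothetical example: a phone number seen once per step, then four times.
phone_score = single_value_score([1, 1, 1], 4, alpha=0.5)
name_score = single_value_score([1, 1, 1], 1, alpha=0.5)
application_score = sd_suspicion_score([phone_score, name_score], [0.5, 0.5])
```

A value that suddenly recurs more often than its baseline (the phone number here) receives a higher single-value score than a value that stays at its usual level, which is what lets the weighted sum flag a spike.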

From the given timestamps, a date is selected to obtain the applications for that date; here 1/1/2004 is selected, which contains 35 records. Link types are generated for this date, based on the similarity between the records. If records have very similar attributes, they are treated as fraud: if three attributes match between two records, the pair is taken as fraud. Records 0 and 2 have more than three similar attributes, so they are flagged as crime. The weight of the link types is also calculated, and the weighted link types form the white list. From the white list the attribute weights are calculated; based on these weights, the single link score is produced and the suspicion score is calculated. The parameters are changed based on the suspicion score. After the parameters are changed, Spike Detection (SD) is also applied for this date. Link types and weights are calculated again for the 35 records of this date. This time a link type is created if the records contain four exactly matching attributes. After the link types are calculated, the weights are calculated. Based on the Communal Detection (CD) results, the SD weights are updated; the weights are calculated, and then the multiple value score. Based on the SD weights, the CD weights are updated. At the end of every current data stream, the SD algorithm calculates and updates the attribute weights for CD.

VI. CONCLUSION

The system detects fraud in online credit card and loan applications. It is used to guard against duplicates and fraudsters when credit cards or loans are applied for. Data mining algorithms are used in this system, namely communal detection and spike detection, which detect multiple applications by the same applicant. Combining spike detection and communal detection makes the system more efficient and secure.
The identity thief has limited time, because innocent people can detect the fraud early and the victim can quickly take the necessary action.

REFERENCES

[1] D. Hand, Classifier Technology and the Illusion of Progress, Statistical Science, vol. 21, no. 1, pp. 1-15, doi: / ,
[2] W. Wong, Data Mining for Early Disease Outbreak Detection, PhD thesis, Carnegie Mellon Univ.,
[3] W. Wong, A. Moore, G. Cooper, and M. Wagner, Bayesian Network Anomaly Pattern Detection for Detecting Disease Outbreaks, Proc. 20th Int'l Conf. Machine Learning (ICML 03), pp ,
[4] A. Goldenberg, G. Shmueli, R. Caruana, and S. Fienberg, Early Statistical Detection of Anthrax Outbreaks by Tracking Over-the-Counter Medication Sales, Proc. Nat'l Academy of Sciences USA (PNAS 02), vol. 99, no. 8, pp ,
[5] R. Bolton and D. Hand, Unsupervised Profiling Methods for Fraud Detection, Statistical Science, vol. 17, no. 3, pp ,
[6] I. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java. Morgan Kaufmann,
[7] A. Bifet and R. Kirkby, Massive Online Analysis, Technical Manual, Univ. of Waikato,
[8] Experian, Experian Detect: Application Fraud Prevention System, Whitepaper, f/ experian_detect.pdf,
[9] T. Fawcett, An Introduction to ROC Analysis, Pattern Recognition Letters, vol. 27, pp , 2006, doi: /j.patrec


More information

FRAUD DETECTION AND PREVENTION: A DATA ANALYTICS APPROACH BY SESHIKA FERNANDO TECHNICAL LEAD, WSO2

FRAUD DETECTION AND PREVENTION: A DATA ANALYTICS APPROACH BY SESHIKA FERNANDO TECHNICAL LEAD, WSO2 FRAUD DETECTION AND PREVENTION: A DATA ANALYTICS APPROACH BY SESHIKA FERNANDO TECHNICAL LEAD, WSO2 TABLE OF CONTENTS 1. Fraud: The Bad and the Ugly... 03 2. A New Opportunity for Fraud Detection... 03

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Unsupervised Outlier Detection in Time Series Data

Unsupervised Outlier Detection in Time Series Data Unsupervised Outlier Detection in Time Series Data Zakia Ferdousi and Akira Maeda Graduate School of Science and Engineering, Ritsumeikan University Department of Media Technology, College of Information

More information

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR Bharti S. Takey 1, Ankita N. Nandurkar 2,Ashwini A. Khobragade 3,Pooja G. Jaiswal 4,Swapnil R.

More information

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES International Journal of Scientific and Research Publications, Volume 4, Issue 4, April 2014 1 CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES DR. M.BALASUBRAMANIAN *, M.SELVARANI

More information

A Survey on Intrusion Detection System with Data Mining Techniques

A Survey on Intrusion Detection System with Data Mining Techniques A Survey on Intrusion Detection System with Data Mining Techniques Ms. Ruth D 1, Mrs. Lovelin Ponn Felciah M 2 1 M.Phil Scholar, Department of Computer Science, Bishop Heber College (Autonomous), Trichirappalli,

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo 71251911@mackenzie.br,nizam.omar@mackenzie.br

More information

Assessing Data Mining: The State of the Practice

Assessing Data Mining: The State of the Practice Assessing Data Mining: The State of the Practice 2003 Herbert A. Edelstein Two Crows Corporation 10500 Falls Road Potomac, Maryland 20854 www.twocrows.com (301) 983-3555 Objectives Separate myth from reality

More information

KEITH LEHNERT AND ERIC FRIEDRICH

KEITH LEHNERT AND ERIC FRIEDRICH MACHINE LEARNING CLASSIFICATION OF MALICIOUS NETWORK TRAFFIC KEITH LEHNERT AND ERIC FRIEDRICH 1. Introduction 1.1. Intrusion Detection Systems. In our society, information systems are everywhere. They

More information

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences

More information

A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model

A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model Twinkle Patel, Ms. Ompriya Kale Abstract: - As the usage of credit card has increased the credit card fraud has also increased

More information

Football Match Winner Prediction

Football Match Winner Prediction Football Match Winner Prediction Kushal Gevaria 1, Harshal Sanghavi 2, Saurabh Vaidya 3, Prof. Khushali Deulkar 4 Department of Computer Engineering, Dwarkadas J. Sanghvi College of Engineering, Mumbai,

More information

Anomaly and Fraud Detection with Oracle Data Mining 11g Release 2

Anomaly and Fraud Detection with Oracle Data Mining 11g Release 2 Oracle 11g DB Data Warehousing ETL OLAP Statistics Anomaly and Fraud Detection with Oracle Data Mining 11g Release 2 Data Mining Charlie Berger Sr. Director Product Management, Data

More information

A Web-based Interactive Data Visualization System for Outlier Subspace Analysis

A Web-based Interactive Data Visualization System for Outlier Subspace Analysis A Web-based Interactive Data Visualization System for Outlier Subspace Analysis Dong Liu, Qigang Gao Computer Science Dalhousie University Halifax, NS, B3H 1W5 Canada dongl@cs.dal.ca qggao@cs.dal.ca Hai

More information

The Data Mining Process

The Data Mining Process Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan

ECE 533 Project Report Ashish Dhawan Aditi R. Ganesan Handwritten Signature Verification ECE 533 Project Report by Ashish Dhawan Aditi R. Ganesan Contents 1. Abstract 3. 2. Introduction 4. 3. Approach 6. 4. Pre-processing 8. 5. Feature Extraction 9. 6. Verification

More information

A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries

A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries Aida Mustapha *1, Farhana M. Fadzil #2 * Faculty of Computer Science and Information Technology, Universiti Tun Hussein

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

The University of Jordan

The University of Jordan The University of Jordan Master in Web Intelligence Non Thesis Department of Business Information Technology King Abdullah II School for Information Technology The University of Jordan 1 STUDY PLAN MASTER'S

More information

GETTING REAL ABOUT SECURITY MANAGEMENT AND "BIG DATA"

GETTING REAL ABOUT SECURITY MANAGEMENT AND BIG DATA GETTING REAL ABOUT SECURITY MANAGEMENT AND "BIG DATA" A Roadmap for "Big Data" in Security Analytics ESSENTIALS This paper examines: Escalating complexity of the security management environment, from threats

More information

Research of Postal Data mining system based on big data

Research of Postal Data mining system based on big data 3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Recognize the many faces of fraud

Recognize the many faces of fraud Recognize the many faces of fraud Detect and prevent fraud by finding subtle patterns and associations in your data Contents: 1 Introduction 2 The many faces of fraud 3 Detect healthcare fraud easily and

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

A survey on Data Mining based Intrusion Detection Systems

A survey on Data Mining based Intrusion Detection Systems International Journal of Computer Networks and Communications Security VOL. 2, NO. 12, DECEMBER 2014, 485 490 Available online at: www.ijcncs.org ISSN 2308-9830 A survey on Data Mining based Intrusion

More information

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image

More information

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION http:// IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION Harinder Kaur 1, Raveen Bajwa 2 1 PG Student., CSE., Baba Banda Singh Bahadur Engg. College, Fatehgarh Sahib, (India) 2 Asstt. Prof.,

More information

A Fast Fraud Detection Approach using Clustering Based Method

A Fast Fraud Detection Approach using Clustering Based Method pp. 33-37 Krishi Sanskriti Publications http://www.krishisanskriti.org/jbaer.html A Fast Detection Approach using Clustering Based Method Surbhi Agarwal 1, Santosh Upadhyay 2 1 M.tech Student, Mewar University,

More information

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation

An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation Shanofer. S Master of Engineering, Department of Computer Science and Engineering, Veerammal Engineering College,

More information

Web Forensic Evidence of SQL Injection Analysis

Web Forensic Evidence of SQL Injection Analysis International Journal of Science and Engineering Vol.5 No.1(2015):157-162 157 Web Forensic Evidence of SQL Injection Analysis 針 對 SQL Injection 攻 擊 鑑 識 之 分 析 Chinyang Henry Tseng 1 National Taipei University

More information

Insider Threat Detection Using Graph-Based Approaches

Insider Threat Detection Using Graph-Based Approaches Cybersecurity Applications & Technology Conference For Homeland Security Insider Threat Detection Using Graph-Based Approaches William Eberle Tennessee Technological University weberle@tntech.edu Lawrence

More information

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm R.Karthiyayini 1, J.Jayaprakash 2 Assistant Professor, Department of Computer Applications, Anna University (BIT Campus),

More information

GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL CLUSTERING

GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL CLUSTERING Geoinformatics 2004 Proc. 12th Int. Conf. on Geoinformatics Geospatial Information Research: Bridging the Pacific and Atlantic University of Gävle, Sweden, 7-9 June 2004 GEO-VISUALIZATION SUPPORT FOR MULTIDIMENSIONAL

More information

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Software Engineering for Big Data CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo Big Data Big data technologies describe a new generation of technologies that aim

More information

Pentaho Data Mining Last Modified on January 22, 2007

Pentaho Data Mining Last Modified on January 22, 2007 Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES

IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES Bruno Carneiro da Rocha 1,2 and Rafael Timóteo de Sousa Júnior 2 1 Bank of Brazil, Brasília-DF, Brazil brunorocha_33@hotmail.com 2 Network Engineering

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

Predictive Data modeling for health care: Comparative performance study of different prediction models

Predictive Data modeling for health care: Comparative performance study of different prediction models Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar

More information

Immune Support Vector Machine Approach for Credit Card Fraud Detection System. Isha Rajak 1, Dr. K. James Mathai 2

Immune Support Vector Machine Approach for Credit Card Fraud Detection System. Isha Rajak 1, Dr. K. James Mathai 2 Immune Support Vector Machine Approach for Credit Card Fraud Detection System. Isha Rajak 1, Dr. K. James Mathai 2 1Department of Computer Engineering & Application, NITTTR, Shyamla Hills, Bhopal M.P.,

More information

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

More information

OPERA SOLUTIONS CAPABILITIES. ACH and Wire Fraud: advanced anomaly detection to find and stop costly attacks

OPERA SOLUTIONS CAPABILITIES. ACH and Wire Fraud: advanced anomaly detection to find and stop costly attacks OPERA SOLUTIONS CAPABILITIES ACH and Wire Fraud: advanced anomaly detection to find and stop costly attacks 2 The information you need to fight fraud does exist You just have to know it when you see it

More information

Denial of Service Attack Detection Using Multivariate Correlation Information and Support Vector Machine Classification

Denial of Service Attack Detection Using Multivariate Correlation Information and Support Vector Machine Classification International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Issue-3 E-ISSN: 2347-2693 Denial of Service Attack Detection Using Multivariate Correlation Information and

More information

Classification and Prediction techniques using Machine Learning for Anomaly Detection.

Classification and Prediction techniques using Machine Learning for Anomaly Detection. Classification and Prediction techniques using Machine Learning for Anomaly Detection. Pradeep Pundir, Dr.Virendra Gomanse,Narahari Krishnamacharya. *( Department of Computer Engineering, Jagdishprasad

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model

Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model Trung Le 1, Ba Quy Tran 2, Hanh Dang Thi My 3, Thanh Hung Ngo 4 1 GSR, Information System Lab., University of

More information

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)

Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University) 260 IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case

More information

Data Mining: Overview. What is Data Mining?

Data Mining: Overview. What is Data Mining? Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

More information

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

ClusterOSS: a new undersampling method for imbalanced learning

ClusterOSS: a new undersampling method for imbalanced learning 1 ClusterOSS: a new undersampling method for imbalanced learning Victor H Barella, Eduardo P Costa, and André C P L F Carvalho, Abstract A dataset is said to be imbalanced when its classes are disproportionately

More information

SAS Fraud Framework for Banking

SAS Fraud Framework for Banking SAS Fraud Framework for Banking Including Social Network Analysis John C. Brocklebank, Ph.D. Vice President, SAS Solutions OnDemand Advanced Analytics Lab SAS Fraud Framework for Banking Agenda Introduction

More information