



Immune Support Vector Machine Approach for Credit Card Fraud Detection System

Isha Rajak 1, Dr. K. James Mathai 2
1 Department of Computer Engineering & Application, NITTTR, Shyamla Hills, Bhopal M.P., INDIA
2 Associate Professor, Department of Computer Engineering & Application, NITTTR, Shyamla Hills, Bhopal M.P., INDIA
isharajak20@gmail.com, kjmathai@nitttrbpl.ac.in

ABSTRACT

These days, financial institutions usually develop fraud detection systems targeted to their own asset bases. Fraud detection systems are most prevalent in credit card transactions, telecommunications, network intrusions, finance and insurance, and scientific applications. In recent years the number of fraud cases has been on the rise. The traditional, age-old system of intelligence and fraud record maintenance has failed to live up to the requirements of the existing fraud scenario. Fraud and fraudulent activities are widely discussed across industries, and fraud in business and in daily life has therefore become an important area of research. There are various approaches to fraud detection, such as classification, clustering, regression, and association. In this paper, the researcher concentrates on a hybrid approach based on danger theory and the support vector machine (SVM), which takes advantage of danger theory to remove fraudulent transactions and then uses an SVM to classify these transactions.

Index Terms: Support Vector Machine (SVM), Danger Theory, Fraud Detection, Credit Card Fraud, Ensemble Approach, Hybrid Approach.

I. INTRODUCTION

Fraud is committed mainly for financial benefit, or to cause loss to others, through implicit or explicit deception; it is a way for the fraudster to gain an unlawful advantage. Credit card fraud is a major problem for financial institutions globally and accounts for substantial monetary losses every year.
Fraud can be defined as criminal deception intended to result in financial gain. Along with developments in information technology, fraud has spread all over the world, resulting in huge financial losses [1]. With the increased use of credit cards, fraudsters are also finding more opportunities for fraudulent activity, which causes large financial losses to banks as well as cardholders [2]. Fraud detection is a vital business function for minimizing the effects of unauthorized transactions on an organization's customer service delivery, bottom-line expenditure, and business reputation through the deployment of innovative fraud technology frameworks. Fraud detection is about identifying fraud as soon as possible and responding to it [3]. Institutions are now moving towards increasingly proactive methods of fraud detection, with real-time screening of financial data and the triggering of a preventive response prior to transaction completion, in order to minimize the potential fraud deficit [4].

This paper explains credit card fraud detection, the Artificial Immune System, the Support Vector Machine, and their integration. The structure of the paper is as follows: Section II covers the critical aspects of credit card fraud detection; Section III briefly explains the need for the Artificial Immune System and danger theory before implementing the Support Vector Machine, which is described in Section IV; Section V presents the proposed research, an integrated solution for fraud detection based on SVM and danger theory.

2014, IJAFRC All Rights Reserved www.ijafrc.org

II. CREDIT CARD FRAUD DETECTION

The areas most vulnerable to fraudulent activity are unauthorized credit card usage, cell phone billing, superfluous insurance claims, and stock exchange insider trading. A stolen credit card is used in a way that differs from the normal pattern. The usage pattern of a stolen card is compared against the regular usage data of the actual owner, and fraudsters are thus detected from the credit card transaction data [6]. Criminal rings of illegal insurance claimants and providers manipulate the claim processing system to file unauthorized claims. Tracking such activities helps the company avoid financial losses. Neural network based techniques have been successfully applied to detect such outliers. Insider trading is a criminal activity in the stock market, in which an insider profits from information before it becomes public.

Altogether, credit cards are an attractive target for fraud, since a large amount of money can be obtained in a very short time without taking many risks. This is because the crime is normally detected only several weeks after the fact [5]. Credit card fraud is a critical and sensitive issue. It can be committed by making purchases without permission or by counterfeiting a credit card [7]. The various kinds of credit card fraud include online credit card fraud, shave and paste, stolen card numbers, and advance payments. If the current trend continues, this fraud will exceed the number of paid checks, possibly even before the end of the decade. As the industry continues to expand and offer credit to more and more consumers, fraud will also grow.
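The comparison of a card's current usage against the owner's regular usage, described earlier in this section, can be illustrated with a small sketch. The z-score rule, the threshold of 3.0, and the names `deviation_score`, `is_unusual`, and `past_amounts` are assumptions made for illustration; the paper does not specify how the comparison is performed.

```python
from statistics import mean, stdev


def deviation_score(history, amount):
    """Return how many sample standard deviations `amount` lies from the
    mean of the cardholder's past transaction amounts (a plain z-score)."""
    if len(history) < 2:
        raise ValueError("need at least two past transactions")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts identical: any different amount is maximally unusual.
        return 0.0 if amount == mu else float("inf")
    return abs(amount - mu) / sigma


def is_unusual(history, amount, threshold=3.0):
    """Flag a transaction whose amount deviates strongly from past usage.
    The threshold is a hypothetical tuning parameter."""
    return deviation_score(history, amount) > threshold


# Hypothetical spending history of one cardholder.
past_amounts = [20.0, 35.0, 25.0, 30.0, 40.0]
print(is_unusual(past_amounts, 32.0))   # close to the usual spend
print(is_unusual(past_amounts, 900.0))  # far from the usual spend
```

A real system would compare many attributes (merchant, location, time of day) rather than the amount alone; a single feature is used here only to keep the idea visible.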
A fraud detection mechanism is given a set of credit card transactions and must identify those that are fraudulent. A desirable fraud detection system needs good metrics for its evaluation. The system should take into account both the cost of the fraudulent behaviour detected and the cost associated with stopping it. In fact, there is a decision layer on top of the fraud detection system. This layer decides what action to take when fraudulent behaviour is detected by the fraud detection system [5, 8].

III. ARTIFICIAL IMMUNE SYSTEM

The problems found in a computer security system are quite similar to those encountered in a Biological Immune System (BIS), since both have to maintain stability in a changing environment. Because of the numerous desirable characteristics of the natural immune system, such as diversity, self-tolerance, immune memory, distributed computation, self-organization, self-learning, self-adaptation, and robustness, BIS has attracted the attention of many researchers [9] [10]. At the same time, Artificial Immune Systems (AIS) have become an increasingly popular computational intelligence paradigm [11] [12].

Because AIS are still relatively young and the natural immune system (NIS) is one of the most complex systems under active study by biologists, there are distinct viewpoints about the main goal of the NIS. These ideas and understandings are extremely important for AIS researchers and designers. The two main viewpoints are the self/non-self theory and danger theory. Classical immunology stipulates that an immune response is triggered if and only if the body encounters something non-self, that is, a foreign particle [13]. This viewpoint is generally accepted by immunologists, and AIS researchers have built models based on it.
Many questions arise from this viewpoint, however, and a new theory called Danger Theory has been developed. The central idea of danger theory is that the immune system does not react to non-self but to danger. Like the self/non-self theory, it fundamentally supports the need for discrimination. It differs, however, in its answer to what should be responded to: instead of reacting to foreignness, the immune system reacts to danger [14].

The concept of AIS as a whole, including danger theory, can be used to find abnormal users who could be part of fraudulent activity. Danger theory has several advantages. It offers a convenient and effective way to separate abnormal data from the normal dataset. It improves the quality of the output by eliminating bad cases. At the same time it also overburdens the system, because it requires extra time to filter out abnormal data. The researcher therefore believes that the Artificial Immune System and danger theory are needed before implementing the Support Vector Machine, in order to filter abnormal data from the actual dataset for the purpose of detecting credit card fraud.

IV. SUPPORT VECTOR MACHINE

The Support Vector Machine (SVM) was developed by Vapnik [15]. The ease and quality of SVM training go well beyond the capabilities of more traditional methods. SVM can handle complex problems in many areas, for example text and image classification, handwriting recognition, and bioinformatics. An important property of SVM is that it performs well on data sets that have many attributes, even when there are very few cases on which to train the model.

The support vector machine is a training algorithm: it trains a classifier to predict the class of a new sample. SVM is based on the idea of decision planes that define decision boundaries; the training points that determine the boundary between the classes are called support vectors.
SVM is based on the machine learning algorithm invented by Vapnik in the 1960s. It is also based on the structural risk minimization principle, which helps prevent overfitting. Two key elements of the SVM technique are mathematical programming and kernel functions. SVM finds an optimal separating hyperplane between data points of different classes in a high-dimensional space. Two classes are assumed for classification, P and N, with labels $y_n = 1$ and $y_n = -1$ respectively; this can be extended to K-class classification by using K two-class classifiers. The Support Vector Classifier (SVC) searches for a hyperplane for classification. The SVC as outlined is linear, so kernel functions are introduced to obtain nonlinear decision surfaces.

A. Linear SVC

Let $w$ be the weight vector and $x_n$ the nearest data point. Then

$w^T x_n + b \geq 1$ for $x_n \in P$, and $w^T x_n + b \leq -1$ for $x_n \in N$.

[Figure 1. Linear SVC: axes A1 and A2, showing the support vectors, the optimal hyperplane, and the optimal margin.]
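The link between the two constraints of the linear SVC and the optimal margin of Figure 1 can be made explicit with a short derivation. It follows the standard hard-margin argument; the symbols $x_+$ and $x_-$, denoting support vectors on the P side and the N side, are introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
w^T x_+ + b &= 1, \qquad w^T x_- + b = -1 \\
w^T (x_+ - x_-) &= 2 \\
\text{margin} &= \frac{w^T (x_+ - x_-)}{\lVert w \rVert} = \frac{2}{\lVert w \rVert}
\end{aligned}
```

The margin is the projection of $x_+ - x_-$ onto the unit normal $w / \lVert w \rVert$ of the hyperplane, so maximizing the margin is equivalent to minimizing $\frac{1}{2} w^T w$ under the two constraints.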

The optimization problem is to minimize $\frac{1}{2} w^T w$ subject to $y_n (w^T x_n + b) \geq 1$ for $n = 1, \dots, N$.

B. Non-linear SVC

SVM can also be extended to learn non-linear decision functions; the non-linear SVC is used for this purpose.

C. Non-separable case

When noise is present in the training data, some data points may be misclassified.

(i) Advantages
a. SVM is among the more accurate machine learning methods. It finds the best classification function for the training data.
b. SVM is less prone to overfitting than other methods.

(ii) Disadvantages
a. It is computationally expensive.
b. SVMs require a large amount of training time and a large amount of storage, and their results are poorly interpretable.

(iii) Challenges in SVM
a. The behaviour of an SVM is determined by its kernel, so finding the best kernel for a given application is an open challenge.
b. The SVM model is very expensive in terms of space and evaluation.

V. PROPOSED WORK

Considering the advantages of applying danger theory before implementing the Support Vector Machine, the researchers propose to develop an immune Support Vector Machine (SVM) approach for a Credit Card Fraud Detection System, as shown in Figure 2.

[Figure 2. Architecture of Proposed Work: a dataset from the log file of web users feeds the hybrid approach (danger theory and SVM classifier), followed by various pattern analyses.]
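The two-stage architecture of Figure 2 can be sketched in a few lines. This is only one possible reading of the proposal, under loud assumptions: the paper does not define the danger signal, so `danger_signal` below is a hypothetical rule on standardized features; the SVM is a linear soft-margin model trained by plain subgradient descent on the hinge loss rather than by a quadratic programming solver; and the data, labels (+1 for fraud, -1 for legitimate), and all function names are invented for illustration.

```python
def danger_signal(features, limit=3.0):
    # Hypothetical danger signal (not from the paper): a record is treated
    # as dangerous when any standardized feature is extreme.
    return any(abs(v) > limit for v in features)


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    # Full-batch subgradient descent on the regularized hinge loss
    #   lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
    dim = len(samples[0])
    n = len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified
                grad_w = [g - y * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fit_pipeline(samples, labels):
    # Stage 1: set aside records that raise the danger signal.
    kept = [(x, y) for x, y in zip(samples, labels) if not danger_signal(x)]
    # Stage 2: train the SVM on the remaining records only.
    return train_linear_svm([x for x, _ in kept], [y for _, y in kept])


def classify(model, features):
    if danger_signal(features):
        return "danger"
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, features)) + b
    return "fraud" if score >= 0 else "legitimate"


# Invented, standardized two-feature records; +1 = fraud, -1 = legitimate.
samples = [(1.0, 1.0), (1.5, 0.5), (0.5, 1.5), (2.0, 1.0),
           (-1.0, -1.0), (-1.5, -0.5), (-0.5, -1.5), (-2.0, -1.0),
           (6.0, 0.2)]  # the last record is removed by the danger filter
labels = [1, 1, 1, 1, -1, -1, -1, -1, 1]

model = fit_pipeline(samples, labels)
for record in [(1.2, 0.8), (-0.8, -1.1), (7.0, 0.0)]:
    print(record, classify(model, record))
```

Filtering before training keeps extreme records from distorting the separating hyperplane, at the cost of an extra pass over the data, which matches the trade-off described in Section III.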

The proposed ensemble method has two parts, as shown in Figure 2:
1. Danger Theory
2. Support Vector Machine (SVM)

The researcher assumes that a hybrid of the two approaches would give better results than a single approach. Before applying the SVM technique to detect fraud, the researcher would use danger theory to filter out bad users' data prior to the start of the actual work (as discussed in Section III). That is the main reason behind the ensemble approach to credit card fraud detection.

VI. CONCLUSION AND FUTURE WORK

Due to the rapid advances of electronic commerce on the Internet, the use of credit cards for purchasing has become both necessary and convenient. With the growing number of credit card transactions, however, more opportunities become available for thieves to steal credit card numbers and subsequently commit fraud. Transaction fraud detection is an important, but hard, real-world problem. A significant amount of engineering is required to produce effective solutions. This challenge motivates the researcher to improve the performance of the fraud detection system. There is scope for improving the system by increasing its performance in terms of fraud classification. The researcher believes that the SVM model, in conjunction with the danger theory concept of the Artificial Immune System (AIS), can be efficiently optimized to reduce credit card fraud, so that the user experiences the best performance from a highly scalable and reliable solution.

VII. REFERENCES

[1] S. Panigrahi, A. Kundu, et al., "Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning", Information Fusion, Vol. 10, No. 4, pp. 354-363, 2009.
[2] C. A. W. Paasch, "Credit card fraud detection using artificial neural networks tuned by genetic algorithms." Thesis (Ph.D.), The Hong Kong University of Science and Technology, February 2008.
[3] R. Huang, H. Tawfik and A. K. Nagar, "A Novel Hybrid Artificial Immune Inspired Approach for Online Break-in Fraud Detection," Faculty of Business and Computer Sciences, Liverpool Hope University, Liverpool, United Kingdom, International Conference on Computational Science, ICCS 2010.
[4] M. Krivko, "A Hybrid Model For Plastic Card Fraud Detection Systems", Expert Systems with Applications, Vol. 37, No. 8, pp. 6070-6076, 2010.
[5] S. Maes, K. Tuyls, B. Vanschoenwinkel and B. Manderick, "Credit Card Fraud Detection using Bayesian and Neural Networks", Proc. of the 1st International NAISO Congress on Neuro Fuzzy Technologies, 2002.
[6] R. J. Bolton and D. J. H, "Unsupervised profiling methods for fraud detection", in Proc. Credit Scoring and Credit Control VII, 2001, pp. 5-7.

[7] L. Balan and M. Popescu, "Credit card fraud", The USV Annals of Economics and Public Administration, Vol. 11, No. 1, pp. 81-85, 2011.
[8] D. Tetro, E. Lipton and A. Sackheim, "System and method for enhanced fraud detection in automated electronic credit card processing", U.S. Patent No. 6,095,413, 1 Aug. 2000.
[9] F. Sun and F. Xu, "Antibody concentration based method for network security situation awareness", Proc. of the 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009), IEEE Press, pp. 1-4, June 2009.
[10] F. Sun and S. Cheng, "A gene technology inspired paradigm for user authentication", Proc. of the 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009), IEEE Press, pp. 1-3, June 2009.
[11] W. Zhang, C. Wu and X. Liu, "Construction and enumeration of Boolean functions with maximum algebraic immunity", Science in China, Series F: Information Science, Vol. 52, pp. 32-40, January 2009.
[12] F. Sun and Z. Wu, "A new risk assessment model for e-government network security based on antibody concentration", IEEE, Proc. of the 2009 International Conference on E-Learning, E-Business, Enterprise Information Systems, and E-Government, pp. 119-121, December 2009.
[13] A. S. Perelson and G. Weisbuch, "Immunology for Physicists", Rev. of Modern Physics, Vol. 69, No. 4, pp. 1219-1267, 1997.
[14] P. Matzinger, "The Danger Model: A Renewed Sense of Self", Science, Vol. 296, pp. 301-305, 2002.
[15] V. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1995.