Immune Support Vector Machine Approach for Credit Card Fraud Detection System

Isha Rajak 1, Dr. K. James Mathai 2
1 Department of Computer Engineering & Application, NITTTR, Shyamla Hills, Bhopal M.P., INDIA
2 Associate Professor, Department of Computer Engineering & Application, NITTTR, Shyamla Hills, Bhopal M.P., INDIA
isharajak20@gmail.com, kjmathai@nitttrbpl.ac.in

ABSTRACT
Financial institutions usually develop fraud detection systems targeted to their own asset bases. Fraud detection systems are most prevalent in credit card transactions, telecommunications, network intrusion, finance and insurance, and scientific applications. In recent years the number of fraud cases has been on the rise. The traditional system of intelligence and fraud record maintenance has failed to meet the requirements of the current fraud scenario. Fraud and fraudulent activities are widely discussed across different industries, and fraudulent activity in business and in daily life has therefore become an important area of research. Various hybrid approaches to fraud detection exist, based on classification, clustering, regression, association and so on. In this paper, the researcher concentrates on a hybrid approach based on Danger Theory and the Support Vector Machine (SVM), which takes advantage of Danger Theory to remove fraudulent transactions and then uses SVM to classify the transactions.

Index Terms: Support Vector Machine (SVM), Danger Theory, Fraud Detection, Credit Card Fraud, Ensemble Approach, Hybrid Approach.

I. INTRODUCTION
Fraud is committed mainly for financial benefit, or to cause loss to another party through deception, whether implicit or explicit; it is a way for the fraudster to gain an unlawful advantage. Credit card fraud is a major problem for financial institutions globally and is responsible for substantial dollar losses every year.
Fraud can be defined as criminal deception intended to result in financial gain. Along with developments in information technology, fraud has spread all over the world, resulting in huge financial losses [1]. With the increased use of credit cards, fraudsters are also finding more opportunities for fraudulent activity, which exposes banks as well as cardholders to large financial losses [2]. Fraud detection is a vital business function for minimizing the effects of unauthorized transactions on an organization's customer service delivery, bottom-line expenditure and business reputation through the deployment of innovative fraud technology frameworks. Fraud detection is about identifying fraud as soon as possible and responding to it [3]. Institutions are now moving towards increasingly proactive methods of fraud detection that screen financial data in real time and trigger a preventive response prior to transaction completion, in order to minimize the potential fraud deficit [4].

This paper explains credit card fraud detection, the Artificial Immune System, the Support Vector Machine, and their integration. The structure of the paper is as follows: Section II discusses the critical aspects of credit card fraud detection; Section III briefly explains the need for an Artificial Immune System and Danger Theory before implementing the Support Vector Machine, which is covered in Section IV; Section V presents the proposed integrated solution for fraud detection based on SVM and Danger Theory.

2014, IJAFRC All Rights Reserved www.ijafrc.org

II. CREDIT CARD FRAUD DETECTION
The areas most vulnerable to fraudulent activity are unauthorized credit card usage, cell phone billing, superfluous insurance claims and stock exchange insider trading. A stolen credit card is typically used in a way that departs from the owner's normal pattern. The usage pattern of a stolen card is compared against the regular usage data of the actual owner, and fraud is thereby detected from the credit card transaction data [6]. Criminal rings of illegal insurance claimants and providers manipulate the claim processing system to obtain unauthorized claims. Tracking such activities helps the company avoid financial losses. Neural network based techniques have been successfully applied to detect such outliers. Insider trading is a criminal activity in the stock market in which insiders profit from information before it becomes public.

Altogether, credit cards are an attractive target for fraud, since a large amount of money can be obtained in a very short time without taking many risks; the crime is normally detected only several weeks after the date [5]. Credit card fraud is a critical and sensitive issue. It may involve making purchases without the cardholder's permission or counterfeiting a credit card [7]. The various kinds of credit card fraud include online credit card fraud, shave and paste, stolen card numbers, advance payments, etc. If the present trend continues, this fraud will exceed the number of paid checks, and this could happen even before the end of the decade. As the industry continues to expand and offer credit to more and more consumers, fraud will also grow.
Given a set of credit card transactions, a fraud detection mechanism is the process that identifies which of those transactions are fraudulent. A desirable fraud detection system needs good metrics by which it can be evaluated. The system should take into account both the cost of the fraudulent behaviour detected and the cost associated with stopping it. In fact, there is a decision layer on top of the fraud detection system. This layer decides what actions to take when the fraud detection system detects fraudulent behaviour [5, 8].

III. ARTIFICIAL IMMUNE SYSTEM
The problems found in a computer security system are quite similar to those encountered in a Biological Immune System (BIS), since both have to maintain stability in a changing environment. Owing to the numerous desirable characteristics of the natural immune system, such as diversity, self-tolerance, immune memory, distributed computation, self-organization, self-learning, self-adaptation and robustness, the BIS has attracted many researchers' attention [9][10]. At the same time, Artificial Immune Systems (AIS) have become an increasingly popular computational intelligence paradigm [11][12].

Because AIS are still relatively young and the natural immune system (NIS) is one of the most complex systems under active study by biologists, there are distinct viewpoints about the main goal of the NIS. These ideas and understandings are extremely important for AIS researchers and designers. The two main viewpoints are the self/non-self theory and the danger theory. Classical immunology stipulates that an immune response is triggered when, and only when, the body encounters something non-self, that is, a foreign particle [13]. This viewpoint is generally accepted by immunologists, and AIS researchers have built models based on it.
This viewpoint leaves many questions unanswered, and a new theory called Danger Theory has been developed. The central idea of Danger Theory is that the immune system does not react to non-self but to danger. Like the self/non-self theories, it fundamentally supports the need for discrimination. However, it differs in the answer to what should be reacted to: the immune system does not react to foreignness but to what it perceives as danger [14].

The concepts of AIS, including Danger Theory, can be used to find abnormal users who could be involved in fraudulent activity. Danger Theory has several advantages. It provides a convenient and effective way to separate abnormal data from the normal dataset, and it improves the quality of the output by eliminating bad cases. At the same time, it overburdens the system, because extra time is needed to filter out abnormality. The researcher therefore sees a need to apply Artificial Immune System and Danger Theory concepts before implementing the Support Vector Machine, in order to filter abnormal data out of the dataset for the purpose of detecting credit card fraud.

IV. SUPPORT VECTOR MACHINE
The Support Vector Machine (SVM) was developed by Vapnik [15]. The ease and quality of SVM training go far beyond the capacities of more traditional methods. SVM can handle complex problems in many areas, for example text and image classification, handwriting recognition and bioinformatics. An important property of SVM is that it performs well on data sets that have many attributes, even when there are very few cases on which to train the model.

The SVM is a training algorithm: it trains a classifier to predict the class of a new sample. SVM is based on the idea of decision planes that define decision boundaries; the points that form the decision boundary between the classes are called support vectors and are treated as parameters.
SVM is based on a machine learning algorithm invented by Vapnik in the 1960s. It is also based on the structural risk minimization principle, which helps to prevent overfitting. There are two key elements in the implementation of the SVM technique: mathematical programming and kernel functions. SVM finds an optimal separating hyperplane between data points of different classes in a high-dimensional space. Assume two classes for classification, P and N, with labels $y_n = 1, -1$; this can be extended to K-class classification by using K two-class classifiers. The Support Vector Classifier (SVC) searches for a hyperplane for classification. However, the SVC is linear, so kernel functions are introduced in order to obtain non-linear decision surfaces.

A. Linear SVC
Let $w$ be the weight vector, $b$ the bias and $x_n$ the nearest data point. Then

\[ w^T x_n + b \geq 1 \quad \text{for } x_n \in P \]

and

\[ w^T x_n + b \leq -1 \quad \text{for } x_n \in N. \]

Figure 1. Linear SVC (axes A1 and A2; labelled elements: support vectors, optimal hyperplane, optimal margin)
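As a minimal sketch of the two constraints for classes P and N, the following Python snippet checks them on a hypothetical four-point data set. The data, the hand-picked hyperplane `(w, b)` and the helper names `functional_margins` and `support_vectors` are assumptions made for illustration; they do not come from the paper.

```python
from math import sqrt

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def functional_margins(w, b, points, labels):
    """Return y_n * (w . x_n + b) for every training point."""
    return [y * (dot(w, x) + b) for x, y in zip(points, labels)]

def support_vectors(w, b, points, labels, tol=1e-9):
    """Points whose functional margin equals 1 lie on the margin boundary."""
    margins = functional_margins(w, b, points, labels)
    return [x for x, m in zip(points, margins) if abs(m - 1.0) <= tol]

# Toy data (hypothetical): class P has label +1, class N has label -1.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [+1, +1, -1, -1]

# Candidate hyperplane w . x + b = 0, chosen by hand for this toy set.
w, b = (0.5, 0.5), -1.0

margins = functional_margins(w, b, points, labels)
feasible = all(m >= 1.0 - 1e-9 for m in margins)   # both constraints hold
on_margin = support_vectors(w, b, points, labels)   # the support vectors
margin_width = 2.0 / sqrt(dot(w, w))                # distance between the two margin lines

print(feasible, on_margin, round(margin_width, 3))
```

Writing both constraints as the single condition $y_n (w^T x_n + b) \geq 1$ lets one loop check both classes. The points for which this value is exactly 1 are the support vectors, and $2 / \lVert w \rVert$ is the width of the margin between the two classes.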
For optimization, the problem is to

\[ \text{minimize } \tfrac{1}{2} w^T w \quad \text{subject to } y_n (w^T x_n + b) \geq 1 \text{ for } n = 1, \dots, N. \]

B. Non-linear SVC
A non-linear SVC can be used to learn a non-linear decision function; the SVM is extended to this case through kernel functions.

C. Non-separable case
When noise is present in the training data, some data points may be misclassified.

(i) Advantages
a. It is among the most accurate of machine learning algorithms; it finds the best classification function for the training data.
b. SVM is less prone to overfitting than other methods.

(ii) Disadvantages
a. It is computationally expensive.
b. SVMs require a large amount of training time and storage, and their results are poorly interpretable.

(iii) Challenges in SVM
a. The performance of an SVM is determined by the kernel, so finding the best kernel for a given application is an open challenge.
b. The SVM model is very expensive in terms of space and evaluation time.

V. PROPOSED WORK
In view of the advantages of applying Danger Theory before the Support Vector Machine, the researcher proposes to develop an immune Support Vector Machine (SVM) approach for a credit card fraud detection system, as shown in Figure 2.

Figure 2. Architecture of Proposed Work (blocks: dataset from log file of web users; hybrid approach, Danger Theory and SVM classifier; various pattern analyses)
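The two-stage idea in Figure 2 can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: the danger signal (largest absolute feature value), the threshold, the toy records and all function names are assumptions made for illustration, and the SVM stage is a Pegasos-style stochastic subgradient method for a linear SVM rather than any particular library.

```python
import random

def danger_signal(features):
    """Hypothetical danger signal: the largest absolute feature value."""
    return max(abs(v) for v in features)

def danger_filter(records, threshold=10.0):
    """Stage 1: set aside records whose danger signal exceeds the threshold."""
    kept = [r for r in records if danger_signal(r[0]) <= threshold]
    flagged = [r for r in records if danger_signal(r[0]) > threshold]
    return kept, flagged

def train_linear_svm(records, lam=0.01, iterations=2000, seed=0):
    """Stage 2: linear SVM via stochastic subgradient descent on the
    regularised hinge loss (Pegasos-style, step size 1 / (lam * t))."""
    rng = random.Random(seed)
    w = [0.0] * (len(records[0][0]) + 1)   # last weight acts as the bias
    for t in range(1, iterations + 1):
        x, y = rng.choice(records)
        xa = list(x) + [1.0]               # constant feature for the bias
        eta = 1.0 / (lam * t)
        violated = y * sum(wi * xi for wi, xi in zip(w, xa)) < 1.0
        w = [(1.0 - eta * lam) * wi for wi in w]        # regularisation shrink
        if violated:                                    # hinge-loss subgradient
            w = [wi + eta * y * xi for wi, xi in zip(w, xa)]
    return w

def predict(w, x):
    """Return +1 (fraud) or -1 (legitimate) for feature vector x."""
    score = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1

# Toy records (hypothetical): (standardised features, label), +1 = fraud.
records = [
    ((3.0, 2.0), 1), ((2.0, 3.0), 1), ((3.0, 3.0), 1), ((2.0, 2.0), 1),
    ((-3.0, -2.0), -1), ((-2.0, -3.0), -1), ((-3.0, -3.0), -1), ((-2.0, -2.0), -1),
    ((40.0, -35.0), -1),                   # abnormal record, removed in stage 1
]

kept, flagged = danger_filter(records)
w = train_linear_svm(kept)
print(len(flagged), [predict(w, x) for x, _ in kept])
```

In this sketch the abnormal record is set aside before training, so it cannot distort the separating hyperplane that the SVM stage learns. For simplicity the bias is folded into the weight vector and therefore regularised along with it. A real system would need a danger signal grounded in transaction context and a kernel chosen for the data.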
The proposed ensemble method has two stages, as shown in Figure 2:
1. Danger Theory
2. Support Vector Machine (SVM)

The researcher assumes that a hybrid of the two approaches would give better results than either approach alone. Before applying the SVM technique for fraud detection, the researcher would use Danger Theory to filter out bad users' data prior to the start of the actual work (as discussed in Section III). This is the main rationale behind the ensemble approach to credit card fraud detection.

VI. CONCLUSION AND FUTURE WORK
Due to the rapid advance of electronic commerce on the Internet, the use of credit cards for purchasing has become both necessary and convenient. But with the growing number of credit card transactions, more opportunities become available for thieves to steal credit card numbers and subsequently commit fraud. Transaction fraud detection is an important but hard real-world problem, and a significant amount of engineering is required to produce effective solutions. This challenge motivates the researcher to improve the performance of the fraud detection system. There is scope for improving the fraud detection system by increasing its performance in terms of fraud classification. The researcher believes that an SVM model, in conjunction with the Danger Theory concept of the Artificial Immune System (AIS), can be efficiently optimized to reduce credit card fraud, so that the user experiences the best performance from a highly scalable and reliable solution.

VII. REFERENCES
[1] S. Panigrahi, A. Kundu, et al., "Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning", Information Fusion, Vol. 10, No. 4, pp. 354-363, 2009.
[2] C. A. W. Paasch, "Credit card fraud detection using artificial neural networks tuned by genetic algorithms", Thesis (Ph.D.), The Hong Kong University of Science and Technology, February 2008.
[3] R. Huang, H. Tawfik and A. K. Nagar, "A Novel Hybrid Artificial Immune Inspired Approach for Online Break-in Fraud Detection", Faculty of Business and Computer Sciences, Liverpool Hope University, Liverpool, United Kingdom, International Conference on Computational Science, ICCS 2010.
[4] M. Krivko, "A Hybrid Model For Plastic Card Fraud Detection Systems", Expert Systems with Applications, Vol. 37, No. 8, pp. 6070-6076, 2010.
[5] S. Maes, K. Tuyls, B. Vanschoenwinkel and B. Manderick, "Credit Card Fraud Detection using Bayesian and Neural Networks", Proc. of the 1st International NAISO Congress on Neuro Fuzzy Technologies, 2002.
[6] R. J. Bolton and D. J. H, "Unsupervised profiling methods for fraud detection", in Proc. Credit Scoring and Credit Control VII, 2001, pp. 5-7.
[7] L. Balan and M. Popescu, "Credit card fraud", The USV Annals of Economics and Public Administration, Vol. 11, No. 1, pp. 81-85, 2011.
[8] D. Tetro, E. Lipton and A. Sackheim, "System and method for enhanced fraud detection in automated electronic credit card processing", U.S. Patent No. 6,095,413, 1 Aug. 2000.
[9] F. Sun and F. Xu, "Antibody concentration based method for network security situation awareness", Proc. of the 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009), IEEE Press, pp. 1-4, June 2009.
[10] F. Sun and S. Cheng, "A gene technology inspired paradigm for user authentication", Proc. of the 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009), IEEE Press, pp. 1-3, June 2009.
[11] W. Zhang, C. Wu and X. Liu, "Construction and enumeration of Boolean functions with maximum algebraic immunity", Science in China, Series F: Information Science, Vol. 52, pp. 32-40, January 2009.
[12] F. Sun and Z. Wu, "A new risk assessment model for e-government network security based on antibody concentration", IEEE, Proc. of the 2009 International Conference on E-Learning, E-Business, Enterprise Information Systems, and E-Government, pp. 119-121, December 2009.
[13] A. S. Perelson and G. Weisbuch, "Immunology for Physicists", Rev. of Modern Physics, 69(4), pp. 1219-1267, 1997.
[14] P. Matzinger, "The Danger Model: A Renewed Sense of Self", Science, 296, pp. 301-305, 2002.
[15] V. Vapnik, The Nature of Statistical Learning Theory, New York: Springer-Verlag, 1995.