Identifying Online Credit Card Fraud using Artificial Immune Systems
Provided by the author(s) and University College Dublin Library in accordance with publisher policies. Please cite the published version when available.
Title: Identifying online credit card fraud using artificial immune systems
Author(s): Brabazon, Anthony; Cahill, Jane; Keenan, Peter; Walsh, Daniel
Publication information: 2010 IEEE Congress on Evolutionary Computation (CEC) [proceedings]
Publisher: IEEE Press
Rights: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Some rights reserved.
A. Brabazon, J. Cahill, P. Keenan 1, D. Walsh, UCD Business School, University College Dublin, Dublin 4, Ireland

Abstract

Significant payment flows now take place on-line, giving rise to a requirement for efficient and effective systems for the detection of credit card fraud. A particular aspect of this problem is that it is highly dynamic, as fraudsters continually adapt their strategies in response to the increasing sophistication of detection systems. Hence, training a system by exposing it to examples of previous fraudulent transactions can produce fraud detection systems which are susceptible to new patterns of fraudulent transactions. The nature of the problem suggests that Artificial Immune Systems (AIS) may have particular utility for inclusion in fraud detection systems, as an AIS can be constructed to flag non-standard transactions without having seen examples of all possible such transactions during training. In this paper, we investigate the effectiveness of AIS for credit card fraud detection using a large dataset obtained from an on-line retailer. Three AIS algorithms were implemented and their performance was benchmarked against a logistic regression model. The results suggest that AIS algorithms have potential for inclusion in fraud detection systems but that further work is required to realize their full potential in this domain.

I. INTRODUCTION

WebBiz (anonymized) conducts a growing online business in addition to its main street premises. WebBiz accepts payments for its online business through several channels, such as credit cards, debit cards and bank transfers. This paper is concerned only with detecting fraud in credit card transactions made online.
Although WebBiz employs good industry practice, credit card fraud remains a problem and occurs at a rate of about one fraudulent transaction in every million transactions. This paper describes the use of an Artificial Immune System approach to anomaly detection in an attempt to reduce credit card fraud.

Credit card fraud is of significant economic importance: fraudulent card transactions in the U.S. in 2005 were estimated to cost $790 million [1]. A UK survey of online businesses indicated that merchants expect to lose an average of 1.8% of their overall online revenue to payment fraud [2]. Credit card fraud can be broken down into two forms [3]. Inner card fraud requires collusion between merchants and cardholders and is not relevant here. External card fraud occurs when stolen, fake or counterfeit credit cards are used, and this form of fraud is of interest here. With the use of more secure Chip and PIN verification in most European countries, credit card fraudsters are increasingly targeting card-not-present transactions such as online shopping [4]. Data from the UK Payments industry shows a 118% increase in the value of phone, internet and mail order fraud (card-not-present fraud) between 2004 and From 2001 to 2008 card-not-present fraud losses in the UK rose by 243% and the total value of online shopping transactions increased by 524% [2].

Against this background of growing online transactions, which are less secure than over-the-counter transactions, there has been an increase in demand for credit card fraud detection. Fraud detection generally is an important area of application for artificial intelligence techniques [5]. Anti-fraud approaches for online credit card transactions have included the use of artificial intelligence, with new techniques being introduced in addition to older approaches, such as rule-based systems [6].

1. Peter Keenan, corresponding author, [email protected]
There are commercial applications in this field, for instance the Falcon Fraud Manager software [7]. In addition, technical approaches such as enhanced encryption and passwords have been introduced in the credit card industry.

Artificial Immune Systems (AIS) are a recent branch of artificial intelligence based on the biological metaphor of the human immune system [8]. The immune system can distinguish between self and non-self or, more appropriately, between harmful non-self and everything else. This ability to recognize differences in patterns and to identify anomalies sparked interest in adapting its processes for use in other domains, including the identification of anomalous credit card transactions.

The natural immune system is a highly complex system, composed of an intricate network of specialized tissues, organs, cells and chemical molecules. It can recognize, destroy, and remember an almost unlimited number of pathogens (foreign objects that enter the body, including viruses, bacteria, multi-cellular parasites, and fungi). To assist in protecting the organism, the immune system has the capability to distinguish between self and non-self. Notably, the system does not require exhaustive training with negative (non-self) examples to make these distinctions, but can identify as non-self items which it has never before encountered.

The negative selection algorithm was proposed in 1994 for anomaly detection [9]. The basis of the negative selection algorithm is the ability of the immune system to discriminate between self and non-self, or more broadly to
distinguish between two system states, normal or abnormal. Forrest et al. [9] developed a binary-valued negative selection algorithm analogous to the negative selection or self-tolerogenesis process during T cell maturation in the thymus. Later this was extended to a real-valued representation. The process can be split into three stages. First, the self cells need to be defined. Next, a selection of binary-valued candidate cells is generated at random, and those that recognize the self samples contained in the training sample are discarded. The remaining detectors are used in the third stage, monitoring for the occurrence of anomalies.

In implementing the algorithm, training data is usually normalized to (0,1) and a predetermined number of detectors are created at random positions in the data space. During the training process (akin to tolerogenesis) any detector that falls within a threshold distance $r_{s}$ of any member of the set of self samples is discarded and replaced with another randomly generated detector. The replacement detector is also checked against the set of self samples. The process of detector generation iterates until the required number of valid detectors is generated. All of the resulting detectors are potentially useful detectors of non-self.

Once a population of detectors has been created, it can be used to classify new data observations. To do this, the new data vector is presented to the population of detectors and, if it falls within a hypersphere of radius $r_{s}$ of any detector, the data vector is deemed to be non-self. Otherwise, the new data vector is deemed to be self. A crucial point in the negative selection process is that the immune system does not require specific examples of non-self in creating its detectors. Potentially, the detectors can uncover any instance of non-self, even those never before encountered.
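The detector generation and classification steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the use of plain Euclidean distance, the `max_tries` guard and the convention that a vector matched by any surviving detector is flagged as non-self are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length real-valued vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generate_detectors(self_samples, n_detectors, r_s, dim, rng, max_tries=100_000):
    """Create random detectors in [0, 1]^dim, discarding any that lie
    within distance r_s of a self sample (the tolerogenesis step)."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors:
        tries += 1
        if tries > max_tries:  # hypothetical safety guard, not part of the paper
            raise RuntimeError("too few valid detectors; r_s may be too large")
        candidate = [rng.random() for _ in range(dim)]
        if all(euclidean(candidate, s) > r_s for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_non_self(x, detectors, r_s):
    """Flag x as non-self when it lies within r_s of at least one detector."""
    return any(euclidean(x, d) <= r_s for d in detectors)
```

Because every surviving detector is more than $r_{s}$ away from every training self sample, none of those samples can be flagged when the same radius is used for testing; using separate training and testing radii relaxes that guarantee.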
This is an important attribute for fraud detection, as fraudsters continue to devise novel approaches to fraud which WebBiz may not have experienced before, and the online nature of its business immediately exposes it to fraud innovation occurring anywhere in the world.

Many alterations of this model have been proposed; the major difference between the different models is the choice of matching rule. This rule must determine the similarity between two patterns in order to classify self/non-self samples [10]. Other issues that weigh heavily on the success of a system are the number of detectors required and the threshold set for the level of similarity. If this threshold value is too small, it may not be possible to generate a suitable number of detectors from the available self. A balance needs to be found: setting the threshold high means the generated detectors become sensitive to any anomaly in the data patterns, so more detectors are necessary to achieve a desired reliability [10].

AIS approaches have been applied to several different areas. Different applications in information security have been examined using approaches based on the workings of the immune system; these include host intrusion as well as network intrusion [11]. With host intrusion, sequences of system calls were used as the detectors. Network intrusion has received a lot of research attention; the detectors here relate to the IP address, whether the IP source address or the IP destination address.

In the retail sector, a substantial research effort was undertaken using a computational intelligence fraud detection system [12]. This system consisted of a combination of negative selection and clonal selection. The results of this study were mixed. The AIS succeeded in highlighting different anomalies; however, due to the large number of attributes or predictors for each transaction, the method of unsupervised learning proved troublesome.
In 2008 Gadi et al. [13] used data from a Brazilian bank to study the effectiveness of using an Artificial Immune System to detect fraud. In their study, they used three different strategies and compared the results of using an Artificial Immune System, Artificial Neural Network, Decision Tree, Naive Bayes and Bayesian Nets with each of the strategies. The study focused more on reducing the cost of using each of the methods and producing the best set of parameters than on the classification success of each method. There is no detail on which algorithm they used or which parameters were most successful. In Tuo et al. an Artificial Immune System for credit card fraud detection is suggested but not actually implemented [14]. That paper suggests integrating a Case-Based Reasoning approach with an Artificial Immune System.

Brabazon et al. [15] study the use of Artificial Immune Systems in corporate failure prediction. In that study, both a canonical negative selection algorithm and a variable size detector algorithm were used and their performance compared. The Artificial Immune System outperformed Linear Discriminant Analysis and, although it performed less well than the sophisticated GA/ANN approach, it has the advantage of not needing to be exposed to bad exemplars.

II. PROBLEM DATA

This analysis uses data provided by WebBiz, which recorded 4 million transactions from unique customers, of which 5417 were classified as fraudulent. Data was provided about the customer accounts, e.g. date of registration, and about individual transactions, e.g. date and amount of transaction (Table 1).

Data cleansing is a huge part of any project that involves raw data. It requires a large amount of time and attention to ensure the quality of the overall data. The problems encountered included having too much or too little data, missing data, and noisy data or outliers. Initially the data was in two database tables and appropriate operations were needed to join these.
Within the two files there were a number of transactions that were present in one but absent in the other i.e. missing data. These transactions could not be used since too many fields were absent. For this reason, they were eliminated. Since declined transactions have already failed a fraud prevention system, these too were removed from the data file.
TABLE I PROBLEM DATA
Encrypted Customer ID: Customer identifier.
Journal ID: Unique ID for the journal entry.
Date: The date and time of the transaction.
Transaction type: The type of transaction carried out.
Reference key: Gives information about the type of transaction carried out. A separate table is provided which lists the different keys.
User ID: Again, the most important aspect that this ID played was enabling the tracking of data. Very important when trying to normalize transactions with information split between the two files.
Amount: The transaction amount.
Description: Some additional information on the transaction.
Balance: The customer's balance after the transaction.
Payment ID: Again, the most important aspect that this ID played was enabling the tracking of data. Very important when trying to normalize transactions with information split between the two files.
Payment Sort: Identifies whether the transaction is a deposit or a withdrawal.
Status: Y or N depending on whether the transaction was a success or declined.
Chargeback Flag: A chargeback flag is needed to identify the known fraudulent cases. These transactions will be used for training purposes.
Scheme: The type of credit card used, e.g. MasterCard, Visa, etc.
Currency: The currency used.
Country of Bank: The country of the bank which was used to carry out the transaction is listed here.
Secure 3D Flag: States whether the transaction was a 3D secure transaction or not.
IP Address: The IP address from where the transaction took place.
Registration date: The date the customer registered to open their account.
Registration time: The time the customer registered to open their account.
Date of Birth: The customer's date of birth.

The most time consuming aspect of the data cleansing was dealing with noisy data. Different pieces of information, such as country, had been entered into the computer system manually.
Hence, there were problems when it came to identifying the number of transactions in different countries, particularly the U.K., for which there are several different representations, e.g. G.B., England, U.K. etc. The IP address data also required a great degree of further preparation, as the four-octet IP address had to be matched with its location using an IP to Country Database [16]. For ease of analysis, the date of birth was replaced by the age of the customer at the time they registered.

Before any normalization could be carried out, the information present needed to be standardized. Since WebBiz allows transactions in multiple currencies, these needed to be brought to a common currency. There were three separate classes of variable in the final data: nominal, binary, and ordinal or continuous data. Since all combinations of data were going to be examined, the normalized range needed to be standard and, since binary variables were present, the range of 0 to 1 was deemed most appropriate. The continuous variables were normalized using an approach by Yu [17]:

$$D'_{id} = I_{min} + (I_{max} - I_{min})\,\frac{D_{id} - D_{min}}{D_{max} - D_{min}}$$

where
$I_{min}$ = the minimal value of the input range, 0.
$I_{max}$ = the maximal value of the input range, 1.
$D_{min}$ = the minimal value of a given input range, varies for each variable, listed above.
$D_{max}$ = the maximal value of a given input range, listed above.
$D_{id}$ = the figure due to be normalized.

Since the input range is 0-1, this equation simplifies to

$$D'_{id} = \frac{D_{id} - D_{min}}{D_{max} - D_{min}}$$

III. REGRESSION

We initially sought to apply conventional statistical techniques to the data, to provide a benchmark against which AIS techniques can be measured. Suitable benchmarks would include linear discriminant analysis (LDA) and logistic regression (LR). Our dataset contains nominal, ordinal and binary data. For various reasons even the continuous variables do not have a close to normal distribution; for instance, the age distribution is not normal as people under 18 generally do not have credit cards.
Other data such as the date of first registration had a roughly uniform distribution (Fig 1). Consequently, linear regression is not an appropriate technique to use to investigate the data and we chose to use logistic regression. The logistic model calculates the probability of a certain outcome based on the values of the predictor variables.
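As a small illustration of how a fitted logistic model turns predictor values into a probability, the sketch below evaluates the standard logistic form $\pi(x) = e^{g(x)} / (1 + e^{g(x)})$ for a linear predictor $g(x)$. The function names and any coefficient values passed in are hypothetical; in practice the coefficients would be estimated from the data by maximum likelihood.

```python
import math


def linear_predictor(x, beta0, betas):
    """g(x) = beta0 + beta1*x1 + ... + betap*xp."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))


def logistic_probability(x, beta0, betas):
    """Probability that the outcome (here, fraud) is present, given x.

    Written in a numerically stable form so that large |g(x)| does not
    overflow math.exp.
    """
    g = linear_predictor(x, beta0, betas)
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)
```

A transaction would then be classified as fraudulent when this probability exceeds a chosen cut-off; the cut-off itself is a design choice that trades the two kinds of misclassification against each other.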
Figure 1: Histogram of Date of Transaction

Let the conditional probability that the outcome is present be denoted by $P(Y = 1 \mid x) = \pi(x)$, where x is the collection of predictor variables. The logit of the multiple logistic regression model is given by

$$g(x) = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \dots + \beta_{p} x_{p}$$

where β is estimated using maximum likelihood. Maximum likelihood returns the values of the unknown parameters which maximize the probability of obtaining the observed data set. The logistic regression model [18] can now be expressed as

$$\pi(x) = \frac{e^{g(x)}}{1 + e^{g(x)}}$$

For the analysis in the study we used a randomly selected subset of 50,000 transactions. Of these, 209 were fraudulent and the remainder were non-fraudulent. Consequently the test data set has 0.418% transactions which were fraudulent and so has a much higher proportion of fraudulent transactions than the population. Using the logistic regression model 0.012% of transactions which were not fraudulent were misclassified. In addition, % of fraudulent transactions were misclassified. Overall using the logistic regression model % of transactions were accurately classified, compared to an accuracy of % for the naive method, suggesting that there are hidden relationships in the data and that an artificial immune system approach will yield useful results.

IV. ARTIFICIAL IMMUNE SYSTEM APPROACHES

The regression test results were then compared with three algorithms from the Artificial Immune Systems literature which we coded and tested on our data set. The three chosen algorithms were the Unmodified Negative Selection Algorithm, the Modified Negative Selection Algorithm and the Clonal Selection Algorithm. Before we consider these algorithms, however, we need to decide on the appropriate distance measure. The most commonly used distance measure in Artificial Immune Systems is the Euclidean distance [19]:

$$d(x, y) = \sqrt{\sum_{a=1}^{m} (x_{a} - y_{a})^{2}}$$

where m is the number of attributes. However, this measure is not suitable for nominal variables. The fact that our data set contains nominal, ordinal and binary variables means that we need to investigate non-Euclidean distance measures to find the appropriate way to measure the distance between transactions in our dataset. The concept that was most suited to our data set was the Value Difference Metric [20]. This metric is used to calculate the distance between nominal variables. It works using the probability that the value we are working with would be observed in our dataset. We will, however, use the unweighted version as presented in Wilson & Martinez [21]:

$$vdm_{a}(x, y) = \sum_{c=1}^{C} \left| \frac{N_{a,x,c}}{N_{a,x}} - \frac{N_{a,y,c}}{N_{a,y}} \right|^{q} = \sum_{c=1}^{C} \left| P_{a,x,c} - P_{a,y,c} \right|^{q}$$

Where:
$N_{a,x}$ = the number of instances in the training set T that have value x for attribute a
$N_{a,x,c}$ = the number of instances in T that have value x for attribute a and output class c
C = the number of output classes
q = a constant, usually 1 or 2
$P_{a,x,c}$ = the conditional probability that the output class is c given that attribute a has the value x

In order to apply a distance metric to our entire dataset, we use the Heterogeneous Value Difference Metric (HVDM) drawn from Hamaker & Boggess [19].
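To make the heterogeneous distance concrete, here is a hedged Python sketch that combines the value difference metric for nominal attributes with a squared difference for continuous attributes. It follows the general shape of the heterogeneous value difference metric of Wilson & Martinez, but the details are assumptions for illustration: continuous values are taken to be already scaled to the range 0 to 1 (Wilson & Martinez instead divide by four standard deviations), q is fixed at 2, and all function and variable names are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter, defaultdict


def fit_vdm_tables(rows, labels, nominal_idx):
    """Count N[a][x] and N[a][x][c] over the training set for nominal attributes."""
    n_ax = {a: Counter() for a in nominal_idx}
    n_axc = {a: defaultdict(Counter) for a in nominal_idx}
    for row, c in zip(rows, labels):
        for a in nominal_idx:
            n_ax[a][row[a]] += 1
            n_axc[a][row[a]][c] += 1
    return n_ax, n_axc


def cond_prob(a, value, c, n_ax, n_axc):
    """P(class = c | attribute a = value); 0.0 for values never seen in training."""
    n = n_ax[a][value]
    return n_axc[a][value][c] / n if n else 0.0


def vdm(a, x, y, n_ax, n_axc, classes, q=2):
    """Unweighted value difference between two values of nominal attribute a."""
    return sum(
        abs(cond_prob(a, x, c, n_ax, n_axc) - cond_prob(a, y, c, n_ax, n_axc)) ** q
        for c in classes
    )


def hvdm(x, y, nominal_idx, n_ax, n_axc, classes):
    """Heterogeneous distance: VDM terms for nominal fields, squared
    differences for continuous fields (assumed pre-scaled to [0, 1])."""
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if a in nominal_idx:
            total += vdm(a, xa, ya, n_ax, n_axc, classes, q=2)
        else:
            total += (xa - ya) ** 2
    return math.sqrt(total)
```

In this sketch two card schemes that are associated with completely different class proportions end up far apart, while two schemes with the same fraud rate are at distance zero on that attribute, which is the behaviour that makes the metric usable for nominal fields.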
Because our data includes both nominal and continuous data fields, we have to carefully consider how to move a detector away from self. We considered using a very low mutation rate, generating a random number for each nominal data field and, if the random number is below the mutation rate, randomly changing the field to one of the other possibilities for that field. We felt, however, that changing a nominal variable was not just moving a detector but changing it outright. For that reason we decided to adapt each of the continuous variables and leave the nominal variables as they were. The decision was made to make the adaptation rate a very small random number and to randomly subtract it from or add it to the variables. Again, when it came to moving the detector away from all other detectors when it was accepted, we decided that this was unnecessary in our version of the code. Because of the large number of possible states for some of the data fields, we felt that there was sufficient variability in our detector set to ensure enough coverage.

V. EXPERIMENTATION

In order to begin testing our unmodified negative selection algorithm, it was necessary to first investigate the distances that would be returned by the HVDM distance metric. After multiple iterations we found that the distances were all in the range [0, 0.2] with clustering around 0.1. This gave us a starting point for our radius. We decided to proceed as follows with our testing:
1. Vary the number of training samples
2. Vary the number of detectors
3. Vary the training radius
4. Vary the testing radius

We started with 200 detectors and set the testing radius equal to the training radius. The resulting detectors were tested on a sample of 1,000 transactions. Having tested various training sample sizes, the decision was made to proceed with the testing using 1,000 training samples. We then tested using differing numbers of detectors. As the number of detectors was increased, the number of self transactions being classified as non-self increased, but the detectors were still failing to identify almost any non-self transactions. At this point we decided to proceed with 200 detectors, as that resulted in a low number of self being classified as non-self, as well as having a low run time. Varying the radius identified a best-performing value; as the radius is increased past this point, an unacceptably large number of self transactions are classified as non-self. At this point, the training radius is fixed at 0.07 and the testing radius is varied; 0.6 was selected as the best value.

VI. DISCUSSION OF RESULTS

The results of our analysis are shown in Table 2.

TABLE 2 AIS RESULTS
Columns: Algorithm; Cut Number; % self categorized as non-self; % non-self categorized as self; Accuracy; Time taken (seconds)
Algorithms: Unmodified Negative Selection; Modified Negative Selection; Clonal Selection
Recovered values (% non-self categorized as self, Accuracy), in source order; the other columns and the row labels did not survive transcription:
96.55%, 98.96%
100%, 98.48%
100%, 96.86%
96.55%, 90.14%
89.66%, 93.38%
96.55%, 95.4%
75.86%, 79.94%
72.41%, 65.98%
62%, 60.72%

The Unmodified Negative Selection Algorithm looks very promising if only the Accuracy result is examined. However, in order to gain a true understanding of how well this algorithm performed, it is necessary to examine both figures that present the misclassification results. It turns out that while it classifies almost all of the self transactions correctly, it misclassifies almost all of the non-self transactions. Because fraud is such a rare event, if one were to employ the scheme of classifying all transactions as non-fraudulent, one would achieve high levels of accuracy. This is obviously not a successful strategy for the identification of fraud, however. For this reason, we have to conclude that the Unmodified Negative Selection Algorithm is not suitable for the detection of credit card fraud.

The Modified Negative Selection Algorithm offers a good tradeoff between classifying self correctly and classifying non-self correctly. As opposed to the Unmodified Negative Selection Algorithm, which fails to classify any fraudulent transactions, the Modified Negative Selection Algorithm successfully identifies several of the fraudulent transactions.
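The gap between overall accuracy and the two misclassification rates can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the choice to treat fraud (non-self) as the positive class are assumptions, and the counts are those of the 50,000-transaction subset with 209 fraudulent transactions described for the regression benchmark, scored with a naive rule that labels every transaction as self.

```python
def classification_rates(tp, fn, tn, fp):
    """Rates for a fraud detector, with fraud (non-self) as the positive class.

    tp: fraud flagged as non-self      fn: fraud passed as self
    tn: normal passed as self          fp: normal flagged as non-self
    Assumes both classes are present, so neither denominator is zero.
    """
    total = tp + fn + tn + fp
    return {
        "self_as_non_self": fp / (fp + tn),
        "non_self_as_self": fn / (fn + tp),
        "accuracy": (tp + tn) / total,
    }


# Naive rule: label everything as self (non-fraudulent).
naive = classification_rates(tp=0, fn=209, tn=49_791, fp=0)
print(naive)
```

Under these counts the naive rule is correct on roughly 99.58% of transactions while missing every fraudulent one, which is why accuracy on its own says little about a fraud detector.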
The result for the Modified Negative Selection Algorithm is very promising, as it means that the algorithm has the ability to identify fraudulent transactions.

The results of the Clonal Selection Algorithm are very unpromising in relation to the accuracy metric used. It misidentifies a very high percentage of the self transactions, and the nature of fraud is that there will always be many more normal transactions than fraudulent ones. This algorithm also takes the longest to run. However, the Clonal Selection Algorithm classifies a large percentage of the fraudulent transactions correctly, which is an extremely valuable trait in a fraud detection technique. An economic measure would have to take into account that the loss from a fraudulent transaction greatly exceeds the typical profit margin available on a successful non-fraudulent transaction. The large number of normal transactions likely to be misclassified makes this algorithm unsuited to fully automatic operation. However, potentially fraudulent transactions could be subjected to further automatic or human processing to reduce the number of false negatives. In general, online customers are not willing to tolerate delay in credit card use online [4], but there can be scope for review of customer accounts in the period before orders are fulfilled.

VII. CONCLUSIONS

The results suggest that AIS can be applied in this domain, but that care needs to be taken in the implementation of the algorithms in order to get workable results. Although the results obtained from the canonical Negative Selection Algorithm have high overall accuracy, the system misclassified too many fraudulent transactions to be operationalized. Several opportunities are indicated for future work. Design of an appropriate cost function which trades off the relative cost of Type I vs Type II errors is clearly important if a workable system is to be constructed. Another avenue is to investigate the utility of a hybrid, multistage detection system.
Routine fraud flags / rules could be applied to weed out the most obvious fraudulent transactions, leaving AIS to detect more subtle cases of fraud. Future work might investigate the automatic generation of parameters using techniques such as genetic programming and genetic evolution [22].

REFERENCES
[1] S. Panigrahi, et al., "Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning," Information Fusion, vol. 10.
[2] APACS. (2009, 05 Feb). APACS 2008 Fraud Loss Figures. Available: /page/685/
[3] A. Shen, et al., "Application of Classification Models on Credit Card Fraud Detection," 2007.
[4] S. Herbst-Murphy, "Maintaining a safe environment for payment cards: Examining evolving threats posed by fraud," Payment Cards Center Discussion Paper, 2009.
[5] C. Phua, et al., "A comprehensive survey of data mining-based fraud detection research," Artificial Intelligence Review.
[6] K. J. Leonard, "The development of a rule based expert system model for fraud alert in consumer credit," European Journal of Operational Research, vol. 80.
[7] Falcon. (2010). (< Fraud-Manager.aspx>).
[8] L. N. De Castro and J. Timmis, Artificial immune systems: a new computational intelligence approach. London: Springer.
[9] S. Forrest, et al., "Self-nonself discrimination in a computer," in 1994 IEEE Symposium on Research in Security and Privacy, Los Alamitos, CA, USA, 1994.
[10] D. Dasgupta, et al., "Artificial immune system (AIS) research in the last five years," 2003.
[11] R. Overill, "Computational immunology and anomaly detection," Information Security Technical Report, vol. 12.
[12] J. Kim, et al., "Design of an artificial immune system as a novel anomaly detector for combating financial fraud in the retail sector," in CEC '03. The 2003 Congress on Evolutionary Computation, 2003.
[13] M. Gadi, et al., "Credit Card Fraud Detection with Artificial Immune System," in Artificial Immune Systems, 2008.
[14] T. Jianyong, et al., "Artificial immune system for fraud detection," in Systems, Man and Cybernetics, 2004 IEEE International Conference on, 2004, vol. 2.
[15] A. Brabazon, et al., "Financial Classification using an Artificial Immune System," in Business Applications and Computational Intelligence, K. E. Voges and N. K. Pope, Eds. Idea Group Publishing, 2006.
[16] IP. (Retrieved June 12, 2009). The IP to Country Database. Available:
[17] L. Yu, et al., "An integrated data preparation scheme for neural network data analysis," IEEE Transactions on Knowledge and Data Engineering, vol. 18.
[18] D. W. Hosmer and S. Lemeshow, Applied logistic regression, 2nd ed. New York; Chichester: Wiley.
[19] J. S. Hamaker and L. Boggess, "Non-Euclidean distance measures in AIRS, an artificial immune classification system," in Evolutionary Computation, CEC2004. Congress on, 2004, vol. 1.
[20] C. Stanfill and D. Waltz, "Toward memory-based reasoning," Commun. ACM, vol. 29.
[21] D. R. Wilson and T. R. Martinez, "Improved Heterogeneous Distance Functions," Journal of Artificial Intelligence Research, vol. 6, pp. 1-34.
[22] H. Bernardino and H. Barbosa, "Grammar-Based Immune Programming for Symbolic Regression," in Artificial Immune Systems, 2009.
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems
A Content based Spam Filtering Using Optical Back Propagation Technique
A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
Dan French Founder & CEO, Consider Solutions
Dan French Founder & CEO, Consider Solutions CONSIDER SOLUTIONS Mission Solutions for World Class Finance Footprint Financial Control & Compliance Risk Assurance Process Optimization CLIENTS CONTEXT The
Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup
Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor
An Efficient Three-phase Email Spam Filtering Technique
An Efficient Three-phase Email Filtering Technique Tarek M. Mahmoud 1 *, Alaa Ismail El-Nashar 2 *, Tarek Abd-El-Hafeez 3 *, Marwa Khairy 4 * 1, 2, 3 Faculty of science, Computer Sci. Dept., Minia University,
Credit Card Fraud Detection using Hidden Morkov Model and Neural Networks
Credit Card Fraud Detection using Hidden Morkov Model and Neural Networks R.RAJAMANI Assistant Professor, Department of Computer Science, PSG College of Arts & Science, Coimbatore. Email: [email protected]
Data Mining Application for Cyber Credit-card Fraud Detection System
, July 3-5, 2013, London, U.K. Data Mining Application for Cyber Credit-card Fraud Detection System John Akhilomen Abstract: Since the evolution of the internet, many small and large companies have moved
A Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
A survey on Data Mining based Intrusion Detection Systems
International Journal of Computer Networks and Communications Security VOL. 2, NO. 12, DECEMBER 2014, 485 490 Available online at: www.ijcncs.org ISSN 2308-9830 A survey on Data Mining based Intrusion
Chapter 20: Data Analysis
Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification
Predicting Flight Delays
Predicting Flight Delays Dieterich Lawson [email protected] William Castillo [email protected] Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing
Electronic Payment Fraud Detection Techniques
World of Computer Science and Information Technology Journal (WCSIT) ISSN: 2221-0741 Vol. 2, No. 4, 137-141, 2012 Electronic Payment Fraud Detection Techniques Adnan M. Al-Khatib CIS Dept. Faculty of Information
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 [email protected]
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for
Data Mining Approach For Subscription-Fraud. Detection in Telecommunication Sector
Contemporary Engineering Sciences, Vol. 7, 2014, no. 11, 515-522 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2014.4431 Data Mining Approach For Subscription-Fraud Detection in Telecommunication
Recognizing Informed Option Trading
Recognizing Informed Option Trading Alex Bain, Prabal Tiwaree, Kari Okamoto 1 Abstract While equity (stock) markets are generally efficient in discounting public information into stock prices, we believe
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com [email protected]
Sage Pay Fraud Prevention Guide
Sage Pay Fraud Prevention Guide April 2014 Table of Contents 1.0 Introduction to fraud prevention 3 1.1 What are the fraud prevention tools 3 2.0 AVS/CV2 4 2.1 What is AVS/CV2 4 2.2 How it works 5 2.3
Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model
Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model Trung Le 1, Ba Quy Tran 2, Hanh Dang Thi My 3, Thanh Hung Ngo 4 1 GSR, Information System Lab., University of
Classification algorithm in Data mining: An Overview
Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department
DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM
INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM M. Mayilvaganan 1, S. Aparna 2 1 Associate
NEURAL NETWORKS IN DATA MINING
NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators
Equity forecast: Predicting long term stock price movement using machine learning
Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK [email protected] Abstract Long
Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India
Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three
NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE
www.arpapress.com/volumes/vol13issue3/ijrras_13_3_18.pdf NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE Hebah H. O. Nasereddin Middle East University, P.O. Box: 144378, Code 11814, Amman-Jordan
Less naive Bayes spam detection
Less naive Bayes spam detection Hongming Yang Eindhoven University of Technology Dept. EE, Rm PT 3.27, P.O.Box 53, 5600MB Eindhoven The Netherlands. E-mail:[email protected] also CoSiNe Connectivity Systems
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
Question 2 Naïve Bayes (16 points)
Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the
A Novel Binary Particle Swarm Optimization
Proceedings of the 5th Mediterranean Conference on T33- A Novel Binary Particle Swarm Optimization Motaba Ahmadieh Khanesar, Member, IEEE, Mohammad Teshnehlab and Mahdi Aliyari Shoorehdeli K. N. Toosi
An effective approach to preventing application fraud. Experian Fraud Analytics
An effective approach to preventing application fraud Experian Fraud Analytics The growing threat of application fraud Fraud attacks are increasing across the world Application fraud is a rapidly growing
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation
An Efficient Way of Denial of Service Attack Detection Based on Triangle Map Generation Shanofer. S Master of Engineering, Department of Computer Science and Engineering, Veerammal Engineering College,
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Your guide to getting the most from your card
Your guide to getting the most from your card Ulster Bank debitcard Welcome to your Ulster Bank debitcard Your Ulster Bank Debitcard is accepted in 30 million retail outlets around the world, wherever
Credit Card Conditions of Use. Credit Guide.
Credit Card Conditions of Use Credit Guide. Effective Date: September 2015 This document contains the terms and conditions that apply to St.George Bank Business Visa Debit Card cardholders and to all transactions
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
A Fast Fraud Detection Approach using Clustering Based Method
pp. 33-37 Krishi Sanskriti Publications http://www.krishisanskriti.org/jbaer.html A Fast Detection Approach using Clustering Based Method Surbhi Agarwal 1, Santosh Upadhyay 2 1 M.tech Student, Mewar University,
Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016
Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00
Intrusion Detection via Machine Learning for SCADA System Protection
Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. [email protected] J. Jiang Department
Healthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
SAS. Fraud Management. Overview. Real-time scoring of all transactions for fast, accurate fraud detection. Challenges PRODUCT BRIEF
PRODUCT BRIEF SAS Fraud Management Real-time scoring of all transactions for fast, accurate fraud detection Overview Organizations around the globe lose approximately 5 percent of annual revenues to fraud,
2695 P a g e. IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India
Integrity Preservation and Privacy Protection for Digital Medical Images M.Krishna Rani Dr.S.Bhargavi IV Semester M.Tech (DCN) SJCIT Chickballapur Karnataka India Abstract- In medical treatments, the integrity
HALIFAX CASH ISA. Conditions and information
HALIFAX CASH ISA. Conditions and information Welcome to Halifax 3 Section 1 How these conditions work 5 Section 2 Special Conditions 7 ISA Saver Variable 12 ISA Saver Online 13 ISA Saver Fixed 14 Junior
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
How To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
Review on Financial Forecasting using Neural Network and Data Mining Technique
ORIENTAL JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY An International Open Free Access, Peer Reviewed Research Journal Published By: Oriental Scientific Publishing Co., India. www.computerscijournal.org ISSN:
Travel Card. Cardholder Frequently Asked Questions. June 2014 T.FQ.S.20141.E
Travel Card Cardholder Frequently Asked Questions Travel Card (1) Where can I use my card? Your card may be used anywhere debit cards are accepted. The brand marks on your card indicate where the card
PayPoint.net Gateway Guide to Identifying Fraud Risks
PayPoint.net Gateway Guide to Identifying Fraud Risks Copyright PayPoint.net 2010 This document contains the proprietary information of PayPoint.net and may not be reproduced in any form or disclosed to
The Cyber Threat Profiler
Whitepaper The Cyber Threat Profiler Good Intelligence is essential to efficient system protection INTRODUCTION As the world becomes more dependent on cyber connectivity, the volume of cyber attacks are
Data Loss Prevention Best Practices to comply with PCI-DSS An Executive Guide
Data Loss Prevention Best Practices to comply with PCI-DSS An Executive Guide. Four steps for success Implementing a Data Loss Prevention solution to address PCI requirements may be broken into four key
Application of Data Mining Techniques in Intrusion Detection
Application of Data Mining Techniques in Intrusion Detection LI Min An Yang Institute of Technology [email protected] Abstract: The article introduced the importance of intrusion detection, as well as
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
An investigation into Data Mining approaches for Anti Money Laundering
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore An investigation into Data Mining approaches for Anti Money Laundering Nhien An
POINT OF SALE FRAUD PREVENTION
2002 IEEE Systems and Information Design Symposium University of Virginia POINT OF SALE FRAUD PREVENTION Student team: Jeff Grossman, Dawn Herndon, Andrew Kaplan, Mark Michalski, Carlton Pate Faculty Advisors:
Pattern-Aided Regression Modelling and Prediction Model Analysis
San Jose State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research Fall 2015 Pattern-Aided Regression Modelling and Prediction Model Analysis Naresh Avva Follow this and
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Bank of Scotland Private Banking Savings Accounts
Bank of Scotland Private Banking Savings Accounts Terms and Conditions Applicable to: Premier Investment Account Premier Reserve Account (for Personal Customers) Premier Reserve Account (for Trusts) 3
Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios
Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios By: Michael Banasiak & By: Daniel Tantum, Ph.D. What Are Statistical Based Behavior Scoring Models And How Are
Insider Threat Detection Using Graph-Based Approaches
Cybersecurity Applications & Technology Conference For Homeland Security Insider Threat Detection Using Graph-Based Approaches William Eberle Tennessee Technological University [email protected] Lawrence
Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification
Data Mining Applications in Manufacturing
Data Mining Applications in Manufacturing Dr Jenny Harding Senior Lecturer Wolfson School of Mechanical & Manufacturing Engineering, Loughborough University Identification of Knowledge - Context Intelligent
