Genetic algorithms for credit card fraud detection




SATVIK VATS*, SURYA KANT DUBEY, NAVEEN KUMAR PANDEY
Institute of Technology and Management, AL-1, Sector-7 GIDA, Gorakhpur, Uttar Pradesh, INDIA
E-mail address: Satvik.vats@gmail.com

Abstract: Due to the rapid growth of e-commerce, the use of credit cards for online purchases has increased dramatically, and credit card fraud has grown with it. Fraud is one of the major ethical issues in the credit card industry. As the credit card becomes the most popular mode of payment for both online and regular purchases, cases of fraud associated with it are also rising. In real life, fraudulent transactions are scattered among genuine transactions, and simple pattern matching techniques are often not sufficient to detect them accurately. Implementation of efficient fraud detection systems has thus become imperative for all credit card issuing banks to minimize their losses. Many modern techniques based on artificial intelligence, data mining, fuzzy logic, machine learning, sequence alignment, genetic programming, etc., have evolved for detecting fraudulent credit card transactions. A genetic algorithm is an evolutionary search and optimisation technique that mimics natural evolution to find the best solution to a problem. Here the characteristics of credit card transactions undergo evolution to allow a modelled credit card fraud detection system to be tested.

Key-Words: Electronic commerce, fraud, credit card, genetic algorithms, detection

1 Introduction
In recent history information technology has become far more pervasive in everyone's lives. As reliance on software products increases, so does the pressure to ensure that they work reliably and as expected. This is why software testing has risen to the forefront of public attention, with notable instances such as the iPhone alarm bug [3].

In 1998, the Data Protection Act changed the way data can be used [13]. Until then, developers working in industry in the UK had simply made copies of customer data and used them in an often less secure development environment. The Act introduced legislation intended to give more rights to individuals whose data is being held, and restricts the uses to which the data can be put. For example, a company wishing to outsource some of its development may not have the right to pass on its customers' data, hence doing so would be a violation of the Act.

The company Grid-Tools Limited is our industrial partner for this project. They offer professional solutions for automatic test data generation in the form of a tool called Data Maker. This tool was originally written to address the increasing size of data sets required by industry. As the set size increases it becomes impractical to create the data by hand, so automatic methods had to be found. Data Maker is capable of generating synthetic data that conforms to the requirements of a test engineer's specification. The data created is not regulated by the Data Protection Act, because it has been generated rather than gathered: it is not related to any individual and hence not covered by the legislation.

The use of synthetic data has advantages in its own right: large sets of data can be created, with their composition tailored to meet test coverage criteria. A set of real-world data may not do this, as it is likely to be of relatively constant composition and so does not test all aspects of the program. A limitation of Data Maker is that it can only produce linear sets of data from its built-in functions. The systematic testing of some software, however, requires data sets with trends. A typical example of such a system is a credit card fraud detection system. To test such a system thoroughly one would require a large set of realistic transactions, both legitimate and fraudulent. For the reasons discussed above real data should not be used, so a way to generate such data must be found.

2 Related work
Genetic algorithms are a heuristic used to solve high-complexity computational problems. Apart from modeling phenomena occurring in nature, they help in optimization, simulation, modeling, design and prediction in science, medicine, technology, and everyday life [14]. A recent survey of the state of the art was carried out for the Materials and Manufacturing Processes journal in 2009 by Paszkowicz [14].
As the name of the journal suggests, the survey was only concerned with the application of genetic algorithms to problems in chemistry and physics, but it nonetheless highlighted some innovative uses. One cited example was to help the design process of new materials, in particular with regard to a reverse heat transfer problem. The problem consists of finding a material with desirable thermal properties that give rise to a good temperature field profile. For a particular material, well known equations can be used to calculate the temperature profile, but because of their complex nature the process cannot easily be reversed to find optimal parameters. This is an area where evolutionary search often excels, as a later example, in which the search is applied to an NP-complete problem, also shows. The algorithm used in this case modeled a liquid material that was being heated linearly on its surface. The input to the algorithm, its initial population, consisted of the properties of already known similar liquids. The output computed for each liquid was the temperature field and the cooling rate. The algorithm returned good results, which were later confirmed experimentally.

Still in the same materials engineering survey, evolutionary search has been applied to the mechanical process of welding. To produce a strong weld several parameters have to be optimized, such as current, voltage, torch speed, arc gap, shielding gas and its flow rate, and the type and geometry of the electrode. It can already be seen how this optimization process lends itself to the application of a genetic algorithm, and once again good results were found for what would have been an expensive experimental process. Not only did the results of the optimization provide a better set of welding parameters, they also shed light on the transformation of the metal during the weld. This had already been described theoretically, but the results from the algorithm helped to bring calculations and experimental results closer together.

In a purely theoretical area, genetic algorithms have been applied to find approximate solutions to the travelling salesman problem. Scaling became an issue as the number of cities the salesman had to visit increased. Braun [4] reported that the algorithm could generate very good but not optimal solutions for travelling salesman problems with 442 to 531 cities. Using a standard SUN workstation they could optimally solve problems with up to 442 cities in under thirty minutes. The biggest problem examined was 666 cities, which could be solved approximately with a journey 0.04% longer than the optimum route. Potvin also analyzed the travelling salesman problem with genetic algorithms [6]. The biggest problem reported in his survey was one million cities, solved to within 4% of an optimal route. This took four hours on a powerful computer. He identified the role played by the crossover operator in the outcome, with performance being significantly affected by the reordering of the tour.

Perhaps the most well known application of machine learning is robotic movement. Schultz [1] applied the algorithm so that autonomous robots could navigate and avoid collisions with obstacles in their path. An innovative part of his work was once again aimed at saving cost and time, similar to the welding example above. The task set for the autonomous robot was to navigate from a start point to an end point along no pre-planned route, avoiding randomly placed obstacles on its way.

3. Fraud Detection using Genetic Algorithm
Genetic algorithms are evolutionary algorithms which aim at obtaining better solutions as time progresses.
Since their first introduction by Holland [5], they have been successfully applied to many problem domains, from astronomy to sports [2] and from optimization [8] to computer science [7]. They have also been used in data mining, mainly for variable selection [10], and are mostly coupled with other data mining algorithms. In this study, we try to solve our classification problem by using only a genetic algorithm. In this module the system must detect whether or not any fraud has occurred in the transaction, and it must display the result to the user.

In the following we clarify the concept of genetic algorithms using our own example based on boundary value testing. We implemented this algorithm in Java and can successfully generate inputs for the test. A genetic algorithm is a paradigm often used to search vast and poorly understood search spaces. With well defined functions the algorithm will converge on the area of the search space which holds the optimal solution. This example is a very simple instance of the algorithm that searches for a set of optimum inputs for black box testing. The function being tested checks whether or not a value x is within the range 0 ≤ x ≤ 8. Boundary value testing is concerned with selecting the following input values:

Maximum.
Maximum minus one.
Nominal middle value.
Minimum plus one.
Minimum.

Fig.1 Selection of inputs for 0 ≤ x ≤ 8

These test cases will exercise the program to detect any errors, particularly those that are off by one. For simplicity we will assume that the correct input values are known.

A generic genetic algorithm [11]

    SimpleGeneticAlgorithm() {
        initialise population;
        evaluate population;
        while (termination criteria not met) {
            select solutions for next generation;
            perform crossover and mutation;
            evaluate population;
        }
    }

In the same way as a chromosome is the basic building block of nature, so it is of a genetic algorithm. The chromosome is an encoded statement of the data which one wishes to optimise. In our example the chromosome represents a tuple of all of the input values, and it is encoded as a binary string. The reason for this choice will become clear when further genetic operators are considered. In our example, the inputs 8, 7, 3, 1 and 0 would be encoded as their binary equivalents and concatenated: 1000 0111 0011 0001 0000.

A set of chromosomes makes up the population of the algorithm. Our algorithm is started with a randomly generated population of chromosomes.

The second task is to write a function to compare the relative merit of chromosomes. The following pseudo-code shows the fitness calculation over an encoding of five bytes, each representing an input integer.

Evaluation of the fitness of a chromosome

    int fitness(Chromosome input) {
        int fitness = 0;
        int[] ideal = {A, B, C, D, E};    // Array of ideal inputs A to E.
        int[] actual = input.toArray();   // Values retrieved from the chromosome.
        for (int i = 0; i < ideal.length; i++) {
            // Accumulate the distance of each input from its ideal value.
            fitness += Math.abs(actual[i] - ideal[i]);
        }
        return fitness;
    }

Crossover is the operator used to reproduce chromosomes. It works by taking a pair of encoded chromosomes (the parents) and combining them to produce two different chromosomes (the progeny). When applied to two fit chromosomes this method aims to produce progeny that have inherited the best attributes of their parents, though this is not always the case. To illustrate the principle, consider two chromosomes and assume a central crossover: if the parents are 0011 and 1100, the two progeny will be 0000 and 1111 respectively (see Figure 2). This should make it clear that the bit string is simply crossed, as the name suggests. Depending on the encoding, crossing in the middle of the chromosome may not be likely to give rise to fit progeny; where this is the case other points may be chosen, or indeed more than one point. Another common solution is to select a random point up to the length of the chromosome, and cross there.

Fig.2 Diagram of crossover.

Crossover pseudo-code

    Chromosome[] crossover(Chromosome parentX, Chromosome parentY) {
        int crossPoint = 8;
        String xFirstHalf = parentX.substring(0, crossPoint);
        String xSecondHalf = parentX.substring(crossPoint, parentX.length);
        String yFirstHalf = parentY.substring(0, crossPoint);
        String ySecondHalf = parentY.substring(crossPoint, parentY.length);
        Chromosome crossedX = xFirstHalf + ySecondHalf;
        Chromosome crossedY = yFirstHalf + xSecondHalf;
        return new Chromosome[] { crossedX, crossedY };   // the two progeny
    }

Mutation is essential to a true genetic algorithm. In popular culture mutation is often viewed in a negative light: simply consider how many horror films are based around some kind of mutant! In fact, without mutation neither the world as we know it nor our algorithms would evolve efficiently. Mutation is defined as a minimal change to a chromosome, so when one is using a binary string representation a single bit is often flipped. These changes are usually applied at the end of each generation, before the breeding pool and population are combined again, but only with a very small probability of each chromosome being affected. If this were not done, no new genetic information would be produced after the initial population; note that crossover does not create anything, it just recombines existing chromosomes. Without new chromosomes the algorithm is likely to stop with a suboptimal population, or run indefinitely without converging on a solution. If, on the other hand, mutation levels are set too high, the stream of new chromosomes could be too large, disrupting any convergent progress. If mutation were set to affect every chromosome in each generation and crossover removed, the search would become completely stochastic.

Mutation pseudo-code

    Chromosome mutate(Chromosome chromosome) {
        int randomValue = new Random().nextInt(chromosome.length);   // position of the bit to flip
        if (chromosome.valueAt(randomValue) == 0)
            chromosome.setValueAt(randomValue, 1);
        else
            chromosome.setValueAt(randomValue, 0);
        return chromosome;
    }

3.1 Mathematical model
The chromosome is the logical unit of information transmission to the next generation [12]. The definition of a chromosome can be taken a little deeper. Usually the chromosome holds a binary encoding of the optimization subject. Where this is the case the genetic algorithm is considered discrete, as only a fixed set of values can be assumed. In some cases, such as modeling temperature, an encoding over the real numbers is more appropriate, creating a continuous genetic algorithm. For natural selection to take place, some way of comparing one chromosome to another must be available.
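The pieces introduced so far for the boundary value example (a binary encoding, a cost that compares a chromosome against the ideal inputs, one-point crossover and bit-flip mutation) can be sketched together in Java. This is only an illustrative sketch, not the authors' implementation: the class name `BoundaryGaSketch`, the use of plain `String` bit strings instead of a `Chromosome` type, and the 4-bit width per input (taken from the example encoding 1000 0111 0011 0001 0000) are assumptions made here.

```java
import java.util.Random;

/** Illustrative sketch of the binary-string operators for the 0 <= x <= 8 example. */
final class BoundaryGaSketch {
    // Assumption: each input is stored in 4 bits, matching "1000 0111 0011 0001 0000".
    static final int BITS_PER_INPUT = 4;
    // The input values quoted in the text: 8, 7, 3, 1 and 0.
    static final int[] IDEAL = {8, 7, 3, 1, 0};

    /** Encodes each value (0..15) as a fixed-width binary string and concatenates them. */
    static String encode(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            String bits = Integer.toBinaryString(v);
            for (int i = bits.length(); i < BITS_PER_INPUT; i++) {
                sb.append('0'); // left-pad to the fixed width
            }
            sb.append(bits);
        }
        return sb.toString();
    }

    /** Decodes a concatenated bit string back into its integer inputs. */
    static int[] decode(String chromosome) {
        int[] values = new int[chromosome.length() / BITS_PER_INPUT];
        for (int i = 0; i < values.length; i++) {
            String bits = chromosome.substring(i * BITS_PER_INPUT, (i + 1) * BITS_PER_INPUT);
            values[i] = Integer.parseInt(bits, 2);
        }
        return values;
    }

    /** Cost of a 20-bit chromosome: summed distance from the ideal inputs (0 is best). */
    static int cost(String chromosome) {
        int[] actual = decode(chromosome);
        int total = 0;
        for (int i = 0; i < IDEAL.length; i++) {
            total += Math.abs(actual[i] - IDEAL[i]);
        }
        return total;
    }

    /** One-point crossover: the parents swap everything after crossPoint. */
    static String[] crossover(String x, String y, int crossPoint) {
        return new String[] {
            x.substring(0, crossPoint) + y.substring(crossPoint),
            y.substring(0, crossPoint) + x.substring(crossPoint)
        };
    }

    /** Flips the bit at the given index. */
    static String flipBit(String chromosome, int index) {
        char[] bits = chromosome.toCharArray();
        bits[index] = (bits[index] == '0') ? '1' : '0';
        return new String(bits);
    }

    /** Bit-flip mutation at a uniformly random position. */
    static String mutate(String chromosome, Random rng) {
        return flipBit(chromosome, rng.nextInt(chromosome.length()));
    }

    public static void main(String[] args) {
        String ideal = encode(IDEAL);
        System.out.println(ideal + " has cost " + cost(ideal));
        String[] progeny = crossover("0011", "1100", 2); // central crossover from the text
        System.out.println(progeny[0] + " " + progeny[1]);
        System.out.println(mutate(ideal, new Random()));
    }
}
```

Representing chromosomes as strings keeps the sketch close to the `substring`-based pseudo-code; a real implementation would more likely use a `BitSet` or a `boolean[]` to avoid repeated string copies.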
In the algorithm, the comparison between chromosomes is modeled as a fitness or cost function, where a lower-cost chromosome is favored over a higher-cost one. The cost function is a mapping chromosome → R, where a value closer to zero indicates a better optimized chromosome. The formalization that follows has been drawn from work by Bäck [12] and Vose [9]. We begin by considering the algorithm at the highest level. It can be considered a finite state machine, where each state represents an arbitrary generation of the population at a time t. Between these states there is a transition, τ, to the next generation. The algorithm can be considered as a function with parameters, as shown in Equation (1).

$$\text{Genetic Algorithm} = (I, \Phi, \Omega, s, \mu, \lambda, \tau, \iota) \qquad (1)$$

In this representation, the following notation is used:

I is the space of chromosomes, or the underlying search space. Each chromosome is of length l.
Φ is a cost function I → R.
Ω represents a set of probabilistic genetic operators. We will specify these shortly.
s represents a deterministic selection operator. A side effect of this operator is ensuring that the population size remains constant.
µ is the number of parent individuals to include in reproduction.
λ is the number of offspring individuals from reproduction.
τ represents the complete process of transitioning from one generation to the next. This will be expanded shortly.
ι represents an arbitrary termination condition.

Initialization of I is carried out by a function randomly sampling integers from the range [0, 2), that is, the bit values 0 and 1. This is done l × µ times. To relate this model to the finite state machine outlined above, we will clarify the operation of τ. Consider the population P at a generation t, P(t):

$$\forall t \geq 1 : P(t+1) = \tau(P(t)) \qquad (2)$$

The termination condition, ι, can be as simple or complex as required. For this analysis we will assume it is simply a maximum generation count, and that functionality to maintain this count is provided. In an implementation this can be combined with an average cost value, a relative improvement or a threshold standard deviation of the population.

We now return our analysis to the genetic operators, Ω. In the set of reproductive functions we have recombination (crossover) and mutation. These can be considered as sexual and asexual operators respectively, characterized by the number of input chromosomes used. Because of its simpler nature we will first consider mutation. It can be modeled as a function ω: I^p → I^q, where a chromosome in I is written as a binary vector. This means an arbitrary chromosome can be written as (a_1, ..., a_l), where l is the length of the binary string. Mutation is the smallest unique change that can be made to a chromosome. By definition, mutation over a binary chromosome is the random change of one bit. To model this, a random bit position k is selected in the chromosome, 0 < k ≤ l, k ∈ N, and that bit is flipped. The function then looks like this:

$$(a_1, \ldots, a_l) \mapsto (a_1, \ldots, \neg a_k, \ldots, a_l) \qquad (3)$$

Crossover is the recombination of two chromosomes without loss of information. Crossover works by taking two chromosomes and swapping the values after a random cross point. To do this, once again a random position k is selected, 0 < k ≤ l, k ∈ N, and the function looks like this:

$$(a_1, \ldots, a_l), (b_1, \ldots, b_l) \mapsto (a_1, \ldots, a_k, b_{k+1}, \ldots, b_l), (b_1, \ldots, b_k, a_{k+1}, \ldots, a_l) \qquad (4)$$

Elitism is a property that prevents the current best chromosomes from participating in mutation. Many algorithms implement elitism, as it prevents the fitness of a population from decaying. If the population is in a suboptimal area of the search space, the best solution is retained until mutation makes a selection closer to the global optimum. From there normal evolution can continue.

3.2 Example run
To better illustrate the operation of a genetic algorithm, we shall dry-run an example of our own. For simplicity we will use a five by five grid, as shown in Figure 3. The optimal square is shaded in the centre. The algorithm is set up as follows:

Chromosomes. Four chromosomes will be defined, each of which is a coordinate (x, y) on the grid.
Cost function. The number of squares the chromosome is away from the optimum square is used.
Crossover. The two most optimal chromosomes go to the next generation unchanged, and two new ones are created as:

$$(x_1, y_1), (x_2, y_2) \mapsto (y_2, x_1), (y_1, x_2) \qquad (5)$$

Mutation. Not implemented.
Initial population. Randomly instantiated.
Elitism. Implemented.

Fig.3 Search space. (a) Chromosomes. (b) Search space.

The initial population, generation one, is shown in Figure 4. To progress to generation two, the two best chromosomes are kept unchanged, and two new ones are created by crossover. This is shown in Figure 5. Clearly the chromosome of cost one is selected, and as the remaining three have the same cost, we select the first, (5, 4). The same process is iterated again, creating generation three and giving rise to one optimal chromosome. This is shown in Figure 6.

Fig.4 End of generation one. (a) Chromosomes. (b) Search space.

Fig.5 End of generation two. (a) Chromosomes. (b) Search space.
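The example run can be reproduced in a few lines of Java. The sketch below is an assumption-laden illustration rather than the authors' program: the figures with the actual starting coordinates did not survive, so the initial population here is made up (only the point (5, 4) is mentioned in the text), the optimum is taken to be the centre square (3, 3) with 1-based coordinates, and "number of squares away" is interpreted as Manhattan distance. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch of the 5x5 grid run: elitism plus the crossover of Equation (5). */
final class GridGaSketch {
    // Assumption: 1-based coordinates, so the centre of a 5x5 grid is (3, 3).
    static final int OPT_X = 3;
    static final int OPT_Y = 3;

    /** Cost: distance from the optimum square (Manhattan distance, an assumption). */
    static int cost(int[] c) {
        return Math.abs(c[0] - OPT_X) + Math.abs(c[1] - OPT_Y);
    }

    /** Keeps the two best chromosomes and adds two children built as in Equation (5). */
    static List<int[]> nextGeneration(List<int[]> population) {
        List<int[]> sorted = new ArrayList<>(population);
        // List.sort is stable, so ties keep their original order ("we select the first").
        sorted.sort(Comparator.comparingInt(GridGaSketch::cost));
        int[] a = sorted.get(0); // (x1, y1)
        int[] b = sorted.get(1); // (x2, y2)
        List<int[]> next = new ArrayList<>();
        next.add(a);                         // elitism: best two pass through unchanged
        next.add(b);
        next.add(new int[] {b[1], a[0]});    // (y2, x1)
        next.add(new int[] {a[1], b[0]});    // (y1, x2)
        return next;
    }

    public static void main(String[] args) {
        // Hypothetical initial population: one chromosome of cost one, three of equal cost.
        List<int[]> population = List.of(
            new int[] {5, 4}, new int[] {3, 4}, new int[] {1, 2}, new int[] {4, 1});
        for (int generation = 1; generation <= 3; generation++) {
            StringBuilder line = new StringBuilder("generation " + generation + ":");
            for (int[] c : population) {
                line.append(" (").append(c[0]).append(',').append(c[1])
                    .append(") cost ").append(cost(c));
            }
            System.out.println(line);
            population = nextGeneration(population);
        }
    }
}
```

With this made-up starting population the third generation contains the centre square, which mirrors the behaviour described in the text; a different starting population or distance measure could need more generations, or stall without mutation.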

Fig.6 End of generation three.

One can see from this that the search space is systematically sampled, the best chromosomes are selected, and their traits are passed on into the next generation. The algorithm does work without mutation being implemented; in this case it was left out in the interests of minimizing the generation count. In a larger example mutation would become necessary to prevent the evolution stagnating in a suboptimal area, as without it convergence cannot be proven.

4. Flow of Genetic algorithm
Initially the population is selected randomly from the sample space, which contains many populations. The fitness value is calculated for each chromosome in each population, and the chromosomes are sorted by it. In the selection process two parent chromosomes are selected through the tournament method. Crossover forms new offspring (children) from the parent chromosomes using single-point crossover. Mutation mutates the new offspring using a uniform probability measure. In elitism selection the best solutions are passed to the next generation. The new population is generated and undergoes the same process until the maximum number of generations is reached.

4.1 Selection process
Selection is used for choosing the best individuals, that is, for selecting those chromosomes with higher fitness values. The selection operation takes the current population and produces a mating pool which contains the individuals that are going to reproduce. There are several selection methods, such as biased selection, random selection, roulette wheel selection and tournament selection. In this work the following selection mechanisms are used.

4.2 Tournament Selection
Tournament selection has been used in this work as it selects optimal individuals from diverse groups. It selects t individuals from the current population uniformly at random to form a tournament; the best individual of the group wins the tournament and is put into the mating pool for recombination.
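A minimal Java sketch of tournament selection as just described: draw t individuals uniformly at random, keep the fittest, and repeat until the mating pool is full. The names `TournamentSelectionSketch`, `tournament` and `matingPool`, the representation of the population as an array of fitness values, and sampling with replacement are all assumptions made for this illustration; higher fitness is treated as better, following this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sketch of tournament selection over an array of fitness values. */
final class TournamentSelectionSketch {

    /**
     * Runs one tournament of size t and returns the index of the winner.
     * Participants are drawn uniformly at random, with replacement (an assumption).
     */
    static int tournament(double[] fitness, int t, Random rng) {
        int best = rng.nextInt(fitness.length);
        for (int i = 1; i < t; i++) {
            int challenger = rng.nextInt(fitness.length);
            if (fitness[challenger] > fitness[best]) {
                best = challenger; // higher fitness wins the tournament
            }
        }
        return best;
    }

    /** Repeats the tournament until the mating pool holds poolSize individuals. */
    static List<Integer> matingPool(double[] fitness, int t, int poolSize, Random rng) {
        List<Integer> pool = new ArrayList<>();
        while (pool.size() < poolSize) {
            pool.add(tournament(fitness, t, rng));
        }
        return pool;
    }

    public static void main(String[] args) {
        double[] fitness = {0.2, 0.9, 0.5, 0.7}; // hypothetical fitness values
        List<Integer> pool = matingPool(fitness, 2, 4, new Random());
        System.out.println("mating pool (indices into the population): " + pool);
    }
}
```

With t = 1 the method degenerates to uniform random selection, while a very large t almost always returns the fittest individual, which is one way to see why the tournament size acts as the selection-strength parameter.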
The tournament is repeated as many times as necessary to reach the desired size of the intermediate population. The tournament size controls the selection strength: the larger the tournament size, the stronger the selection pressure.

4.3 Elitist Selection

This selection operator is used to make sure that the best individuals are passed on to further generations and are not lost through random selection. A few of the best chromosomes from each generation, chosen by their higher fitness values, are passed unchanged to the next generation of the population.

4.4 Reproduction
The next step is to generate a second-generation population of solutions from those selected, through the genetic operators: crossover (also called recombination) and/or mutation. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. Although reproduction methods that are based on the use of two parents are more "biology inspired", some research suggests that using more than two "parents" produces better quality chromosomes.

These processes ultimately result in a next-generation population of chromosomes that is different from the initial generation. Generally the average fitness of the population will have increased by this procedure, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions. Although crossover and mutation are known as the main genetic operators, it is possible to use other operators such as regrouping, colonization-extinction, or migration in genetic algorithms.

4.5 Termination
This generational process is repeated until a termination condition has been reached.
Common terminating conditions are:

A solution is found that satisfies minimum criteria
A fixed number of generations is reached
The allocated budget (computation time/money) is reached
The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
Manual inspection
Combinations of the above

5. Conclusion
This method proves accurate in detecting fraudulent transactions and minimizing the number of false alerts. The genetic algorithm is novel in this literature in terms of application domain. If this algorithm is applied in a bank's credit card fraud detection system, the probability of a fraudulent transaction can be predicted soon after the credit card transaction takes place, and a series of anti-fraud strategies can be adopted to protect banks from great losses and reduce risks. The objective of the study differed from typical classification problems in that we had a variable misclassification cost. As the standard data mining algorithms do not fit this situation well, we decided to use a multi-population genetic algorithm to obtain an optimized parameter.

Future Enhancements
The findings obtained here may not generalize to the global fraud detection problem. As future work, an effective algorithm that performs well for classification problems with variable misclassification costs could be developed.

REFERENCES
[1] Alan C Schultz. Learning robot behaviours using genetic algorithms. Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory, Washington, 1994.
[2] Charbonneau P., Genetic Algorithms in Astronomy and Astrophysics. High Altitude Observatory, National Center for Atmospheric Research, pp. 309-334, 1995.
[3] Dr Markus Roggenbach. CS364 Software testing slides. Swansea University, 2011.
[4] Heinrich Braun. On solving travelling salesman problems by genetic algorithms. Springer Berlin / Heidelberg, 1991.
[5] Holland J., Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975.
[6] Jean-Yves Potvin. Genetic Algorithms for the Travelling Salesman Problem. Centre de Recherche sur les Transports, 1996.
[7] Kaya M., Autonomous Classifiers with Understandable Rule Using Multi-objective Genetic Algorithms. Expert Systems with Applications, Vol. 37, no. 4, pp. 3489-3494, 2009.
[8] Levi M., Burrows J., Fleming M., Hopkins M., The Nature, Extent and Economic Impact of Fraud in the UK. Report for the Association of Chief Police Officers' Economic Crime Portfolio, 2007.
[9] Michael Vose. The Simple Genetic Algorithm. Massachusetts Institute of Technology, 1999.
[10] Minaei-Bidgoli B., Kashy D., Kortemeyer G., Punch W., Predicting Student Performance: An Application of Data Mining Methods with the Educational Web-based System LON-CAPA. Proceedings of ASEE/IEEE Frontiers in Education Conference, 2003.
[11] Srinivas M., Patnaik L., Genetic algorithms - a survey. IEEE Computer Society, 1994.
[12] Thomas Bäck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, 1996.
[13] UK Statute Law. Data Protection Act 1998. Office of Public Sector Information, 1998.
[14] Wojciech Paszkowicz. Genetic Algorithms, a Nature-Inspired Tool: Survey of Applications in Materials Science and Related Fields. Taylor and Francis Group, 2009.