Detecting Auto Insurance Fraud by Data Mining Techniques




Rekha Bhowmik
Computer Science Department, University of Texas at Dallas, USA
rxb080100@utdallas.edu

ABSTRACT

The paper presents a fraud detection method to predict and analyze fraud patterns from data. To generate classifiers, we apply Naïve Bayesian classification and decision tree-based algorithms. A brief description of each algorithm is provided along with its application in detecting fraud. The same data is used for both techniques. We analyze and interpret the classifier predictions. The model predictions are supported by Naïve Bayesian visualization, decision tree visualization, and rule-based classification. We evaluate techniques to solve fraud detection in automobile insurance.

Keywords: Rule-based Algorithm, Bayesian Network, C4.5, Fraud Detection

1. INTRODUCTION

There are a number of data mining techniques, such as clustering, neural networks, regression, and multiple predictive models. Here, we discuss only the few data mining techniques that we consider important for handling fraud detection. Data mining is associated with (a) supervised learning based on training data of known fraud and legal cases and (b) unsupervised learning with data that are not labeled as fraud or legal. Benford's law can be interpreted as an example of unsupervised learning [1].

Insurance fraud, credit card fraud, telecommunications fraud, and check forgery are some of the main types of fraud. Insurance fraud is common in automobile and travel insurance. Fraud detection involves three types of offenders: i) criminal offenders, ii) organized criminal offenders who are responsible for major fraud, and iii) offenders who commit fraud (called soft fraud) when suffering from financial hardship. Soft fraud is the hardest to reduce because the cost for each suspected incident is usually higher than the cost of the fraud. Offenders of types i) and ii), who commit what is called hard fraud, avoid anti-fraud measures [2].

We present the data mining techniques that are most appropriate for fraud analysis, using an automobile insurance example.
Here, the data mining techniques used for fraud analysis are: i) Bayesian network, and ii) decision tree. The Bayesian network is the technique used for the classification task. Classification, given a set of predefined categorical classes, determines which of these classes a specific data item belongs to. Decision trees are used to create descriptive models, which describe the characteristics of fraud.

The remainder of this paper is organized as follows. In Section 2, we present the existing fraud detection systems and techniques. Section 3 includes the algorithms and their application. Section 4 presents the model performance. Finally, in Section 5, we discuss the important features of our work.

2. EXISTING FRAUD DETECTION SYSTEMS

The hot spots methodology [3] performs a three-step process: i) the k-means clustering algorithm is used for cluster detection, because other clustering algorithms tend to be expensive for very large datasets; ii) the C4.5 algorithm is applied, and the resulting decision tree can be converted to a rule set and pruned; and iii) visualization tools are used for rule evaluation, building statistical summaries of the entities associated with each rule. The credit fraud model [4] suggested a classification technique when a fraud/legal attribute is available, and clustering followed by classification when there is no fraud/legal attribute. Kohonen's Self-Organizing Feature Map [5] was used to categorize automobile injury claims depending on the type of fraud. Classification techniques have proved to be very effective in fraud detection [6] and can therefore be applied to categorize crime data. The distributed data mining model [6] uses a realistic cost model to evaluate C4.5, CART, and naïve Bayesian classification models. The method was applied to credit card transactions. The neural data mining approach [7] uses rule-based association rules to mine symbolic data and discusses the importance of using non-numeric data in fraud detection. SAS Enterprise Miner Software [8] depends on association rules, cluster detection, and classification techniques to detect fraudulent claims.
The Bayesian Belief Network (BBN) and Artificial Neural Network (ANN) study used the STAGE algorithm for BBNs in fraud detection and backpropagation for ANNs [9]. The results show that BBNs were much faster to train, but were slower when applied to new instances. The ASPECT group [10] focused on neural networks to train current user profiles and user profile histories. A caller's current profile and the profile history are compared to find probable fraud. [11] builds on the adaptive fraud detection framework [12, 13] by applying an event-driven approach of assigning fraud scores to detect fraud. The framework of [11] can also detect types of fraud using rules. [14] used dynamic BBNs, called the Mass Detection Tool, to detect fraudulent claims, which then used a rule generator called the Suspicion Building Tool.

Internal fraud detection consists of determining fraudulent financial reporting by management [15] and abnormal retail transactions by employees [16]. There are four types of insurance fraud detection: home insurance [17], crop insurance [18], automobile insurance [19], and health insurance [20]. A single meta-classifier [21] is used to select the best base classifiers, and is then combined with these base classifiers' predictions to improve cost savings. Credit card fraud detection refers to screening credit applications and/or logged credit card transactions [22]. Credit transactional fraud detection has been presented by [22]. The literature also focuses on video-on-demand websites [23] and IP-based telecommunication services [24]. Online sellers [25] and online buyers [26] can be monitored by automated systems. Fraud detection in government organisations such as tax [27] and customs [28] has also been reported.

2.1 Bayesian Belief Networks

Naïve Bayesian classification assumes that the attributes of an instance are independent, given the target attribute [29]. The aim is to assign a new instance to the class that has the highest posterior probability. The algorithm is very effective and can give better predictive accuracy when compared to C4.5 decision trees and backpropagation.

2.2 Decision Trees

Decision trees are machine learning techniques that express independent attributes and a dependent attribute in a tree-shaped structure. Classification rules, extracted from decision trees, are IF-THEN expressions in which the preconditions are logically ANDed and all the tests have to succeed for a rule to fire. The related applications range from the analysis of instances of drug smuggling, governmental financial transactions [30], and customs declaration fraud [28] to more serious crimes such as drug-related homicides, serial sex crimes [31], and homeland security [31, 30].
C4.5 [32] is used to divide data into segments and to generate descriptive classification rules that can be used to classify a new instance. C4.5 can help to make predictions and to extract crime patterns. It generates rules from trees [33] and handles numeric attributes, missing values, pruning, and estimating error rates. The learning and classification steps are generally fast. However, performance can decrease when C4.5 is applied to large datasets. C5.0 shows marginal improvements to decision tree induction.

3. APPLICATION

The steps in crime detection are: i) build classifiers, ii) integrate multiple classifiers, iii) apply an ANN approach to clustering, and iv) use visualization techniques to describe the patterns.

3.1 Bayesian Network

For the purpose of fraud detection, we construct two Bayesian networks to describe behavior in auto insurance. First, a Bayesian network is constructed to model behavior under the assumption that the driver is fraudulent, and another is constructed under the assumption that the driver is legal. The fraud net is set up using expert knowledge. The legal net is set up using data from legal drivers. By inserting evidence into these networks, we can obtain the probability of the measurement $E$ under the two hypotheses. That is, we obtain judgments of the degree to which an observed user behavior matches typical fraudulent or legal behavior. We call these quantities $P(E \mid \text{output} = \text{legal})$ and $P(E \mid \text{output} = \text{fraud})$. By postulating the prior probability of fraud, $P(\text{output} = \text{fraud})$, with $P(\text{output} = \text{legal}) = 1 - P(\text{output} = \text{fraud})$, and by applying Bayes' rule, we get the probability of fraud given the measurement $E$:

$$P(\text{output} = \text{fraud} \mid E) = \frac{P(\text{output} = \text{fraud}) \, P(E \mid \text{output} = \text{fraud})}{P(E)}$$

where the denominator $P(E)$ can be calculated as:

$$P(E) = P(\text{output} = \text{fraud}) \, P(E \mid \text{output} = \text{fraud}) + P(\text{output} = \text{legal}) \, P(E \mid \text{output} = \text{legal})$$

The chain rule of probabilities is:

$$P(E_1, E_2, \ldots, E_n) = \prod_{k=1}^{n} P(E_k \mid E_1, \ldots, E_{k-1})$$

Suppose there are two outputs, $O_1$ and $O_2$, for fraud and legal respectively.
Given an instance $E = (E_1, E_2, \ldots, E_n)$, each row is represented by an attribute vector $A = (A_1, A_2, \ldots, A_n)$. The classification task is to find the output $O_i$ that maximizes $P(O_i \mid E)$, which can be derived from Bayes' theorem.

3.1.1 Application

We present a Bayesian learning algorithm to predict the occurrence of fraud. Consider the two output attributes, fraud and legal.

i) The general equations for computing the probability that the output attribute is fraud or legal are:

$$P(\text{output} = \text{fraud} \mid E) = \frac{P(E \mid \text{output} = \text{fraud}) \, P(\text{output} = \text{fraud})}{P(E)}$$

$$P(\text{output} = \text{legal} \mid E) = \frac{P(E \mid \text{output} = \text{legal}) \, P(\text{output} = \text{legal})}{P(E)}$$

ii) The a priori probability, $P(\text{output} = \text{fraud})$, is the probability that a customer is fraudulent without knowing the history of the instance. Here, the a priori probability is the fraction of the total population that is fraud:

$$P(\text{fraud}) = d_i / d$$

where $d$ is the total population and $d_i$ is the number of fraud records.

iii) A simplifying assumption of no dependence between attributes is made. Thus,

$$P(E \mid \text{output} = \text{fraud}) = \prod_{k=1}^{n} P(x_k \mid \text{output} = \text{fraud})$$

$$P(E \mid \text{output} = \text{legal}) = \prod_{k=1}^{n} P(x_k \mid \text{output} = \text{legal})$$

The probabilities $P(x_1 \mid \text{output} = \text{fraud})$, $P(x_2 \mid \text{output} = \text{fraud})$, and so on can be estimated from the database using:

$$P(x_k \mid \text{output} = \text{fraud}) = d_{ik} / d_i$$

Here, $d_i$ is the number of records for output fraud and $d_{ik}$ is the number of records of output class fraud having the value $x_k$ for attribute $k$.

iv) Repeat step iii) to compute $P(E \mid \text{output} = \text{legal})$. Only the products $P(E \mid \text{output} = \text{fraud}) \, P(\text{output} = \text{fraud})$ and $P(E \mid \text{output} = \text{legal}) \, P(\text{output} = \text{legal})$ need to be compared, since $P(E)$ is constant.

Consider the data in Table 1, which is a subset of an auto insurance database. Output is the attribute whose value is to be predicted. The instance E = (policyHolder = 1, driverRating = 0, reportFiled = 0.33) is to be classified as either fraud or legal.

$P(\text{fraud}) = d_i / d = 3/20 = 0.15$
$P(\text{legal}) = d_i / d = 17/20 = 0.85$

From step iii) of the algorithm:
$P(\text{policyHolder} = 1 \mid \text{output} = \text{fraud}) = 3/3 = 1$
$P(E \mid \text{output} = \text{fraud}) = \prod_{k=1}^{n} P(x_k \mid \text{output} = \text{fraud}) = 0$

From step iv) of the algorithm:
$P(\text{policyHolder} = 1 \mid \text{output} = \text{legal}) = 12/17 = 0.706$
$P(E \mid \text{output} = \text{legal}) = \prod_{k=1}^{n} P(x_k \mid \text{output} = \text{legal}) = 0.0068$

Therefore,
$P(E \mid \text{output} = \text{fraud}) \, P(\text{output} = \text{fraud}) = 0$
$P(E \mid \text{output} = \text{legal}) \, P(\text{output} = \text{legal}) = 0.0058$

Based on these probabilities, we classify the new tuple as legal. However, $P(E \mid \text{output} = \text{fraud})$ is always 0 here, because a single factor equal to 0 forces the whole product to 0. The Laplace estimator improves the estimate by adding 1 to the numerator and the total number of attribute value types to the denominator of each factor of $P(E \mid \text{output} = \text{fraud})$ and $P(E \mid \text{output} = \text{legal})$ [33].

With the Laplace estimator, step iii) of the algorithm gives:
$P(\text{policyHolder} = 1 \mid \text{output} = \text{fraud}) = (3 + 1)/(3 + 2) = 0.8$

and step iv) gives:
$P(\text{policyHolder} = 1 \mid \text{output} = \text{legal}) = (12 + 1)/(17 + 2) = 0.684$

$P(E \mid \text{output} = \text{fraud}) \, P(\text{output} = \text{fraud}) = 0.0026$
$P(E \mid \text{output} = \text{legal}) \, P(\text{output} = \text{legal}) = 0.0016$

Thus, instance E is more likely to be fraud.
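Steps i) to iv) and the Laplace estimator can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: it uses only the five labelled rows shown in Table 1a rather than the full 20-record dataset, so its numbers are not expected to match the figures in the text. The names `ROWS`, `class_score`, and `posteriors` are hypothetical, and taking "the total number of attribute value types" to mean the number of distinct values observed for an attribute in the training rows is an assumption.

```python
# Labelled rows of Table 1a: (policyHolder, driverRating, reportFiled, output).
ROWS = [
    (1, 0, 0, "legal"),
    (1, 1, 1, "fraud"),
    (0, 0, 0, "legal"),
    (1, 0.33, 1, "legal"),
    (1, 0.66, 0, "legal"),
]
E = (1, 0, 0.33)  # instance to classify


def class_score(rows, instance, cls, laplace=False):
    """Return P(E | output=cls) * P(output=cls) under attribute independence."""
    in_class = [r for r in rows if r[-1] == cls]
    score = len(in_class) / len(rows)  # prior d_i / d
    for k, value in enumerate(instance):
        matches = sum(1 for r in in_class if r[k] == value)  # d_ik
        total = len(in_class)  # d_i
        if laplace:
            # Assumption: "attribute value types" = distinct values of
            # attribute k observed in the training rows.
            matches += 1
            total += len({r[k] for r in rows})
        score *= matches / total
    return score


def posteriors(rows, instance, laplace=False):
    """Normalise the class scores by P(E), the sum over both classes."""
    scores = {c: class_score(rows, instance, c, laplace) for c in ("fraud", "legal")}
    evidence = sum(scores.values())
    if evidence == 0:
        return scores, None  # zero-frequency problem: nothing to normalise
    return scores, {c: s / evidence for c, s in scores.items()}


raw_scores, raw_post = posteriors(ROWS, E)
smooth_scores, smooth_post = posteriors(ROWS, E, laplace=True)
print(raw_scores, raw_post)
print(smooth_scores, smooth_post)
```

On these five rows the unsmoothed scores of both classes collapse to zero, which is the zero-frequency problem the Laplace estimator is meant to avoid; with smoothing the two classes can be compared again.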
Likelihood of being legal = 0.0351
Likelihood of being fraud = 0.050

We estimate $P(E)$ by summing these individual likelihood values, since E will be either legal or fraud:

$P(E) = 0.0351 + 0.050 = 0.0851$

Finally, we obtain the actual probabilities of each event:

$P(\text{output} = \text{legal} \mid E) = (0.039 \times 0.9) / 0.0851 = 0.412$
$P(\text{output} = \text{fraud} \mid E) = (0.500 \times 0.1) / 0.0851 = 0.588$

The Bayesian classifier can handle missing values in training datasets. To demonstrate this, seven missing values appear in the dataset. The Naïve Bayes approach is easy to use and only one scan of the data is required. The approach handles missing values by simply omitting the corresponding probability when calculating the likelihoods of membership in each class.

3.2 Decision Tree-Based Algorithm

Solving the classification problem is a two-step process: i) decision tree induction, which constructs a decision tree (DT), and ii) applying the DT to an instance to determine its class. Rules that are easy to interpret can be generated from the tree. The basic algorithm for a decision tree is as follows:

i) Suppose there are two outputs, fraud and legal. The tree starts as a single node N representing the dataset. If the instances are all of the same type, say fraud, then the node becomes a leaf and is labeled as fraud.

ii) Otherwise, the algorithm uses entropy, the Gini index, or the classification error to measure the degree of impurity for selecting the attribute that will best separate the data into individual classes.

iii) Entropy is calculated as the sum, over events, of the probability of an event ($p_i$) times the information required for the event ($b_i$). Note that $b_i = -\log_2 p_i$ in the case of a simple (binary) split into two classes.

$$\text{Entropy}(p_1, p_2, \ldots, p_n) = p_1 b_1 + p_2 b_2 + \cdots + p_n b_n = -p_1 \log_2 p_1 - p_2 \log_2 p_2 - \cdots - p_n \log_2 p_n$$

Table 1a. Data for Bayes Classifier

instance   PolicyHolder   DriverRating   ReportFiled   Output
1          1              0              0             legal
2          1              1              1             fraud
3          0              0              0             legal
4          1              0.33           1             legal
5          1              0.66           0             legal
E          1              0              0.33          ?
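The impurity measures named in step ii) can be sketched in Python as follows. Entropy follows the formula in step iii); the Gini index (one minus the sum of squared class probabilities) and the classification error (one minus the largest class probability) use their standard definitions, which is an assumption here since the text at this point only names them. The function names are hypothetical, and the example distribution is the class split of the five labelled rows in Table 1a (1 fraud, 4 legal).

```python
import math


def entropy(probs):
    """Entropy in bits: -sum(p * log2(p)), skipping zero probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def gini(probs):
    """Gini index: 1 - sum(p^2)."""
    return 1 - sum(p * p for p in probs)


def classification_error(probs):
    """Classification error: 1 - max(p)."""
    return 1 - max(probs)


table_1a = (1 / 5, 4 / 5)  # (fraud, legal) shares among the labelled rows
uniform = (0.5, 0.5)       # maximally impure two-class node
pure = (1.0,)              # single-class node

for name, dist in (("Table 1a", table_1a), ("uniform", uniform), ("pure", pure)):
    print(name, entropy(dist), gini(dist), classification_error(dist))
```

All three measures are zero for a single-class node and largest when the classes are equally likely, which is why any of them can serve as the splitting criterion.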

Table 1b. Data for Bayes Classifier

instance   PolicyHolder   DriverRating   ReportFiled   VehicleAgePrice   Output
1          1              0              0             0.33              legal
2          1              1              1             0.5               fraud
3          0              0              0             0.75              legal
4          1              0.33           1             0.5               legal
5          1              0.66           0             0.5               legal
E          1              0              0.33          0.5               ?

3.2.1 C4.5 Algorithm

The entropy, or expected information needed to classify a given instance, is:

$$\text{Entropy}(\text{fraud}, \text{legal}) = -\frac{\text{fraudInstances}}{\text{Instances}} \log_2 \frac{\text{fraudInstances}}{\text{Instances}} - \frac{\text{legalInstances}}{\text{Instances}} \log_2 \frac{\text{legalInstances}}{\text{Instances}}$$

The expected information, or entropy, by attribute $A$ sums over the values of $A$, weighting the entropy of each partition by the share of instances that fall into it:

$$E(A) = \sum \left[ \left( \frac{\text{fraudAttributes}}{\text{Instances}} + \frac{\text{legalAttributes}}{\text{Instances}} \right) \times \text{Entropy}(\text{fraudAttributes}, \text{legalAttributes}) \right]$$

iv) The value (or contribution to information) of an attribute is calculated as:

gain(attr) = (information before split) - (information after split)

The expected reduction in entropy is:

gain(attr) = (entropy of parent table) - E(A)

The algorithm computes the information gain of each attribute. The attribute with the highest information gain is selected as the test attribute.

v) A branch is created for each known value of the test attribute. The algorithm uses the same process iteratively to form a decision tree at each partition. Once an attribute has occurred at a node, it need not be considered in any of the node's descendants.

vi) The iterative partitioning stops when one of the following conditions is true: a) all examples for a given node belong to the same class, b) there are no remaining attributes on which samples may be further partitioned, or c) there are no samples for the branch test-attribute.
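The attribute selection in step iv) can be sketched in Python as follows. This is a minimal illustration on the five labelled rows of Table 1b only, not on the paper's full dataset, and it treats every attribute as categorical, which is a simplification. The names `ATTRS`, `ROWS`, `entropy`, and `information_gain` are hypothetical.

```python
import math
from collections import Counter, defaultdict

ATTRS = ["policyHolder", "driverRating", "reportFiled", "vehicleAgePrice"]
# Labelled rows of Table 1b: (attribute values, output).
ROWS = [
    ((1, 0, 0, 0.33), "legal"),
    ((1, 1, 1, 0.5), "fraud"),
    ((0, 0, 0, 0.75), "legal"),
    ((1, 0.33, 1, 0.5), "legal"),
    ((1, 0.66, 0, 0.5), "legal"),
]


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, k):
    """Entropy of the parent table minus the expected entropy E(A) after
    splitting on the k-th attribute."""
    parent = entropy([label for _, label in rows])
    partitions = defaultdict(list)
    for values, label in rows:
        partitions[values[k]].append(label)
    after = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return parent - after


gains = {name: information_gain(ROWS, k) for k, name in enumerate(ATTRS)}
best = max(gains, key=gains.get)  # attribute chosen as the test attribute
print(gains, best)
```

On these rows every value of driverRating isolates a single class, so that attribute has the highest gain and would be chosen as the test attribute at the root.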
3.2.2 Application

From Table 1b, the probability of each output class is prob(output = fraud) = 2/20 = 0.1 and prob(output = legal) = 0.9. The impurity measures are then:

$\text{entropy} = -0.1 \log_2(0.1) - 0.9 \log_2(0.9) = 0.1 \times 3.32 + 0.9 \times 0.152 = 0.469$

$\text{gini index} = 1 - \sum_j \text{prob}_j^2 = 1 - (0.1^2 + 0.9^2) = 0.18$

$\text{classification error} = 1 - \max_j \{\text{prob}_j\} = 1 - \max\{0.1, 0.9\} = 0.1$

The entropy, Gini index, and classification error of a single class are zero. They reach their maximum value when all the classes in the table have equal probability.

The expected entropy for the attribute VehicleAgePrice is:

$E(\text{vehicleAgePrice}) = (9/20) \, \text{entropy}(1, 8) = (9/20) \left( -\tfrac{1}{9} \log_2 \tfrac{1}{9} - \tfrac{8}{9} \log_2 \tfrac{8}{9} \right) = 0.225$

The information gain of attribute VehicleAgePrice is computed as follows:

$0.469 - \left[ (9/20) \left( -\tfrac{1}{9} \log_2 \tfrac{1}{9} - \tfrac{8}{9} \log_2 \tfrac{8}{9} \right) \right] = 0.244$

The attribute VehicleAgePrice has four values. Based on step v) of the C4.5 algorithm, a decision tree can be created. Each node is either i) a leaf node (output class), or ii) a decision node.

3.3 Rule Based Algorithm

One way to perform classification is to generate if-then rules.

3.3.1 Generating Rules from a Decision Tree

The following rules are generated for the decision tree:

If (driver_age ≤ 40) AND (driver_rating = 1) AND (vehicle_age = 2), then class = fraud
If (driver_age > 40) AND (driver_age ≤ 50) AND (driver_rating = 0.33), then class = legal

4. MODEL PERFORMANCE

Confusion Matrix

There are two ways to examine the performance of classifiers: i) a confusion matrix, and ii) a ROC graph. Given a class $C_j$ and a tuple $t_i$, that tuple may or may not be assigned to that class, while its actual membership may or may not be in that class. With two classes, there are four possible outcomes of the classification: i) true positives (hits), ii) false positives (false alarms), iii) true negatives (correct rejections), and iv) false negatives (misses). Table 2a contains information about actual and predicted classifications. Performance is evaluated using the data in the matrix. Table 2b shows a confusion matrix built on simulated data. The model commits some errors and has an accuracy of 78%.
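The four outcomes can be tallied from paired actual and predicted labels, as in the Python sketch below. The function name `confusion_counts` and the six example labels are hypothetical and are not taken from the paper's data; treating "legal" as the positive class is an assumption that follows the layout of Table 2a.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="legal"):
    """Count TP, FP, FN and TN for a two-class problem."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: a hit if truly positive, else a false alarm.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: a miss if truly positive, else a correct rejection.
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Hypothetical labels, for illustration only.
actual = ["legal", "legal", "fraud", "fraud", "legal", "fraud"]
predicted = ["legal", "fraud", "fraud", "legal", "legal", "fraud"]
counts = confusion_counts(actual, predicted)
print(dict(counts))
```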
We also applied the model to the same data, but to the negative class, with respect to class skew in the data. The quality of a model depends highly on the choice of the test data. A number of model performance metrics can be derived from the confusion matrix.

Table 2a. Confusion Matrix

                      Observed
                      legal    fraud
predicted   legal     TP       FP
            fraud     FN       TN

Table 2b. Confusion matrix of a model applied to test dataset

                      Observed
                      legal    fraud
predicted   legal     3100     1125
            fraud     395      2380

accuracy: 0.78    recall: 0.86    precision: 0.70

The accuracy determined in Table 2b may not be an adequate performance measure when the number of negative cases is much greater than the number of positive cases. Suppose there are 1500 cases, 1460 of which are negative and 40 of which are positive. If the system classifies them all as negative, the accuracy would be 97.3%, even though the classifier missed all positive cases. Other performance measures are the geometric mean (g-mean) and the F-measure. For calculating the F-measure, β has a value from 0 to infinity and is used to control the weight assigned to recall (TP) and precision (P). Any classifier evaluated using g-mean or F-measure will have a value of 0 if all positive cases are classified incorrectly.

To easily view and understand the output, visualization of the results is helpful. Naïve Bayesian visualization provides an interactive view of the prediction results. The attributes can be sorted by the predictor, and evidence items can be sorted by the number of items in their storage bin. Attribute column graphs help to find the significant attributes in neural networks. Decision tree visualization builds trees by splitting attributes from C4.5 classifiers. Cumulative gains and lift charts are visual aids for measuring model performance. Lift is a measure of a predictive model calculated as the ratio between the results obtained with and without the predictive model. For instance, if 5% of all samples are actually fraud and a naïve Bayesian classifier could correctly predict 20 fraud samples per 100 samples, then that corresponds to a lift of 4 (20% / 5%).

Table 3c: Performance metrics

Model performance metrics:
Accuracy (AC)
Recall or true positive rate (TP)
False positive rate (FP)
True negative rate (TN)
False negative rate (FN)
Precision (P)
Geometric mean (g-mean)
F-measure

Classification models are often evaluated on accuracy rates, error rates, false negative rates, and false positive rates.
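The metrics listed in Table 3c can be computed from the four confusion matrix cells, as in the Python sketch below. The formulas are the commonly used definitions and are an assumption here, since the table's own formulas are not reproduced in the text; in particular, g-mean is taken as the square root of (true positive rate × true negative rate), and the F-measure as (1 + β²)·P·R / (β²·P + R). The function name `metrics` is hypothetical. Cell roles follow the layout of Table 2a; recall and precision depend on which class is treated as positive, so the values computed this way need not coincide with those printed beside Table 2b.

```python
import math


def metrics(tp, fp, fn, tn, beta=1.0):
    """Derive common performance metrics from confusion matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0  # guard against empty denominators

    recall = ratio(tp, tp + fn)        # true positive rate
    precision = ratio(tp, tp + fp)
    tn_rate = ratio(tn, tn + fp)       # true negative rate
    b2 = beta * beta
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "recall": recall,
        "precision": precision,
        "fp_rate": ratio(fp, fp + tn),
        "tn_rate": tn_rate,
        "fn_rate": ratio(fn, fn + tp),
        "g_mean": math.sqrt(recall * tn_rate),
        "f_measure": ratio((1 + b2) * precision * recall, b2 * precision + recall),
    }


# Cells of Table 2b, read with the layout of Table 2a.
table_2b = metrics(tp=3100, fp=1125, fn=395, tn=2380)

# The skewed example from the text: 1460 negatives, 40 positives,
# and every case classified as negative.
all_negative = metrics(tp=0, fp=0, fn=40, tn=1460)

print(table_2b)
print(all_negative)
```

The second call illustrates the point made in the text: accuracy stays high on skewed data even when every positive case is missed, while g-mean and F-measure drop to zero.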
Table 3 shows that True Positives (hits) and False Positives (false alarms) incur a cost per investigation. False alarms are the most expensive because both investigation and claim costs are incurred. False Negatives (misses) and True Negatives (correct rejections) incur the cost of the claim.

Table 3: Cost/Benefit Decision Summary of Predictions

fraud:
True Positive (hit) cost = number of hits * average cost per investigation
False Negative (miss) cost = number of misses * average cost per claim

legal:
False Positive (false alarm) cost = number of false alarms * (average cost per investigation + average cost per claim)
True Negative (correct rejection) cost = number of correct rejection claims * average cost per claim

5. CONCLUSIONS

We studied the existing fraud detection systems. To predict and present fraud, we used the Naïve Bayesian classifier and decision tree-based algorithms. We looked at model performance metrics derived from the confusion matrix, such as accuracy, recall, and precision. It is robust with respect to class skew, making it a reliable performance metric in many important fraud detection application areas.

REFERENCES

[1] Bolton, R., Hand, D.: Statistical Fraud Detection: A Review. Statistical Science 17(3): 235-255 (2002).
[2] Sparrow, M. K.: Fraud Control in the Health Care Industry: Assessing the State of the Art, in Shichor et al (eds), Readings in White-Collar Crime, Waveland Press, Illinois (2002).
[3] Williams, G.: Evolutionary Hot Spots Data Mining: An Architecture for Exploring for Interesting Discoveries. In: 3rd Pacific-Asia Conference in Knowledge Discovery and Data Mining, Beijing, China (1999).
[4] Groth, R.: Data Mining: A Hands-on Approach for Business Professionals, Prentice Hall, pp. 209-212 (1998).

[5] Brockett, P., Derrig, R., Golden, L., Levine, A. & Alpert, M.: Fraud Classification using Principal Component Analysis of RIDITs. Journal of Risk and Insurance 69(3): 341-371 (2002).
[6] Chen, R., Chiu, M., Huang, Y., Chen, L.: Detecting Credit Card Fraud by Using Questionnaire-Responded Transaction Model Based on Support Vector Machines. In: IDEAL2004, 800-806 (2004).
[7] Brause, R., Langsdorf, T., Hepp, M.: Neural Data Mining for Credit Card Fraud Detection. In: 11th IEEE International Conference on Tools with Artificial Intelligence (1999).
[8] SAS, e-Intelligence Data Mining in the Insurance Industry: Solving Business Problems using SAS Enterprise Miner Software. White Paper (2000).
[9] Maes, S., Tuyls, K., Vanschoenwinkel, B. & Manderick, B.: Credit Card Fraud Detection using Bayesian and Neural Networks. Proc. of the 1st International NAISO Congress on Neuro Fuzzy Technologies (2002).
[10] Weatherford, M.: Mining for Fraud. In: IEEE Intelligent Systems (2002).
[11] Cahill, M., Chen, F., Lambert, D., Pinheiro, J. & Sun, D.: Detecting Fraud in the Real World. Handbook of Massive Datasets 911-930 (2002).
[12] Fawcett, T.: ROC graphs: Notes and practical considerations for researchers. Machine Learning, 3 (2004).
[13] Fawcett, T., Flach, P. A.: A response to Webb and Ting's On the application of ROC analysis to predict classification performance under varying class distributions. Machine Learning, 58(1), 33-38 (2005).
[14] Ormerod, T., Morley, N., Ball, L., Langley, C., Spenser, C.: Using Ethnography To Design a Mass Detection Tool (MDT) for the Early Discovery of Insurance Fraud. Computer Human Interaction, Ft. Lauderdale, Florida (2003).
[15] Lin, J., Hwang, M., Becker, J.: A Fuzzy Neural Network for Assessing the Risk of Fraudulent Financial Reporting. J. of Managerial Auditing, 18(8), 657-665 (2003).
[16] Kim, H., Pang, S., Je, H., Kim, D. & Bang, S.: Constructing Support Vector Machine Ensemble. Pattern Recognition 36: 2757-2767 (2003). Kim, J., Ong, A. & Overill, R.: Design of an Artificial Immune System as a Novel Anomaly Detector for Combating Financial Fraud in Retail Sector. Congress on Evolutionary Computation (2003).
[17] Bentley, P., Kim, J., Jung, G., Choi, J.: Fuzzy Darwinian Detection of Credit Card Fraud. In: 14th Annual Fall Symposium of the Korean Information Processing Society (2000).
[18] Little, B., Johnston, W., Lovell, A., Rejesus, R. & Steed, S.: Collusion in the US Crop Insurance Program: Applied Data Mining. Proc. of SIGKDD02, 594-598 (2002).
[19] Viaene, S., Derrig, R., Dedene, G.: A Case Study of Applying Boosting Naive Bayes to Claim Fraud Diagnosis. In: IEEE Transactions on Knowledge and Data Engineering, 16(5), 612-620 (2004).
[20] Yamanishi, K., Takeuchi, J., Williams, G., Milne, P.: On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms. Data Mining and Knowledge Discovery, 8, 275-300 (2004).
[21] Phua, C., Alahakoon, D., Lee, V.: Minority Report in Fraud Detection: Classification of Skewed Data. In: SIGKDD Explorations, 6(1), 50-59 (2004).
[22] Foster, D. & Stine, R.: Variable Selection in Data Mining: Building a Predictive Model for Bankruptcy. J. of American Statistical Association 99, 303-313 (2004).
[23] Barse, E., Kvarnstrom, H., Jonsson, E.: Synthesizing Test Data for Fraud Detection Systems. In: 19th Annual Computer Security Applications Conference, 384-395 (2003).
[24] McGibney, J., Hearne, S.: An Approach to Rules-based Fraud Management in Emerging Converged Networks. In: IEI/IEEE ITSRS (2003).
[25] Bhargava, B., Zhong, Y., Lu, Y.: Fraud Formalization and Detection. In: DaWaK2003, 330-339 (2003).
[26] Sherman, E.: Fighting Web Fraud. Newsweek, June 10 (2002).
[27] Bonchi, F., Giannotti, F., Mainetto, G., Pedreschi, D.: A Classification-based Methodology for Planning Auditing Strategies in Fraud Detection. In: SIGKDD99, 175-184 (1999).
[28] Shao, H., Zhao, H., Chang, G.: Applying Data Mining to Detect Fraud Behavior in Customs Declaration. In: 1st International Conference on Machine Learning and Cybernetics, 1241-1244 (2002).
[29] Feelders, A. J.: Statistical Concepts. Berthold M. and Hand D. (eds), Intelligent Data Analysis, Springer-Verlag, Berlin, Germany, pp. 17-68 (2003).

[30] Mena, J.: Data Mining for Homeland Security. Executive Briefing, VA (2003). Mena, J.: Investigative Data Mining for Security and Criminal Detection, Butterworth Heinemann, MA (2003).
[31] SPSS: Data Mining and Crime Analysis in the Richmond Police Department, White Paper, Virginia (2003).
[32] Quinlan, J. R.: C4.5 Programs for Machine Learning, Morgan Kaufmann, CA, USA (1993).
[33] Witten, I., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Morgan Kaufmann (2005).
[31] James, F.: FBI has eye on business databases. Chicago Tribune, Knight Ridder/Tribune Business News (2002).