Statistics in Retail Finance. Chapter 7: Fraud Detection in Retail Credit

Overview >
Detection of fraud remains an important issue in retail credit. Methods similar to scorecard development may be employed, but some problems are specific to this application area. In this chapter we discuss:-
- Types of fraud and size of the problem.
- Automated fraud detection.
- Two-class and one-class classifiers for fraud detection.
- Parzen density estimation.
- Evaluation issues for fraud detection.

References >
There is not much published material on fraud detection in retail finance. The following sources should be useful.
- Fraud The Facts (2012). Financial Fraud Action UK report.
- Anderson R (2007). The Credit Scoring Toolkit: theory and practice for retail credit risk management and decision automation. NY: OUP.
- Hit 'em where it hurts: Using analytics to lock up fraudsters. SAS white paper, 2012.
- Dorronsoro JR, Ginel F, Sanchez C and Santa Cruz C. Neural fraud detection in credit card operations. IEEE Transactions on Neural Networks, Vol. 8, no. 4, July.
- Juszczak P, Adams NM, Hand DJ, Whitrow C, Weston DJ. Off-the-peg and bespoke classifiers for fraud detection. Computational Statistics and Data Analysis 52 (2008).

Types of fraud >
- Theft fraud. A credit card is physically stolen or lost and used by someone other than the card holder.
- Card mail non-receipt fraud. A type of theft in which the card is intercepted before the genuine card holder receives it.
- Counterfeit fraud. A credit card is physically faked and used.
- Application fraud. An individual applies for credit deliberately using false information.
- Bankruptcy fraud. A person receives and uses credit knowing that they will become personally bankrupt in the future.

- Behavioural fraud / Card-not-present (CNP) fraud. Credit card details are taken and used remotely by someone other than the card holder. Common in telephone sales, internet commerce and mail order.
[Figure: example of real fraud]

Cost and detection of fraud >
The loss due to credit card fraud increases strongly with the length of time between the start of the fraud and the point at which it is detected and the credit is stopped.
When is fraud detected?
- For stolen or lost cards, a card can be stopped as soon as it is reported missing.
- For application and bankruptcy fraud, a problem may only become apparent when payments become due and are not met. For a personal loan, the whole amount could be lost.
- Counterfeit and behavioural fraud may only be detected when a customer spots an anomalous transaction on their account statement and reports it to the bank.
Analytic methods in banks can be used to detect fraudulent behaviour.

Size of the fraud problem >
[Figure: cost of retail credit fraud in the UK, 2001 to 2011, in millions, by category: mail non-receipt, card ID theft, lost/stolen, counterfeit, card-not-present. Source: FFA UK (2012).]
Note: In 2004, chip-and-pin was introduced and this has been quoted as part of the reason for the reduction in fraud losses from …

Automated fraud detection >
Automated methods are applied to detect behavioural fraud. The main issue is the timeliness of detection: the aim is to shorten the time during which the fraud operates. Automated methods usually generate fraud alerts that are followed up manually. Not all fraud alerts turn out to be genuine fraud; many are false alarms. This is a classification problem: distinguishing legitimate transactions ($y = 1$) from fraudulent transactions ($y = 0$).

Special considerations for fraud detection >
There are some special problems for fraud detection:
1. Need to process millions of transactions in real time.
2. Highly imbalanced classification problem. The ratio of fraudulent to legitimate transactions is typically very small.
3. Nature of fraud is reflexive. That is, fraudsters adapt to the detection methods applied by banks to stop them.
However, unlike application model development, there is less need to build an explanatory model, so complex structured non-linear models can be considered.

Automated fraud detection methods >
There are four categories of methods:-
1. Business rules
2. Predictive models
3. Anomaly detection
4. Social network analysis

Method 1: Business rules >
The simplest approach is to encode expert knowledge of fraudulent behaviour as rules in a computer-based expert system. A typical rule is:-
Generate a fraud alert if a credit card is used abroad and it has not been used in that country in the past year and the credit card holder has not told the bank they will be visiting that country.
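The rule above can be sketched as a small R function. This is only an illustration: the function name `abroad_rule` and all of its arguments are hypothetical, and a production rule engine would read these fields from account and transaction records.

```r
# Hypothetical encoding of the "card used abroad" business rule.
# All argument names are illustrative assumptions, not a real bank schema.
abroad_rule <- function(txn_country, home_country,
                        countries_last_year, notified_countries) {
  used_abroad  <- txn_country != home_country
  not_seen     <- !(txn_country %in% countries_last_year)
  not_notified <- !(txn_country %in% notified_countries)
  # Alert only when all three conditions of the rule hold
  used_abroad && not_seen && not_notified
}

# Card from GB used in FR, never used there, no travel notice: alert
alert_1 <- abroad_rule("FR", "GB", c("GB", "ES"), character(0))
# Card used in ES, where it was already used in the past year: no alert
alert_2 <- abroad_rule("ES", "GB", c("GB", "ES"), character(0))
# Card holder told the bank about the trip to FR: no alert
alert_3 <- abroad_rule("FR", "GB", c("GB"), "FR")
```

Keeping each condition as a named logical makes it easy to audit which part of a rule triggered an alert.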

Method 2: Predictive models >
We treat fraud detection as a classification problem and use a two-class classifier. The result is a fraud scorecard. By convention, low fraud scores indicate a higher level of fraud risk and high scores a lower level.
- Choose a classifier based on a model with functional form $f$, such that the fraud score is $s = f(x; \theta)$ for a transaction $x$ and some model parameters $\theta$.
- Estimate $\theta$ based on training data of past transactions that included fraud.

To deal with the high imbalance between classes, a simple filter can be applied first to detect and remove obviously legitimate transactions, which increases the ratio of fraudulent to legitimate transactions in the training data.
o For example, inactive accounts and low-value or repeated transactions could be removed.
Research results and past experience show that models based on linear combinations of predictor variables, such as OLS and logistic regression, are not sufficient. Non-linear classifiers such as artificial neural networks (ANN) are effective and used in practice (eg SAS fraud tools). ANNs are outside the scope of this course.
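The effect of such a pre-filter can be illustrated on simulated data. In this sketch the data, the column names and the cut-off of 5 currency units are all assumptions, and fraud is simulated so that it never occurs among the filtered-out transactions; real filters would lose some frauds.

```r
set.seed(1)
n <- 100000

# Simulated transactions (assumption: no fraud on inactive accounts
# or on transactions below 5 currency units)
active  <- runif(n) < 0.9
amount  <- rexp(n, rate = 1 / 40)
p_fraud <- ifelse(active & amount >= 5, 0.002, 0)
fraud   <- rbinom(n, size = 1, prob = p_fraud)
txn     <- data.frame(active, amount, fraud)

# Simple filter: drop obviously legitimate transactions
keep     <- txn$active & txn$amount >= 5
filtered <- txn[keep, ]

# Ratio of fraudulent to legitimate transactions, before and after
ratio_before <- sum(txn$fraud) / sum(txn$fraud == 0)
ratio_after  <- sum(filtered$fraud) / sum(filtered$fraud == 0)
```

The filter removes only legitimate rows here, so the number of frauds is unchanged while the ratio of fraudulent to legitimate transactions rises.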

We can expect good results for types of fraud that are the same as those in the training data, because the two-class classifier is a model of the fraudulent behaviour observed. However, it is not expected to perform well if new types of fraud emerge over time, since they will not have been modelled.

Method 3: Anomaly detection >
An alternative to predictive modelling is to model only the legitimate transactions and then report anomalies in new cases as potentially fraudulent transactions.
- This method has the advantage that fraud is not explicitly modelled, so in principle it should adapt to new types of fraud that emerge.
- Additionally, the highly unbalanced nature of the data is not a problem, since the model is based only on the legitimate transactions.
- The one major problem is that it will not be sensitive to frauds which appear very similar to legitimate transactions.
One-class classifiers are used to build a model of legitimate transactions. Typically these work by modelling the probability density function (PDF) over the predictor variables for legitimate transactions. In this chapter we will use the common Parzen density estimator.

Anomaly detection process >
A typical anomaly detection process is as follows:-
1. Use an outlier detector to remove extreme cases from the training data (these may be errors, genuine outliers or fraudulent transactions).
2. Let $(x_1, \ldots, x_n)$ be a training sequence of legitimate transactions (with outliers removed).
3. Denote the outcome by $y \in \{0, 1\}$, where 1 denotes a legitimate transaction and 0 a fraudulent one.
4. Estimate the PDF $\hat{p}(x; h)$, where $h$ is an estimation parameter.
5. A classification decision on a new observation $x$ is made as $\hat{y} = I(\hat{p}(x; h) > t)$ for some threshold on the density, $t$.

The threshold $t$ can be set based on the (sensible) strategy of controlling the fraction $\alpha$ of legitimate cases to be classified as anomalous, based on training data. This controls the false alert rate and can also be informed by how many alerts can be followed up manually, which is constrained by business resources (eg how many staff are employed to do follow-up). We write this as the optimization task
$$\min t \quad \text{subject to} \quad \frac{1}{n} \sum_{i=1}^{n} I(\hat{p}(x_i; h) > t) \le (1 - \alpha).$$
Note: The inequality is used here only for cases where the sum does not give an exact value of $(1 - \alpha)$. Because $t$ is minimized, the sum always gives a value as close to $(1 - \alpha)$ as possible.
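The threshold choice can be sketched in R on simulated one-dimensional data. This is a minimal illustration under stated assumptions: the training data, the bandwidth of 0.3, the target fraction `alpha` and the helper names `p_hat` and `is_legit` are all hypothetical, the outlier-removal step is skipped, and the density is taken from R's built-in `density` function with linear interpolation.

```r
set.seed(42)
train <- rnorm(500)   # simulated legitimate transactions, one predictor
alpha <- 0.05         # target fraction of legitimate cases to flag

# Density estimate on a grid, interpolated to arbitrary points;
# points outside the grid are given density 0
d <- density(train, bw = 0.3)
p_hat <- function(x) approx(d$x, d$y, xout = x, yleft = 0, yright = 0)$y

# Smallest threshold that flags at least a fraction alpha of the
# training cases: the k-th smallest training density, k = ceiling(alpha * n)
p_train   <- p_hat(train)
k         <- ceiling(alpha * length(train))
threshold <- sort(p_train)[k]

# Classify as legitimate when the estimated density exceeds the threshold
is_legit <- function(x) p_hat(x) > threshold

false_alert_rate <- mean(!is_legit(train))
typical_ok <- is_legit(0)   # a point in the bulk of the training data
extreme_ok <- is_legit(6)   # a point far from all training data
```

Because the threshold is one of the observed training densities, the training false alert rate lands on the nearest achievable value at or just above `alpha`.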

Parzen density estimator >
We could base the estimate on just the empirical frequency, but
1. This only works for univariate data and
2. It is a somewhat crude estimator of the underlying PDF:
$$\hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} I(x_i = x).$$
Instead we use a Parzen estimator that smooths over a multivariate sample to generate a distribution:
$$\hat{p}(x; h) = \frac{1}{n h^m} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$
where $K$ is some kernel which is symmetric, $K(u) = K(-u)$, and integrates to 1, $\int K(u)\,du = 1$, $h$ is a bandwidth parameter and $m$ is the dimensionality of $x$ (ie the number of predictor variables).

For any point $x$ in the variable space, each value $x_i$ in the training sequence contributes to the estimate, but its contribution is weighted by its distance from $x$, given by $x - x_i$. The bandwidth $h$ controls the scaling of that distance within the kernel function. A typical kernel function is the multivariate normal distribution:
$$K(u) = (2\pi)^{-m/2} \exp\left(-\tfrac{1}{2} u^{\top} u\right).$$
In the R statistical language, the function density implements Parzen density estimation.
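A Parzen estimate with a multivariate normal kernel can be written directly in a few lines of R. The function name `parzen_mv` and the simulated data are assumptions made for this sketch. In one dimension the result should agree closely with `density` when `bw` is set to the same bandwidth, since `density` uses a Gaussian kernel by default (it evaluates the estimate on a grid using a binned approximation, so agreement is close rather than exact).

```r
# Parzen estimate at a single point x (numeric vector of length m),
# given an n x m matrix of training points and bandwidth h.
parzen_mv <- function(x, train, h) {
  m <- ncol(train)
  u <- sweep(train, 2, x) / h   # rows are (x_i - x) / h; kernel is symmetric
  k <- (2 * pi)^(-m / 2) * exp(-0.5 * rowSums(u^2))
  mean(k) / h^m
}

set.seed(3)
h <- 0.5

# One-dimensional check against R's built-in density()
x1     <- c(rnorm(100, -2, 1), rnorm(100, 2, 1))
train1 <- matrix(x1, ncol = 1)
d      <- density(x1, bw = h)
manual <- sapply(d$x, function(x0) parzen_mv(x0, train1, h))
max_diff <- max(abs(manual - d$y))

# Two-dimensional use: density near the data versus far from it
train2   <- cbind(rnorm(200), rnorm(200))
p_centre <- parzen_mv(c(0, 0), train2, h)
p_far    <- parzen_mv(c(5, 5), train2, h)
```

A single scalar bandwidth is used for all dimensions, as in the formula above; in practice the predictors would usually be standardised first so that one bandwidth is reasonable.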

Exercise 9.1. Prove that ( )

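Example 9.1. This R code demonstrates Parzen density estimation and the effect of the bandwidth. The example simulates 200 observations from a mixture of two normal distributions.
    # 100 draws from N(-2, 1) and 100 draws from N(2, 1)
    x <- c(rnorm(100, -2, 1), rnorm(100, 2, 1))
    # 2 x 2 grid: histogram plus three density estimates
    par(mfrow = c(2, 2))
    hist(x)
    # bw is the kernel bandwidth: small values undersmooth, large values oversmooth
    plot(density(x, bw = 0.1), main = "density estimate")
    plot(density(x, bw = 0.5), main = "density estimate")
    plot(density(x, bw = 1.5), main = "density estimate")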

The following output is produced:
[Figure: histogram of x and three density estimates, with bandwidths 0.1, 0.5 and 1.5.]

Method 4: Social network analysis >
Very recently banks have been accessing publicly available social network data. This allows them to identify transactions that have some association with other individuals or accounts that are known to be fraudulent or suspect. Such an association would reduce the fraud score of these transactions (recall that a lower score indicates higher fraud risk). Statistical methods that are evolving to deal with this data:-
o Social network analysis,
o Dynamic network analysis.
This is a very new area and we will not investigate these topics further in this course.

Available data for fraud detection >
- Accounts data. Including type of account, application details and aggregate behavioural characteristics.
- Transaction data. Including spending and repayment patterns.
- Personal data. Data the bank has about the person holding the account, some of which may have been provided by a credit bureau.
- Location data. Information about where the transaction was performed and where the borrower lives.

Evaluation >
Although fraud detection is essentially a classification problem, it has some characteristics that make evaluation of performance slightly different:
1. The timeliness of detection has an effect on the cost of the fraud.
2. The cost of monitoring automated fraud alerts is important.
3. It is necessary to ensure false alerts are kept to a minimum in order not to upset or alienate legitimate customers.
At the moment there is no clear agreement about the best performance measure. As with scorecard development, measures are typically based on the two CDFs
$$F_y(c) = P(S \le c \mid Y = y)$$
for some fraud score $S$ (remember a lower value means more risk of fraud) and for each outcome $y \in \{0, 1\}$ (remember $y = 1$ means legitimate).

Thus, plotting $F_0(c)$ against $F_1(c)$ gives the receiver operating characteristic (ROC) curve, with the area under the ROC curve (AUC) as a classification performance measure:
$$\text{AUC} = \int F_0(c)\, dF_1(c).$$
However, the ROC curve and AUC do not take into account the special points (1) to (3) given above. We consider a measure based on these terms:
- The false alarm rate is given by $F_1(c)$.
- The undetected fraud rate is given by $1 - F_0(c)$.
- The alert rate, which is linked to the monitoring cost, is $F(c) = P(S \le c)$.
Notice that $F(c) = F_0(c) P(Y = 0) + F_1(c) P(Y = 1)$.
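These quantities can be computed from empirical CDFs. The sketch below uses simulated fraud scores; the score distributions, the 1% fraud rate and the cut-off of -1.5 are assumptions chosen only for illustration.

```r
set.seed(7)
n <- 20000
y <- rbinom(n, size = 1, prob = 0.99)          # 1 = legitimate, 0 = fraudulent
s <- rnorm(n, mean = ifelse(y == 1, 0, -2))    # fraud score: lower = riskier

F0   <- ecdf(s[y == 0])   # CDF of scores for fraudulent transactions
F1   <- ecdf(s[y == 1])   # CDF of scores for legitimate transactions
Fall <- ecdf(s)           # CDF of scores over all transactions
p0   <- mean(y == 0)
p1   <- mean(y == 1)

# An alert is raised when the score is at or below the cut-off
cutoff <- -1.5
false_alarm_rate      <- F1(cutoff)
undetected_fraud_rate <- 1 - F0(cutoff)
alert_rate            <- Fall(cutoff)

# Empirical AUC: the integral of F0 with respect to F1 is the
# average of F0 evaluated at the legitimate scores
auc <- mean(F0(s[y == 1]))
```

With empirical CDFs and empirical class proportions, the decomposition of the alert rate into its fraud and legitimate parts holds exactly, which gives a useful sanity check on an implementation.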

Performance curve >
The performance curve is an alternative to the ROC curve. Plot the alert rate $F(c)$ against the undetected fraud rate $1 - F_0(c)$.
o This plots monitoring cost (point 2) against the proportion of frauds not detected.
o Also, since $F(c) \ge F_1(c) P(Y = 1)$ and $P(Y = 1) \approx 1$, this also shows some control on false alarms (point 3).
The point $(0, P(Y = 0))$ is the perfect performance: all frauds detected at the minimal possible cost.
The line must pass through $(1, 0)$: no frauds are detected when no detection is performed.
The performance given by a random classifier is where $F_0(c) = F(c)$. Hence this is the diagonal from (0,1) to (1,0).

Best performance is given by curves below this line, so, unlike AUC, the area under the performance curve is a penalty measure (smaller is better):
$$\int F(c)\, dF_0(c).$$
The x-axis is called a timeline since it captures an aspect of detection over time (point 1).
o Basically as frauds are detected this increases the proportion of undetected frauds left in the data, so over time we expect to move along the x-axis.
o This is similar to performance curves in engineering (eg stress versus performance curves).
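A performance curve and its area can be sketched on simulated scores as follows. The simulated data and variable names are assumptions, and the area is computed as the average of the overall score CDF evaluated at the fraudulent scores, which is the empirical version of integrating the alert rate with respect to the fraud score CDF.

```r
set.seed(11)
n <- 20000
y <- rbinom(n, size = 1, prob = 0.99)          # 1 = legitimate, 0 = fraudulent
s <- rnorm(n, mean = ifelse(y == 1, 0, -2))    # informative score: lower = riskier
s_random <- runif(n)                           # uninformative score for comparison

perf_area <- function(score, y) {
  Fall <- ecdf(score)
  # Empirical integral of F(c) dF0(c): average alert rate at the fraud scores
  mean(Fall(score[y == 0]))
}

area_model  <- perf_area(s, y)
area_random <- perf_area(s_random, y)

# Points on the curve: x = undetected fraud rate, y = alert rate
cuts   <- sort(unique(s))
x_axis <- 1 - ecdf(s[y == 0])(cuts)
y_axis <- ecdf(s)(cuts)
plot(x_axis, y_axis, type = "l",
     xlab = "undetected fraud rate", ylab = "alert rate",
     main = "Performance curve")
abline(a = 1, b = -1, lty = 2)   # random classifier: diagonal from (0,1) to (1,0)
```

The uninformative score should give an area near 0.5, matching the diagonal, while the informative score should give a much smaller area.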

Cost-based evaluation >
The financial cost of fraud can be estimated directly, based on the history of past fraud or on the total exposure of the account at the time of fraud. This relies on past accounting data for those cases that have been correctly detected in the past.

Example 9.2
This example compares a one-class classifier, based on the Parzen density estimator, with a two-class classifier, using the performance curve as the evaluation method. Based on Juszczak et al (2008).
Data set: 11,383 accounts with 646,729 transactions, of which 3,217 accounts (28.3%) and 18,501 transactions (2.9%) are fraudulent. Transaction records cover a 6 month period.
Use the Parzen density estimator as the one-class classifier.

Outcome of model build and test on hold-out sample:-
[Figure: performance curve, with axes labelled F(c) and F0(c).]
Now consider forecasts over time, in comparison with a comparable two-class classifier (in this case a density-based Parzen classifier).

Fixing ( ) = 0.2 and plotting cost against months forecast ahead:
[Figure: cost F(c) against months ahead, for the one-class and two-class classifiers.]
This shows that initially the two-class classifier gives slightly better performance. However, its performance deteriorates over time in comparison with the one-class classifier, which is more robust. Our hypothesis is that the two-class classifier is not sensitive to new types of fraud.

Exercise 9.2 Suppose and ( ) { ( ) for { }. Let ( ) be a sequence of instances of, which correspond to legitimate transactions.
1. Show that is a kernel function for Parzen density estimation for random variable with bandwidth.
2. Using, compute the threshold that gives a false positive rate up to.

Review of Chapter 9 >
In this chapter we have investigated:-
- Types of fraud and size of the problem.
- Automated fraud detection.
- Two-class and one-class classifiers for fraud detection.
- Parzen density estimation.
- Evaluation issues for fraud detection.

Fraud - Consequences of Cutting Edge Solutions

Fraud - Consequences of Cutting Edge Solutions Detection using Peer Group analysis David Weston, Niall Adams, David Hand, Christopher Whitrow, Piotr Juszczak 19 September, 2007 19/09/07 1 / 69 EPSRC Think Crime Peer Group Crime Prevention & Detection

More information

Plastic Card Fraud Detection using Peer Group analysis

Plastic Card Fraud Detection using Peer Group analysis Plastic Card Fraud Detection using Peer Group analysis David Weston, Niall Adams, David Hand, Christopher Whitrow, Piotr Juszczak 29 August, 2007 29/08/07 1 / 54 EPSRC Think Crime Peer Group - Peer Group

More information

Intrusion Detection via Machine Learning for SCADA System Protection

Intrusion Detection via Machine Learning for SCADA System Protection Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. s.l.yasakethu@surrey.ac.uk J. Jiang Department

More information

Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance. Chapter 6: Behavioural models Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

More information

Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance. Chapter 2: Statistical models of default Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

More information

Fraud Detection for Online Retail using Random Forests

Fraud Detection for Online Retail using Random Forests Fraud Detection for Online Retail using Random Forests Eric Altendorf, Peter Brende, Josh Daniel, Laurent Lessard Abstract As online commerce becomes more common, fraud is an increasingly important concern.

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

An effective approach to preventing application fraud. Experian Fraud Analytics

An effective approach to preventing application fraud. Experian Fraud Analytics An effective approach to preventing application fraud Experian Fraud Analytics The growing threat of application fraud Fraud attacks are increasing across the world Application fraud is a rapidly growing

More information

Gerry Hobbs, Department of Statistics, West Virginia University

Gerry Hobbs, Department of Statistics, West Virginia University Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

More information

SAS Fraud Framework for Banking

SAS Fraud Framework for Banking SAS Fraud Framework for Banking Including Social Network Analysis John C. Brocklebank, Ph.D. Vice President, SAS Solutions OnDemand Advanced Analytics Lab SAS Fraud Framework for Banking Agenda Introduction

More information

POINT OF SALE FRAUD PREVENTION

POINT OF SALE FRAUD PREVENTION 2002 IEEE Systems and Information Design Symposium University of Virginia POINT OF SALE FRAUD PREVENTION Student team: Jeff Grossman, Dawn Herndon, Andrew Kaplan, Mark Michalski, Carlton Pate Faculty Advisors:

More information

How to Design Better Financial Regulation COST-SENSITIVE CLASSIFIERS AND RIA: CREDIT FRAUD DETECTION CASE STUDY. Pietro Scabellone

How to Design Better Financial Regulation COST-SENSITIVE CLASSIFIERS AND RIA: CREDIT FRAUD DETECTION CASE STUDY. Pietro Scabellone How to Design Better Financial Regulation COST-SENSITIVE CLASSIFIERS AND RIA: CREDIT FRAUD DETECTION CASE STUDY Pietro Scabellone Ljubljana, September 12-14, 2007 ABSTRACT Classification methods are of

More information

Dan French Founder & CEO, Consider Solutions

Dan French Founder & CEO, Consider Solutions Dan French Founder & CEO, Consider Solutions CONSIDER SOLUTIONS Mission Solutions for World Class Finance Footprint Financial Control & Compliance Risk Assurance Process Optimization CLIENTS CONTEXT The

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Despite its emphasis on credit-scoring/rating model validation,

Despite its emphasis on credit-scoring/rating model validation, RETAIL RISK MANAGEMENT Empirical Validation of Retail Always a good idea, development of a systematic, enterprise-wide method to continuously validate credit-scoring/rating models nonetheless received

More information

Credit Risk Models. August 24 26, 2010

Credit Risk Models. August 24 26, 2010 Credit Risk Models August 24 26, 2010 AGENDA 1 st Case Study : Credit Rating Model Borrowers and Factoring (Accounts Receivable Financing) pages 3 10 2 nd Case Study : Credit Scoring Model Automobile Leasing

More information

Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

More information

Ira J. Haimowitz Henry Schwarz

Ira J. Haimowitz Henry Schwarz From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Clustering and Prediction for Credit Line Optimization Ira J. Haimowitz Henry Schwarz General

More information

MHI3000 Big Data Analytics for Health Care Final Project Report

MHI3000 Big Data Analytics for Health Care Final Project Report MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given

More information

Soft Computing Tools in Credit card fraud & Detection Rashmi G.Dukhi G.H.Raisoni Institute of Information & Technology, Nagpur rashmidukhi25@gmail.

Soft Computing Tools in Credit card fraud & Detection Rashmi G.Dukhi G.H.Raisoni Institute of Information & Technology, Nagpur rashmidukhi25@gmail. Soft Computing Tools in Credit card fraud & Detection Rashmi G.Dukhi G.H.Raisoni Institute of Information & Technology, Nagpur rashmidukhi25@gmail.com Abstract Fraud is one of the major ethical issues

More information

DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM

DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM Journal of Engineering Science and Technology Vol. 6, No. 3 (2011) 311-322 School of Engineering, Taylor s University DATA MINING APPLICATION IN CREDIT CARD FRAUD DETECTION SYSTEM FRANCISCA NONYELUM OGWUELEKA

More information

Prediction of Stock Performance Using Analytical Techniques

Prediction of Stock Performance Using Analytical Techniques 136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

More information

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

INTRODUCTION TO RATING MODELS

INTRODUCTION TO RATING MODELS INTRODUCTION TO RATING MODELS Dr. Daniel Straumann Credit Suisse Credit Portfolio Analytics Zurich, May 26, 2005 May 26, 2005 / Daniel Straumann Slide 2 Motivation Due to the Basle II Accord every bank

More information

Unsupervised Profiling Methods for Fraud Detection

Unsupervised Profiling Methods for Fraud Detection Unsupervised Profiling Methods for Fraud Detection Richard J. Bolton and David J. Hand Department of Mathematics Imperial College London {r.bolton, d.j.hand}@ic.ac.uk Abstract Credit card fraud falls broadly

More information

Data Mining. Nonlinear Classification

Data Mining. Nonlinear Classification Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

More information

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios By: Michael Banasiak & By: Daniel Tantum, Ph.D. What Are Statistical Based Behavior Scoring Models And How Are

More information

Adaptive Anomaly Detection for Network Security

Adaptive Anomaly Detection for Network Security International Journal of Computer and Internet Security. ISSN 0974-2247 Volume 5, Number 1 (2013), pp. 1-9 International Research Publication House http://www.irphouse.com Adaptive Anomaly Detection for

More information

Credit Card Fraud Detection Using Self Organised Map

Credit Card Fraud Detection Using Self Organised Map International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1343-1348 International Research Publications House http://www. irphouse.com Credit Card Fraud

More information

Local outlier detection in data forensics: data mining approach to flag unusual schools

Local outlier detection in data forensics: data mining approach to flag unusual schools Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential

More information

II. Methods - 2 - X (i.e. if the system is convective or not). Y = 1 X ). Usually, given these estimates, an

II. Methods - 2 - X (i.e. if the system is convective or not). Y = 1 X ). Usually, given these estimates, an STORMS PREDICTION: LOGISTIC REGRESSION VS RANDOM FOREST FOR UNBALANCED DATA Anne Ruiz-Gazen Institut de Mathématiques de Toulouse and Gremaq, Université Toulouse I, France Nathalie Villa Institut de Mathématiques

More information

A Statistical Method for Profiling Network Traffic

A Statistical Method for Profiling Network Traffic THE ADVANCED COMPUTING SYSTEMS ASSOCIATION The following paper was originally published in the Proceedings of the Workshop on Intrusion Detection and Network Monitoring Santa Clara, California, USA, April

More information

Uncovering More Insurance Fraud with Predictive Analytics Strategies for Improving Results and Reducing Losses

Uncovering More Insurance Fraud with Predictive Analytics Strategies for Improving Results and Reducing Losses white paper Uncovering More Insurance Fraud with Predictive Analytics Strategies for Improving Results and Reducing Losses April 2012 Summary Predictive analytics are a powerful tool for detecting more

More information

Knowledge Discovery in Stock Market Data

Knowledge Discovery in Stock Market Data Knowledge Discovery in Stock Market Data Alfred Ultsch and Hermann Locarek-Junge Abstract This work presents the results of a Data Mining and Knowledge Discovery approach on data from the stock markets

More information

Detecting Credit Card Fraud by Decision Trees and Support Vector Machines

Detecting Credit Card Fraud by Decision Trees and Support Vector Machines Detecting Credit Card Fraud by Decision Trees and Support Vector Machines Y. Sahin and E. Duman Abstract With the developments in the Information Technology and improvements in the communication channels,

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

The State of Play in Cyber Payments Fraud Improving Security for Online & Card Not Present Transactions

The State of Play in Cyber Payments Fraud Improving Security for Online & Card Not Present Transactions The State of Play in Cyber Payments Fraud Improving Security for Online & Card Not Present Transactions Mark Greene, Ph.D CEO, FICO Federal Reserve Bank of Chicago 26 September 2011 Cybercrime Costs 431

More information

A CHASE PAYMENTECH WHITE PAPER. Expanding internationally: Strategies to combat online fraud

A CHASE PAYMENTECH WHITE PAPER. Expanding internationally: Strategies to combat online fraud A CHASE PAYMENTECH WHITE PAPER Expanding internationally: Strategies to combat online fraud Fraud impacts nearly eight in every ten international online retailers 1. It hampers prospects for growth, restricts

More information

Decision Support Systems

Decision Support Systems Decision Support Systems 50 (2011) 602 613 Contents lists available at ScienceDirect Decision Support Systems journal homepage: www.elsevier.com/locate/dss Data mining for credit card fraud: A comparative

More information

Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department

Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department DOI: 10.5769/C2012010 or http://dx.doi.org/10.5769/c2012010 Neural Network Predictor for Fraud Detection: A Study Case for the Federal Patrimony Department Antonio Manuel Rubio Serrano (1,2), João Paulo

More information

How To Make A Credit Risk Model For A Bank Account

How To Make A Credit Risk Model For A Bank Account TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

More information
