Datamining. Gabriel Bacq CNAMTS

Similar documents
FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set

Fraud Detection for Online Retail using Random Forests

Virtual Site Event. Predictive Analytics: What Managers Need to Know. Presented by: Paul Arnest, MS, MBA, PMP February 11, 2015

Detecting Fraud in Health Insurance Data: Learning to Model Incomplete Benford s Law Distributions

International Carriers

Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important

Knowledge Discovery and Data Mining

Ambulance Services, Medical Transport Mainform Application

Plastic Card Fraud Detection using Peer Group analysis

WORKERS COMPENSATION SUPPLEMENTAL APPLICATION

Ambulance Services, Medical Transport Mainform Application

New Substance Abuse Screening and Intervention Benefit Covered by BadgerCare Plus and Medicaid

Dawn Broschard, EdD Senior Research Analyst Office of Retention and Graduation Success

The Impact of Medicare Part D on the Percent Gross Margin Earned by Texas Independent Pharmacies for Dual Eligible Beneficiary Claims

Chapter 12 Discovering New Knowledge Data Mining

Rossmoor Website SEO Tracking Sheet Updated: April 1, 2014

DISABILITY CLAIM FORM

Please review the applicable anti-fraud statements on the reverse side of this form.

Avoiding Rehospitalizations in LTC Chris Osterberg, RN BSN Pathway Health Services

KENTUCKY BOARD OF NURSING 312 Whittington Parkway, Suite 300 Louisville, Kentucky ADVISORY OPINION STATEMENT

Autism Insurance Act Frequently Asked Questions and Answers

MEDICAL SOCIETY OF THE STATE OF NEW YORK

Circle Of Life SM : Cancer Education and Wellness for American Indian and Alaska Native Communities

Application for Limited Professional Liability Coverage Insured Paramedical Employee

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

1. Name of applicant Last First Middle. Home Phone FAX number address. Complete title of your medical professional designation

FRAUD DETECTION IN MOBILE TELECOMMUNICATION

Pay Guide - Nurses Award 2010 [MA000034]

Understanding the effects of fraud and abuse on your group benefits plan

Justice Griffith. Youth Justice Conferences and Indigenous Over-representation: Micro Simulation Case Study.

Gonzaga Student Accidental / Injury Insurance Plan IMPORTANT NOTE

Data Mining Algorithms Part 1. Dejan Sarka


The drugs tracking system in Italy Brussels, 14th September

Medicare Advantage and Part D Fraud, Waste, and Abuse Training. October 2010

FICO Falcon Fraud Manager for Retail Banking

not possible or was possible at a high cost for collecting the data.

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics

UNITED STATES DISTRICT COURT NORTHERN DISTRICT OF ILLINOIS EASTERN DIVISION CRIMINAL COMPLAINT

Allied Health Professional Liability Insurance Application Form

Reinventing Business Intelligence through Big Data

1. Name of applicant Last First Middle. Complete title of your medical professional designation. 12:01 AM E.S.T. on Month Day Year

Cornerstone Visiting Nurse Association. JOB TITLE: Hospice RN/LPN. Lap top, various medical equipment, instruments, machines and a vehicle

Brochure More information from

Consistent Binary Classification with Generalized Performance Metrics

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics

High Deductible Health Plan (HDHP) with Health Savings Account (HSA) FREQUENTLY ASKED QUESTIONS

Analytical Test Method Validation Report Template

Using Data Mining to Predict Automobile Insurance Fraud

Allied Health Professional Liability Insurance Application Form

Machine Learning and Data Mining. Regression Problem. (adapted from) Prof. Alexander Ihler

CALCULATION OF VOLUME- WEIGHTED AVERAGE SALES PRICE FOR MEDICARE PART B PRESCRIPTION DRUGS

FP7 GRANT AGREEMENT ANNEX VII - FORM D - TERMS OF REFERENCE FOR THE CERTIFICATE OF FINANCIAL STATEMENTS

Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms

Accident Claim Filing Instructions

Scoring (manual, automated, automated with manual review)

Accident Claim Filing Instructions

Part 2: Analysis of Relationship Between Two Variables

DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM

Kaukauna Area School District Employee Benefits Booklet Kaukauna Area School District EMPLOYEE BENEFITS GUIDE

ENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS

The Data Mining Process

IBA Business Analytics Data Challenge

ForwardHealth Provider Portal Professional Claims

CHAPTER M20 EXTRA HELP - MEDICARE PART D LOW-INCOME SUBSIDY

A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model

Competitive Acquisition Program (CAP) for Part B Drugs & Biologicals Training for Supplemental Insurance Companies August 2007

The Predictive Fraud and Abuse Analytic and Risk Management System

>

Gerard Mc Nulty Systems Optimisation Ltd BA.,B.A.I.,C.Eng.,F.I.E.I

Lecture 10: Regression Trees

The Terms of Reference should be completed by the Beneficiary and be agreed with the Auditor

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

TNO Work and Employment

HEALTH INSURANCE AND THIRD PARTY ADMINISTRATORS IN INDIA: AWARENESS AND PERCEPTION OF POLICY HOLDERS

GROUP LIFE INSURANCE CLAIM PACKET (Death)

Electronic Payment Fraud Detection Techniques

Extended Health Care Insurance

Choosing the Right Health Insurance Plan What is the different between PPO, HMO, POS and HSA plans?

What is Data Mining? Data Mining (Knowledge discovery in database) Data mining: Basic steps. Mining tasks. Classification: YES, NO

DR.G.J. MIDIWO General Manager, Benefits & Quality Assurance

Transcription:

Datamining Gabriel Bacq CNAMTS

In a few words DCCRF uses two ways to detect fraud cases: one which is fully implemented and another one which is experimented: 1. Database queries (fully implemented) Example: extraction from the database of practitioners who earn more than x. 2. Dataming (experimented at a regional level for daily allowances, and at a national level for Additional universal sickness coverage - CMUC) How it works: briefly: the machine «makes the job» for us! We make a group of non defrauders and a group of defrauders by injecting some defrauders and some non defrauders into the system. Then, when a new person is entered, the machine is supposed to classify him/her in the appropriate group. Problem: As we do not have enough examples of defrauders, the system is not relevant yet.

Fraud detection Fundamental step that has to be done before setting up of a litigation control plan. Gives the possibility of identifying some atypic stakeholders

Data N of the prescriber healthcare professional Health insurance reimbursement basis N of the executive healthcare professional N of the beneficiary Prescription date Execution date Drug name Number of boxes Paid amount 751XXXX 752XXX P49123456 2nd June 2011 4th June 2011 Doliprane 2 3.60 541XXXX 542XXX M25495868 3rd August 2011 7th August 2011 Gaviscon 3 9.12 561XXXX 222XXX J987654321 5th May 2011 6th May 2011 Celestène 4 9.6 Are but healthcare consumption

Methods Reason-based targeting queries: Database queries Datamining : Mathematical or statistical method with an automatic learning

Reason-based targeting queries A few examples Targeting general practitioners whose reimbursements exceed a certain amount Targeting nurses who have made an aberrant number of acts Targeting carriers with a very important number of daily transportations by vehicle Targeting beneficiaries with an unusual reimbursed amount

Threshold determination Example of targeting related to the number of acts performed by nurses Targeting threshold: a +9 standard deviation around the average

Datamining Experiments of new detection methods Identification of new fraud procedures Important quantity of data Automatic or semi automatic methods

Supervised learning: Learning basis made up of : Proven fraud cases Non fraud cases Fraud patterning from predictors Use of the pattern to identify potential fraudsters Validation of potential fraudsters who are detected through investigations in the field

The major steps of a DATAMINING project Data collection aiming at creating a learning basis Fraudulent behaviours patterning through several models submitted to competition New fraud case identification from the chosen pattern Assessment of the chosen pattern after investigations in the field

Learning basis Data gathering to create a learning basis: Recognized fraud cases Feedback related to fraud cases refered to a court and locally identified with the whole defrauders characteristics. Reference cases of non defrauders Ideally: local identification of recognized non defrauders, after investigations in the field. Or random drawing from a database of potential non defrauders

Fraudulent behaviours patterning Patterning through several learning patterns set up by a specific software. Decision tree Example: yes Temperature < 37.5 no sore throat sick yes no sick healthy Logistic regression (statistical pattern that gives the possibiliy of assigning every new person who is entered in the system a probability of frauding > we then pay attention to people whose probability of frauding is high)

Pattern performance indicators Measure of error or misclassification rate Error rate TP FP FN FN FP Specificity and sensitivity measure Defrauders Sensitivity = TN Tests pattern performance on: TP TP + FN Non defrauders Specificity = Measure of negative and positive predictive values Tests predictive power of the pattern among: T N FP + TN Defrauders TPP TP TP FP Non defrauders FN TPN TN FN