Performance Measures in Data Mining

Similar documents

Performance Measures for Machine Learning

Evaluation & Validation: Credibility: Evaluating what has been learned

Overview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set

Cross Validation. Dr. Thomas Jensen Expedia.com

Chapter 6. The stacking ensemble approach

Knowledge Discovery and Data Mining

Data Mining - Evaluation of Classifiers

Data Mining Practical Machine Learning Tools and Techniques

Performance Metrics. number of mistakes total number of observations. err = p.1/1

Data Mining - The Next Mining Boom?

Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation. Lecture Notes for Chapter 4. Introduction to Data Mining

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions

Data Mining Algorithms Part 1. Dejan Sarka

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework

An Approach to Detect Spam s by Using Majority Voting

PREDICTING SUCCESS IN THE COMPUTER SCIENCE DEGREE USING ROC ANALYSIS

Online Performance Anomaly Detection with

Azure Machine Learning, SQL Data Mining and R

The Operational Value of Social Media Information. Social Media and Customer Interaction

Using Random Forest to Learn Imbalanced Data

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

How To Understand The Impact Of A Computer On Organization

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

Data Mining. Nonlinear Classification

Health Care and Life Sciences

Experiments in Web Page Classification for Semantic Web

MACHINE LEARNING IN HIGH ENERGY PHYSICS

Business Case Development for Credit and Debit Card Fraud Re- Scoring Models

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Getting Even More Out of Ensemble Selection

BUILDING CLASSIFICATION MODELS FROM IMBALANCED FRAUD DETECTION DATA

Lecture 5: Evaluation

A Decision Tree for Weather Prediction

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing Classifier

T : Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari :

1. Classification problems

Performance Evaluation Metrics for Software Fault Prediction Studies

Stock Market Forecasting Using Machine Learning Algorithms

Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware

Discovering Criminal Behavior by Ranking Intelligence Data

Data Mining Practical Machine Learning Tools and Techniques

Data Mining as a tool to Predict the Churn Behaviour among Indian bank customers

Active Learning SVM for Blogs recommendation

Knowledge Discovery and Data Mining

Tweaking Naïve Bayes classifier for intelligent spam detection

CSC574 - Computer and Network Security Module: Intrusion Detection

Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) ( ) Roman Kern. KTI, TU Graz

Towards better accuracy for Spam predictions

Predicting Deadline Transgressions Using Event Logs

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

Feature Subset Selection in Spam Detection

A Hybrid Data Mining Model to Improve Customer Response Modeling in Direct Marketing

Evaluation of selected data mining algorithms implemented in Medical Decision Support Systems Kamila Aftarczuk

Mining the Software Change Repository of a Legacy Telephony System

Knowledge Discovery and Data Mining

Measuring Intrusion Detection Capability: An Information-Theoretic Approach

Copyright 2006, SAS Institute Inc. All rights reserved. Predictive Modeling using SAS

Consolidated Tree Classifier Learning in a Car Insurance Fraud Detection Domain with Class Imbalance

Data Science and Prediction*

Nonlinear Regression Functions. SW Ch 8 1/54/

Lecture 8: Signal Detection and Noise Assumption

ROC Curve, Lift Chart and Calibration Plot

MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS

International Journal of Advance Research in Computer Science and Management Studies

Advanced analytics at your hands

The Relationship Between Precision-Recall and ROC Curves

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Addressing the Class Imbalance Problem in Medical Datasets

Supervised Learning (Big Data Analytics)

W6.B.1. FAQs CS535 BIG DATA W6.B If the distance of the point is additionally less than the tight distance T 2, remove it from the original set

Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum

Machine Learning. Term 2012/2013 LSI - FIB. Javier Béjar cbea (LSI - FIB) Machine Learning Term 2012/ / 34

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

Big Data & Scripting Part II Streaming Algorithms

Data Mining Lab 5: Introduction to Neural Networks

ViviSight: A Sophisticated, Data-driven Business Intelligence Tool for Churn and Loan Default Prediction

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

E-discovery Taking Predictive Coding Out of the Black Box

Prerequisites. Course Outline

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer

Consistent Binary Classification with Generalized Performance Metrics

Leak Detection Theory: Optimizing Performance with MLOG

Knowledge Discovery and Data Mining

Using Machine Learning on Sensor Data

Server Load Prediction

On Entropy in Network Traffic Anomaly Detection

Maschinelles Lernen mit MATLAB

Data and Analysis. Informatics 1 School of Informatics, University of Edinburgh. Part III Unstructured Data. Ian Stark. Staff-Student Liaison Meeting

A Decision Tree Classification Model for University Admission System

CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION

Removing Web Spam Links from Search Engine Results

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Transcription:

Performance Measures in Data Mining Common Performance Measures used in Data Mining and Machine Learning Approaches L. Richter J.M. Cejuela Department of Computer Science Technische Universität München Master Lab Course Data Mining, SS 2015, Jul 1st

Outline Item Set and Association Rule Weights Simple Measures Complex Measures Basic Performance Measures Complex Measures Performance Curves Regression Assessment Strategies

Item Set and Association Rule Weights Simple Measures Measures for Item Sets Various algorithms can yield frequent item sets. From frequent item sets c and c {i} you can derive if c then {i}. Typically there is only one item in the RHS (right hand side of the rule). Support (of an item set): sup(x Y ) = sup(y X) = P(X Y ) how many times the item set is found in the database Confidence (of a rule): conf (X Y ) = P(Y X) = P(X and Y )/P(X) = sup(x Y )/sup(x)

Item Set and Association Rule Weights Simple Measures Measures for Item Sets Various algorithms can yield frequent item sets. From frequent item sets c and c {i} you can derive if c then {i}. Typically there is only one item in the RHS (right hand side of the rule). Support (of an item set): sup(x Y ) = sup(y X) = P(X Y ) how many times the item set is found in the database Confidence (of a rule): conf (X Y ) = P(Y X) = P(X and Y )/P(X) = sup(x Y )/sup(x)

Item Set and Association Rule Weights Simple Measures Measures for Item Sets Various algorithms can yield frequent item sets. From frequent item sets c and c {i} you can derive if c then {i}. Typically there is only one item in the RHS (right hand side of the rule). Support (of an item set): sup(x Y ) = sup(y X) = P(X Y ) how many times the item set is found in the database Confidence (of a rule): conf (X Y ) = P(Y X) = P(X and Y )/P(X) = sup(x Y )/sup(x)

Item Set and Association Rule Weights Complex Measures Measures for Item Sets cont. d Measures how frequent an item set / how interesting a rule is in comparison to the expected occurrence (interesting): Leverage (of an item set): lev(x Y ) = P(X and Y ) (P(X)P(Y )) Lift (of a rule): lift(x Y ) = lift(y X) = P(X Y )/(P(X)P(Y )) = conf (X Y )/sup(y ) = conf (Y X)/sup(X) Conviction (of a rule): Similar to Lift, but directed Compares the probability that X appears without Y, if they were independent with the observed frequency of X and Y. conviction(x Y ) = P(X)P( Y )/P(X Y ) = (1 sup(y )/(1 conf (X Y ))

Item Set and Association Rule Weights Complex Measures Measures for Item Sets cont. d Measures how frequent an item set / how interesting a rule is in comparison to the expected occurrence (interesting): Leverage (of an item set): lev(x Y ) = P(X and Y ) (P(X)P(Y )) Lift (of a rule): lift(x Y ) = lift(y X) = P(X Y )/(P(X)P(Y )) = conf (X Y )/sup(y ) = conf (Y X)/sup(X) Conviction (of a rule): Similar to Lift, but directed Compares the probability that X appears without Y, if they were independent with the observed frequency of X and Y. conviction(x Y ) = P(X)P( Y )/P(X Y ) = (1 sup(y )/(1 conf (X Y ))

Item Set and Association Rule Weights Complex Measures Measures for Item Sets cont. d Measures how frequent an item set / how interesting a rule is in comparison to the expected occurrence (interesting): Leverage (of an item set): lev(x Y ) = P(X and Y ) (P(X)P(Y )) Lift (of a rule): lift(x Y ) = lift(y X) = P(X Y )/(P(X)P(Y )) = conf (X Y )/sup(y ) = conf (Y X)/sup(X) Conviction (of a rule): Similar to Lift, but directed Compares the probability that X appears without Y, if they were independent with the observed frequency of X and Y. conviction(x Y ) = P(X)P( Y )/P(X Y ) = (1 sup(y )/(1 conf (X Y ))

Item Set and Association Rule Weights Complex Measures Supplementary Material J-Measure empirically observed accuracy of rule Cross-entropy (measuring how good a distribution approximates another distribution) between the binary variables φ and θ with vs. without conditioning on event θ ( J(θ φ) = p(θ) p(φ θ)log p(φ θ) ) 1 p(φ θ) + (1 p(φ θ)) log p(φ) 1 p(φ)

Basic Performance Measures Basic Building Blocks for Performance Measures True Positive (TP): positive instances predicted as positive True Negative (TN): negative instances predicted as negative False Positive (FP): negative instances predicted as positive False Negative (FN): positive instances predicted as negative Confusion Matrix: Predicted a Predicted b Real a TP FN Real b FP TN

Basic Performance Measures Performance Measures Accuracy, acc = = Error rate, err = = TP + TN TP + FN + FP + TN Number of correct predictions Total number of predictions FN + FP TP + FN + FP + TN = 1 acc Number of wrong predictions Total number of predictions

Basic Performance Measures Performance Measures cont d True Positive Rate, TPR, Sensitivity = True Negative Rate, TNR, Specificity = False Positive Rate, FPR = False Negative Rate, FNR = FP TN + FP TP TP + FN FN TP + FN TN TN + FP

Basic Performance Measures Performance Measures cont d True Positive Rate, TPR, Sensitivity = True Negative Rate, TNR, Specificity = False Positive Rate, FPR = False Negative Rate, FNR = FP TN + FP TP TP + FN FN TP + FN TN TN + FP

Basic Performance Measures Performance Measures cont d True Positive Rate, TPR, Sensitivity = True Negative Rate, TNR, Specificity = False Positive Rate, FPR = False Negative Rate, FNR = FP TN + FP TP TP + FN FN TP + FN TN TN + FP

Basic Performance Measures Performance Measures cont d True Positive Rate, TPR, Sensitivity = True Negative Rate, TNR, Specificity = False Positive Rate, FPR = False Negative Rate, FNR = FP TN + FP TP TP + FN FN TP + FN TN TN + FP

Basic Performance Measures Performance Measures cont d F 1 measure = Precision, p = Recall, r = 2rp r + p = TP TP + FP TP TP + FN 2 TP 2 TP + FP + FN

Basic Performance Measures Performance Measures cont d F 1 measure = Precision, p = Recall, r = 2rp r + p = TP TP + FP TP TP + FN 2 TP 2 TP + FP + FN

Basic Performance Measures Performance Measures cont d F 1 measure = Precision, p = Recall, r = 2rp r + p = TP TP + FP TP TP + FN 2 TP 2 TP + FP + FN

Complex Measures Performance Curves Performance Curves "costs" of different error types are different prediction behaviour changes over the test set performance display in 2D different domains prefer different chart types

Complex Measures Performance Curves Lift Charts Taken from http://www.dmg.org/rfc/pmml-3.2/modelexplanation.html from marketing area to evaluate mailing success y-axis: number or percentage of responders x-axis: sample red diagonal: random lift green line: optimum lift

Complex Measures Performance Curves ROC Curves Receiver Operator Characteristics y-axis: TPR x-axis: FPR diagonal: random guessing (TPR=FPR) Taken from http://en.wikipedia.org/wiki/file:roccurves.png

Complex Measures Performance Curves Sensitivity vs. Specificity preferred in medicine y-axis: TPR x-axis: TNR (specificity) also frequently as ROC curve with 1 - specificity Taken from http://i.stack.imgur.com/fnud2.png

Complex Measures Performance Curves Recall Precision Curves Taken from http://scikit-learn.github.io/scikit-learn.org/ preferred in information retrieval positives are the documents retrieved in response to a query true positives are documents really relevant to the query y-axis: precision x-axis: recall

Regression Error Measures for Regression Mean squared error, MSE = (p 1 a 1 ) 2 + + (p n a n ) 2 n Root mean squared error, RMSE = (p1 a 1 ) 2 + + (p n a n ) 2 n Mean absolute error, MAE = p 1 a 1 + + p n a n n

Regression Error Measures for Regression Mean squared error, MSE = (p 1 a 1 ) 2 + + (p n a n ) 2 n Root mean squared error, RMSE = (p1 a 1 ) 2 + + (p n a n ) 2 n Mean absolute error, MAE = p 1 a 1 + + p n a n n

Regression Error Measures for Regression Mean squared error, MSE = (p 1 a 1 ) 2 + + (p n a n ) 2 n Root mean squared error, RMSE = (p1 a 1 ) 2 + + (p n a n ) 2 n Mean absolute error, MAE = p 1 a 1 + + p n a n n

Regression Relative Error Measures Relative squared error = (p 1 a 1 ) 2 + + (p n a n ) 2 (a 1 ā) 2 + + (a n ā) 2 Root relative squared error = (p 1 a 1 ) 2 + + (p n a n ) 2 (a 1 ā) 2 + + (a n ā) 2 Relative absolute error = p 1 a 1 + + p n a n a 1 ā + + a n ā

Assessment Strategies General Problem each algorithm abstracts from observations (instances) the aspects kept and the aspect discarded differ between the learning scheme (inductive bias) this means also: information about individual instances are contained in the model, too individual instance information leads to overfitting

Assessment Strategies Solution Strategies use fresh date, i.e. instances not used for the training for very large numbers of instances: simple split in test and training set most common: 10-fold cross validation LOOCV: Leave one out cross validation