Performance Measures in Data Mining

Common Performance Measures Used in Data Mining and Machine Learning Approaches

L. Richter, J.M. Cejuela
Department of Computer Science, Technische Universität München
Master Lab Course Data Mining, SS 2015, Jul 1st

Outline

- Item Set and Association Rule Weights
  - Simple Measures
  - Complex Measures
- Basic Performance Measures
- Complex Measures
  - Performance Curves
- Regression
- Assessment Strategies

Item Set and Association Rule Weights: Simple Measures

Measures for Item Sets

Various algorithms can yield frequent item sets. From the frequent item sets $c$ and $c \cup \{i\}$ you can derive the rule "if $c$ then $\{i\}$". Typically there is only one item in the RHS (right-hand side) of the rule.

Support (of an item set): how often the item set is found in the database.

$$\mathrm{sup}(X \rightarrow Y) = \mathrm{sup}(Y \rightarrow X) = P(X \text{ and } Y)$$

Confidence (of a rule):

$$\mathrm{conf}(X \rightarrow Y) = P(Y \mid X) = \frac{P(X \text{ and } Y)}{P(X)} = \frac{\mathrm{sup}(X \rightarrow Y)}{\mathrm{sup}(X)}$$
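
As a minimal sketch of the two definitions above, support and confidence can be computed directly from a list of transactions. The toy database and the function names `support` and `confidence` are assumptions made for illustration; they are not part of the slides.

```python
# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, db):
    """Relative frequency of transactions containing every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)

def confidence(lhs, rhs, db):
    """conf(X -> Y) = sup(X and Y) / sup(X); assumes sup(X) > 0."""
    return support(set(lhs) | set(rhs), db) / support(lhs, db)

print(support({"bread", "milk"}, transactions))       # 2 of 4 transactions: 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # 0.5 / 0.75, about 0.667
```

Support is symmetric in X and Y, while confidence is not: swapping `lhs` and `rhs` changes the denominator.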

Item Set and Association Rule Weights: Complex Measures

Measures for Item Sets (cont'd)

These measures express how frequent an item set is, or how interesting a rule is, in comparison to the occurrence expected under independence.

Leverage (of an item set):

$$\mathrm{lev}(X \rightarrow Y) = P(X \text{ and } Y) - P(X)\,P(Y)$$

Lift (of a rule):

$$\mathrm{lift}(X \rightarrow Y) = \mathrm{lift}(Y \rightarrow X) = \frac{P(X \text{ and } Y)}{P(X)\,P(Y)} = \frac{\mathrm{conf}(X \rightarrow Y)}{\mathrm{sup}(Y)} = \frac{\mathrm{conf}(Y \rightarrow X)}{\mathrm{sup}(X)}$$

Conviction (of a rule): similar to lift, but directed. It compares the probability that X appears without Y if the two were independent with the observed frequency of X appearing without Y.

$$\mathrm{conviction}(X \rightarrow Y) = \frac{P(X)\,P(\neg Y)}{P(X \text{ and } \neg Y)} = \frac{1 - \mathrm{sup}(Y)}{1 - \mathrm{conf}(X \rightarrow Y)}$$
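
The three measures above can be sketched as small functions of the probabilities P(X), P(Y) and P(X and Y). The function names and the example probabilities are assumptions chosen for illustration; returning infinity for a rule with confidence 1 is one possible convention, not something the slides prescribe.

```python
def leverage(p_x, p_y, p_xy):
    """lev = P(X and Y) - P(X) * P(Y); zero under independence."""
    return p_xy - p_x * p_y

def lift(p_x, p_y, p_xy):
    """lift = P(X and Y) / (P(X) * P(Y)); one under independence."""
    return p_xy / (p_x * p_y)

def conviction(p_x, p_y, p_xy):
    """conviction = (1 - sup(Y)) / (1 - conf(X -> Y))."""
    conf = p_xy / p_x
    if conf == 1.0:
        # Assumed convention: a rule that never fails gets infinite conviction.
        return float("inf")
    return (1 - p_y) / (1 - conf)

# Hypothetical probabilities: P(X) = 0.75, P(Y) = 0.75, P(X and Y) = 0.5.
print(leverage(0.75, 0.75, 0.5))    # -0.0625: rarer together than expected
print(lift(0.75, 0.75, 0.5))        # about 0.889, below 1
print(conviction(0.75, 0.75, 0.5))  # about 0.75
```

Leverage and lift are symmetric in X and Y; conviction depends on the rule direction because it uses conf(X -> Y).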

Supplementary Material

J-Measure
- empirically observed accuracy of a rule
- cross-entropy (measuring how well one distribution approximates another) between the binary variables $\phi$ and $\theta$, with vs. without conditioning on the event $\theta$

$$J(\theta \rightarrow \phi) = p(\theta)\left( p(\phi \mid \theta)\,\log\frac{p(\phi \mid \theta)}{p(\phi)} + \bigl(1 - p(\phi \mid \theta)\bigr)\,\log\frac{1 - p(\phi \mid \theta)}{1 - p(\phi)} \right)$$
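
A minimal sketch of the J-measure formula follows. The slide does not state the base of the logarithm; base 2 is an assumption here, as are the function name `j_measure` and the convention that a term with probability 0 contributes 0. The sketch requires 0 < p(φ) < 1.

```python
import math

def j_measure(p_theta, p_phi, p_phi_given_theta):
    """J = p(theta) * [ p(phi|theta) * log(p(phi|theta) / p(phi))
                      + (1 - p(phi|theta)) * log((1 - p(phi|theta)) / (1 - p(phi))) ]

    Assumptions: log base 2, 0 < p_phi < 1, and 0 * log(0) treated as 0.
    """
    def term(p, q):
        return 0.0 if p == 0 else p * math.log2(p / q)

    return p_theta * (
        term(p_phi_given_theta, p_phi)
        + term(1 - p_phi_given_theta, 1 - p_phi)
    )

# Conditioning on theta changes nothing, so the rule carries no information.
print(j_measure(0.5, 0.5, 0.5))  # 0.0
# theta makes phi certain, starting from a 50/50 prior.
print(j_measure(0.5, 0.5, 1.0))  # 0.5
```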

Basic Performance Measures

Basic Building Blocks for Performance Measures

- True Positive (TP): positive instances predicted as positive
- True Negative (TN): negative instances predicted as negative
- False Positive (FP): negative instances predicted as positive
- False Negative (FN): positive instances predicted as negative

Confusion matrix:

            Predicted a    Predicted b
Real a      TP             FN
Real b      FP             TN
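
The four building blocks can be counted from paired lists of real and predicted labels, as in the following sketch. The function name `confusion_counts` and the example labels are hypothetical; class "a" plays the role of the positive class, matching the layout of the confusion matrix above.

```python
def confusion_counts(actual, predicted, positive):
    """Return (TP, FN, FP, TN), treating `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1   # positive predicted as positive
            else:
                fn += 1   # positive predicted as negative
        else:
            if p == positive:
                fp += 1   # negative predicted as positive
            else:
                tn += 1   # negative predicted as negative
    return tp, fn, fp, tn

# Hypothetical labels for seven instances.
actual    = ["a", "a", "a", "b", "b", "b", "b"]
predicted = ["a", "a", "b", "a", "b", "b", "b"]
print(confusion_counts(actual, predicted, positive="a"))  # (2, 1, 1, 3)
```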

Performance Measures

$$\text{Accuracy, } acc = \frac{TP + TN}{TP + FN + FP + TN} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$

$$\text{Error rate, } err = \frac{FN + FP}{TP + FN + FP + TN} = \frac{\text{Number of wrong predictions}}{\text{Total number of predictions}} = 1 - acc$$
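
As a quick numerical check of the two formulas, the following sketch computes both from the four confusion-matrix counts. The counts are made up for illustration and the function names are assumptions.

```python
def accuracy(tp, fn, fp, tn):
    """Share of correct predictions among all predictions."""
    return (tp + tn) / (tp + fn + fp + tn)

def error_rate(tp, fn, fp, tn):
    """Share of wrong predictions among all predictions."""
    return (fn + fp) / (tp + fn + fp + tn)

# Hypothetical counts: TP = 40, FN = 10, FP = 5, TN = 45.
acc = accuracy(40, 10, 5, 45)
err = error_rate(40, 10, 5, 45)
print(acc)  # 0.85
print(err)  # 0.15
```

The two values always sum to 1, which is the identity err = 1 - acc stated above.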

Performance Measures (cont'd)

$$\text{True Positive Rate, TPR, Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{True Negative Rate, TNR, Specificity} = \frac{TN}{TN + FP}$$

$$\text{False Positive Rate, FPR} = \frac{FP}{TN + FP}$$

$$\text{False Negative Rate, FNR} = \frac{FN}{TP + FN}$$
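
The four rates can be sketched as one helper that returns them together. The function name `rates` and the counts are illustrative assumptions; the sketch assumes at least one positive and one negative instance, so that no denominator is zero.

```python
def rates(tp, fn, fp, tn):
    """Return TPR, TNR, FPR and FNR computed from confusion-matrix counts."""
    positives = tp + fn   # all really positive instances
    negatives = tn + fp   # all really negative instances
    return {
        "TPR": tp / positives,  # sensitivity
        "TNR": tn / negatives,  # specificity
        "FPR": fp / negatives,
        "FNR": fn / positives,
    }

# Hypothetical counts: TP = 40, FN = 10, FP = 5, TN = 45.
r = rates(40, 10, 5, 45)
print(r)  # TPR 0.8, TNR 0.9, FPR 0.1, FNR 0.2
```

Because TPR and FNR share a denominator, they sum to 1, as do TNR and FPR.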

Performance Measures (cont'd)

$$\text{Precision, } p = \frac{TP}{TP + FP}$$

$$\text{Recall, } r = \frac{TP}{TP + FN}$$

$$F_1\text{ measure} = \frac{2rp}{r + p} = \frac{2\,TP}{2\,TP + FP + FN}$$
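
The following sketch computes precision, recall and F1, and evaluates F1 in both forms given above to show that they agree. Function names and counts are assumptions made for illustration.

```python
def precision(tp, fp):
    """Share of predicted positives that are really positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of real positives that were predicted as positive."""
    return tp / (tp + fn)

def f1_from_pr(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * r * p / (r + p)

def f1_from_counts(tp, fp, fn):
    """Equivalent form written directly in terms of the counts."""
    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical counts: TP = 40, FP = 5, FN = 10.
p = precision(40, 5)   # 40 / 45
r = recall(40, 10)     # 40 / 50
print(f1_from_pr(p, r))
print(f1_from_counts(40, 5, 10))  # 80 / 95; both forms agree up to rounding
```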

Complex Measures: Performance Curves

Performance Curves
- the "costs" of different error types differ
- prediction behaviour changes over the test set
- performance is displayed in 2D
- different domains prefer different chart types

Lift Charts

Taken from http://www.dmg.org/rfc/pmml-3.2/modelexplanation.html

- originate from the marketing area, where they are used to evaluate mailing success
- y-axis: number or percentage of responders
- x-axis: sample size
- red diagonal: random lift
- green line: optimum lift
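
One way to obtain the points of such a chart is to rank the instances by predicted score and count the responders found so far. This is a sketch under that assumption; the function name `lift_chart_points`, the scores and the response flags are all hypothetical.

```python
def lift_chart_points(scores, responded):
    """Return (sample size, cumulative responders) after ranking by score."""
    ranked = sorted(zip(scores, responded), key=lambda sr: sr[0], reverse=True)
    points = [(0, 0)]
    hits = 0
    for n, (_, r) in enumerate(ranked, start=1):
        hits += int(r)
        points.append((n, hits))
    return points

# Hypothetical mailing: model scores and whether each customer responded.
scores    = [0.9, 0.8, 0.7, 0.4, 0.2]
responded = [1, 1, 0, 1, 0]
print(lift_chart_points(scores, responded))
# [(0, 0), (1, 1), (2, 2), (3, 2), (4, 3), (5, 3)]
```

The random-lift diagonal corresponds to contacting customers in arbitrary order, which yields on average `n * total_responders / len(scores)` responders after `n` mailings.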

ROC Curves

Receiver Operating Characteristic
- y-axis: TPR
- x-axis: FPR
- diagonal: random guessing (TPR = FPR)
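
The points of a ROC curve can be sketched by lowering a decision threshold step by step through the ranked scores and recording (FPR, TPR) after each instance. The function name `roc_points` and the data are hypothetical; the sketch assumes distinct scores and at least one instance of each class.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points; labels are 1 for positive, 0 for negative."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda sl: sl[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for _, label in ranked:
        if label:
            tp += 1   # one more positive above the threshold: step up
        else:
            fp += 1   # one more negative above the threshold: step right
        points.append((fp / neg, tp / pos))
    return points

# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

A curve that stays above the diagonal indicates a ranking better than random guessing.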

Sensitivity vs. Specificity
- preferred in medicine
- y-axis: TPR (sensitivity)
- x-axis: TNR (specificity)
- also frequently drawn as a ROC curve, with 1 - specificity on the x-axis

Recall-Precision Curves
- preferred in information retrieval
- positives are the documents retrieved in response to a query
- true positives are the retrieved documents that are really relevant to the query
- y-axis: precision
- x-axis: recall
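
For a ranked result list, one (recall, precision) point can be computed per cutoff position, as in this sketch. The function name and the relevance flags are hypothetical, and the sketch assumes that every relevant document appears somewhere in the ranked list.

```python
def precision_recall_points(ranked_relevance):
    """Return (recall, precision) after each position of a ranked result list.

    `ranked_relevance` holds 1 for a relevant document and 0 otherwise,
    in the order in which the documents were retrieved.
    """
    total_relevant = sum(ranked_relevance)
    points = []
    hits = 0
    for k, rel in enumerate(ranked_relevance, start=1):
        hits += int(rel)
        points.append((hits / total_relevant, hits / k))
    return points

# Hypothetical query result: documents 1, 2 and 4 are relevant.
for recall, precision in precision_recall_points([1, 1, 0, 1, 0]):
    print(f"recall={recall:.2f}  precision={precision:.2f}")
```

Recall never decreases along the list, while precision drops whenever an irrelevant document is retrieved.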

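Regression

Error Measures for Regression

Here $p_i$ is the predicted value and $a_i$ the actual value for instance $i$, with $n$ instances in total.

$$\text{Mean squared error, } MSE = \frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{n}$$

$$\text{Root mean squared error, } RMSE = \sqrt{\frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{n}}$$

$$\text{Mean absolute error, } MAE = \frac{|p_1 - a_1| + \dots + |p_n - a_n|}{n}$$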
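
The three error measures translate directly into code. The following is a minimal sketch with made-up predictions and actual values; the function names are assumptions.

```python
import math

def mse(predicted, actual):
    """Mean of the squared differences between prediction and actual value."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Square root of the MSE; has the same unit as the target value."""
    return math.sqrt(mse(predicted, actual))

def mae(predicted, actual):
    """Mean of the absolute differences; less sensitive to outliers than MSE."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical predictions and actual values.
predicted = [2.5, 0.0, 2.0, 8.0]
actual    = [3.0, -0.5, 2.0, 7.0]
print(mse(predicted, actual))   # 0.375
print(rmse(predicted, actual))  # about 0.612
print(mae(predicted, actual))   # 0.5
```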

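Relative Error Measures

Here $p_i$ is the predicted value, $a_i$ the actual value, and $\bar{a}$ the mean of the actual values.

$$\text{Relative squared error} = \frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{(a_1 - \bar{a})^2 + \dots + (a_n - \bar{a})^2}$$

$$\text{Root relative squared error} = \sqrt{\frac{(p_1 - a_1)^2 + \dots + (p_n - a_n)^2}{(a_1 - \bar{a})^2 + \dots + (a_n - \bar{a})^2}}$$

$$\text{Relative absolute error} = \frac{|p_1 - a_1| + \dots + |p_n - a_n|}{|a_1 - \bar{a}| + \dots + |a_n - \bar{a}|}$$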
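
The relative measures compare a model against the trivial predictor that always outputs the mean. In the sketch below, the mean is taken over the actual values passed to the function, which is an assumption; function names and data are likewise illustrative. The actual values must not all be identical, otherwise the denominators are zero.

```python
import math

def relative_squared_error(predicted, actual):
    """Squared error relative to always predicting the mean of `actual`."""
    mean = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    den = sum((a - mean) ** 2 for a in actual)
    return num / den

def root_relative_squared_error(predicted, actual):
    return math.sqrt(relative_squared_error(predicted, actual))

def relative_absolute_error(predicted, actual):
    """Absolute error relative to always predicting the mean of `actual`."""
    mean = sum(actual) / len(actual)
    num = sum(abs(p - a) for p, a in zip(predicted, actual))
    den = sum(abs(a - mean) for a in actual)
    return num / den

# Hypothetical predictions and actual values.
predicted = [2.5, 0.0, 2.0, 8.0]
actual    = [3.0, -0.5, 2.0, 7.0]
print(relative_squared_error(predicted, actual))   # 1.5 / 29.1875, about 0.051
print(relative_absolute_error(predicted, actual))  # 2.0 / 8.5, about 0.235
```

Values below 1 mean the model beats the mean predictor; predicting the mean itself yields exactly 1 for all three measures.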

Assessment Strategies

General Problem
- each algorithm abstracts from observations (instances)
- which aspects are kept and which are discarded differs between learning schemes (inductive bias)
- this also means that information about individual instances is contained in the model, too
- individual instance information leads to overfitting

Solution Strategies
- use fresh data, i.e. instances not used for training
- for very large numbers of instances: a simple split into test and training set
- most common: 10-fold cross-validation
- LOOCV: leave-one-out cross-validation
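
The splitting step behind k-fold cross-validation can be sketched as follows. The function name `k_fold_indices`, the shuffling and the fixed seed are assumptions made for illustration; setting `k` equal to the number of instances gives leave-one-out cross-validation.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds.

    Every instance is used for testing exactly once and never appears in
    the training part of its own fold. Requires 1 < k <= n.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)       # assumed: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 10-fold cross-validation over 25 hypothetical instances.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))

# LOOCV: one instance per test fold.
loocv = list(k_fold_indices(5, k=5))
```

A performance measure would be computed on each test fold with a model trained on the matching training part, and the k results averaged.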
