Evaluation in Machine Learning (1) Evaluation in machine learning. Aims. 14s1: COMP9417 Machine Learning and Data Mining.
|
|
- Rudolf Malone
- 7 years ago
- Views:
Transcription
1 Acknowledgements Evaluation in Machine Learning (1) 14s1: COMP9417 Machine Learning and Data Mining School of Computer Science and Engineering, University of New South Wales March 19, 2014 Material derived from slides for the book Machine Learning by T. Mitchell McGraw-Hill (1997) Material derived from slides by Eibe Frank Material derived from slides for the book Machine Learning by P. Flach Cambridge University Press (2012) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Aims 1. Aims COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / Introduction Evaluation in machine learning This lecture will enable you to apply standard measures and graphical methods for the evaluation of machine learning algorithms. Following it you should be able to: define sample error and true error describe the problem of estimating hypothesis accuracy or error define and calculate typically-used evaluation measures, such as accuracy, precision, recall, etc. define and calculate typically-used compound measures and their visualisation, such as lift charts, coverage plots, ROC curves, F-measure, AUC, etc. In theory, there is no difference between theory and practice. But in practice, there is. COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
2 2. Introduction 2. Introduction Estimating Hypothesis Accuracy Two Definitions of Error The sample error of h with respect to target function f and data sample S is the proportion of examples h misclassifies how well does a hypothesis generalize beyond the training set? need to estimate off-training-set error what is the probable error in this estimate? if one hypothesis is more accurate than another on a data set, how probable is this difference in general? error S (h) 1 δ(f (x) = h(x)) n Where δ(f (x) = h(x)) is 1 if f (x) = h(x), and 0 otherwise (cf. 0 1 loss). x S The true error of hypothesis h with respect to target function f and distribution D is the probability that h will misclassify an instance drawn at random according to D. error D (h) Pr [f (x) = h(x)] x D Question: How well does error S (h) estimate error D (h)? COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / Introduction Estimators COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / Introduction Problems Estimating Error Experiment: 1. choose sample S of size n according to distribution D 2. measure error S (h) error S (h) is a random variable (i.e., result of an experiment) error S (h) is an unbiased estimator for error D (h) Given observed error S (h) what can we conclude about error D (h)? 1. Bias: If S is training set, error S (h) is optimistically biased bias E[error S (h)] error D (h) For unbiased estimate, h and S must be chosen independently 2. Variance: Even with selection of S to give unbiased estimate, error S (h) may still vary from error D (h) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
3 2. Introduction Problems Estimating Error Making the most of the data Note: Estimation bias not to be confused with Inductive bias former is a numerical quantity [comes from statistics], latter is a set of assertions [comes from concept learning]. More on this in a later lecture. Once evaluation is complete, all the data can be used to build the final classifier Generally, the larger the training data the better the classifier (but returns diminish) The larger the test data the more accurate the error estimate Holdout procedure: method of splitting original data into training and test set Dilemma: ideally we want both, a large training and a large test set COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Loss functions COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Quadratic loss function Most common performance measure: predictive accuracy (cf. sample error) Also called 0 1 loss function: 0 if prediction is correct 1 if prediction is incorrect Classifiers can produce class probabilities i What is the accuracy of the probability estimates? 0-1 loss is not appropriate p 1,...,p k are probability estimates of all possible outcomes for an instance c is the index of the instance s actual class i.e. a 1,...,a k are zero, except for a c which is 1 the quadratic loss is: E (p j a j ) 2 = j p 2 j j =c + (1 p c ) 2 leads to preference for predictors giving best guess at true probabilities COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
4 Informational loss function Which loss function? ] the informational loss function is log(p c ), where c is the index of the actual class of an instance number of bits required to communicate the actual class if p1,...,p are the true class probabilities k then the expected value of the informational loss function is: p 1 log 2 (p 1)... p k log 2 (p k) which is minimized for p j = p j giving the entropy of the true distribution p 1 log 2 (p 1 )... p k log 2 (p k ) quadratic loss functions takes into account all the class probability estimates for an instance informational loss focuses only on the probability estimate for the actual class quadratic loss is bounded by 1 + j p 2, can never exceed 2 j informational loss can be infinite informational loss related to MDL principle (can use bits for complexity as well as accuracy) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Costs of predictions COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Confusion matrix Two-class prediction case: In practice, different types of classification errors often incur different costs Examples: Medical diagnosis (has cancer vs. not) Loan decisions Fault diagnosis Promotional mailing Predicted Class Actual Class Yes No Yes True Positive (TP) False Negative (FN) No False Positive (FP) True Negative (TN) Two kinds of error: False Positive and False Negative may have different costs. Two kinds of correct prediction: True Positive and True Negative may have different benefits. Note: total number of test set examples N = TP+ FN+ FP+ TN COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
5 Accuracy Error rate equivalent to 1 - Accuracy, i.e., TP+ TN N FP+ FN N Precision TP TP+ FP (also called: Correctness, Positive Predictive Value) Recall TP TP+ FN (also called: TP rate, Hit rate, Sensitivity, Completeness) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Sensitivity True Positive (TP) Rate Specificity equivalent to 1 - FP rate (also called: TN rate) TP TP+ FN TN TN+ FP False Positive (FP) Rate equivalent to 1 - Specificity, i.e., (also called: False alarm rate) TP TP+ FN FP FP+ TN COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
6 Negative Predictive Value Coverage TN TN+ FN Predicted Class Actual Class Yes No Yes TP FN No FP TN TP+ FP N Note: this is not an exhaustive list... same measures used under different names in different disciplines E.g., in concept learning, the number of instances in a sample predicted to be in (resp. not in) the concept is the sum of the first (resp. second) column. The number of positive (resp. negative) examples of the concept in a sample is the sum of the first (resp. second) row. N pred = TP+ FP N not_pred = FN+ TN Npos = TP+ FN Nneg = FP+ TN COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Trade-off We can treat the evaluation measures as conditional probabilities: P(pred pos) = P(pred neg) = P(not_pred pos) = P(not_pred neg) = P(pos pred) = P(neg pred) = TP TP+FN FP FP+TN FN TP+FN TN FP+TN TP TP+FP FP TP+FP (Sensitivity) (FP rate) (FN rate) (Specificity) (Pos. Pred. Value) Trade-off good coverage of positive examples: increase TP at risk of increasing FP i.e. increase generality good proportion of positive examples: decrease FP at risk of decreasing TP i.e. decrease generality, i.e. increase specificity Different techniques give different trade-offs and can be plotted as two different lines on any of the graphical charts: Lift, ROC or recall-precision curves. P(pos not_pred) = P(neg not_pred) = FN FN+TN TN FN+TN (FN rate) (Neg. Pred. Value) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
7 F-measure combines precision and recall Lift charts In practice, costs are rarely known precisely Information Retrieval K. van Rijsbergen (1979) precision recall F = 2 precision + recall Combines both precision and recall in a single measure giving equal weight to both (there are variants to weight each component differently). Inverse relation of precision and recall: can change a classification rule to increase precision (positive predictive value) by reducing number of data points classified, hence reducing recall (sensitivity), and vice versa. Note: both measures ignore TN. Instead decisions often made by comparing possible scenarios Lift comes from market research, where a typical goal is to identify a profitable target sub-group out of the total population Example: promotional mailout to population of 1,000,000 potential respondents Baseline is that 0.1% of all households in total population will respond (1000) Situation 1: classifier 1 identifies target sub-group of 100,000 most promising households of which 0.4% will respond (400) Situation 2: classifier 2 identifies target sub-group of 400,000 most promising households of which 0.2% will respond (800) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Lift charts COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Hypothetical Lift Chart Lift = response rate of target sub-group response rate of total population Situation 1 gives lift of = 4 Situation 2 gives lift of = 2 Note that which situation is more profitable depends on cost estimates A lift chart allows for a visual comparison Typically, use lift to see how well a classifier is doing compared to base-level (e.g., guessing most common class as in ZeroR). COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
8 Generating a lift chart ROC curves Instances are sorted according to their predicted probability of being a true positive: Rank Predicted probability Actual class Yes Yes No Yes In lift chart, x axis is sample size and y axis is number of true positives ROC curves are related to lift charts ROC stands for receiver operating characteristic Used in signal detection to show tradeoff between hit rate and false alarm rate over noisy channel Differences to lift chart: y axis shows percentage of true positives in sample (rather than absolute number) x axis shows percentage of false positives in sample (rather than sample size) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 A sample ROC curve COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 ROC curves Applications: if the classifier predicts the positive class with an attached value denoting strength of the prediction, for example: probability that the instance is in the target class, e.g., logistic regression, Bayesian classifiers margin by which the instance is in the target class, e.g., ensemble learners, support vector machines if classifiers are learned by the same method with a range of different parameter settings; each outcome (e.g., accuracy) is a point on the ROC curve COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
9 Area under the ROC Curve (AUC) Area under the ROC Curve (AUC) Applicable for any learning method that can generate an ROC curve. Method: test set of P positive and N negative examples sort examples in descending order of strength of being in the positive class, as predicted by the classifier for each positive example p i P, count the number of negative examples ranked strictly below it (count 1 2 if negative example tied with a positive), call this c i sum all these counts, C = P AUC = C P N i=1 c i How should we understand AUC? A natural probabilistic interpretation of this measure: the probability that the classifier will rank a randomly-drawn positive example higher than a randomly-drawn negative one. the AUC will be 1 if all positive examples were ranked above all negative examples (compare with the ROC curve diagram) this is a version of a non-parametric statistical procedure (see Mann-Whitney U-test, or Wilcoxon rank-sum test) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Numeric prediction evaluation measures Numeric prediction evaluation measures Based on differences between predicted (p i ) and actual (a i ) values on a test set of n examples: Mean squared error Root mean squared error (p 1 a 1 ) (p n a n ) 2 n (p 1 a 1 ) (p n a n ) 2 n Mean absolute error p 1 a p n a n n Relative absolute error p 1 a p n a n, where ā = 1 a 1 ā a n ā n plus others, see, e.g., Weka i a i COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
10 Table 2.1, p.52 Predictive machine learning scenarios Classification Task Label space Output space Learning problem Classification L = C Y = C learn an approximation ĉ : X C to the true labelling function c Scoring and ranking L = C Y = R C learn a model that outputs a score vector over classes Probability L = C Y = [0,1] C learn a model that outputs estimation a probability vector over classes Regression L = R Y = R learn an approximation f ˆ : X R to the true labelling function f A classifier is a mapping ĉ : X C, where C = {C 1,C 2,...,C k } is a finite and usually small set of class labels. We will sometimes also use C i to indicate the set of examples of that class. We use the hat to indicate that ĉ(x) is an estimate of the true but unknown function c(x). Examples for a classifier take the form (x,c(x)), where x X is an instance and c(x) is the true class of the instance (sometimes contaminated by noise). Learning a classifier involves constructing the function ĉ such that it matches c as closely as possible (and not just on the training set, but ideally on the entire instance space X ). COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Figure 2.1, p.53 A decision tree Table 2.2, p.54 Contingency table!viagra"!viagra" spam: 20 ham: 40!lottery" =0 =0 =1 spam: 10 ham: 5 =1 spam: 20 ham: 5 #(x) = ham!lottery" =0 =0 =1 #(x) = spam =1 #(x) = spam (left) A feature tree with training set class distribution in the leaves. (right) A decision tree obtained using the majority class decision rule. Predicted Predicted Actual Actual (left) A two-class contingency table or confusion matrix depicting the performance of the decision tree in Figure 2.1. Numbers on the descending diagonal indicate correct predictions, while the ascending diagonal concerns prediction errors. (right) A contingency table with the same marginals but independent rows and columns. COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
11 Example 2.1, p.56 Accuracy as a weighted average Table 2.3, p.57 Performance measures I Suppose a classifier s predictions on a test set are as in the following table: Predicted Predicted Actual Actual From this table, we see that the true positive rate is tpr = 60/75 = 0.80 and the true negative rate is tnr = 15/25 = The overall accuracy is acc = ( )/100 = 0.75, which is no longer the average of true positive and negative rates. However, taking into account the proportion of positives pos = 0.75 and the proportion of negatives neg = 1 pos = 0.25, we see that acc = pos tpr + neg tnr Measure Definition Equal to Estimates number of positives Pos = x Te I [c(x) = ] number of negatives Neg = x Te I [c(x) = ] Te Pos number of true positives TP = x Te I [ĉ(x) = c(x) = ] number of true negatives TN = x Te I [ĉ(x) = c(x) = ] number of false positives FP = x Te I [ĉ(x) =,c(x) = ] Neg TN number of false negatives FN = x Te I [ĉ(x) =,c(x) = ] Pos TP proportion of positives pos = Te 1 x Te I [c(x) = ] Pos/ Te P(c(x) = ) proportion of negatives neg = Te 1 x Te I [c(x) = ] 1 pos P(c(x) = ) class ratio clr = pos/neg Pos/Neg (*) accuracy acc = Te 1 (*) error rate err = Te 1 x Te I [ĉ(x) = c(x)] P(ĉ(x) = c(x)) x Te I [ĉ(x) = c(x)] 1 acc P(ĉ(x) = c(x)) COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 Table 2.3, p.57 Performance measures II Figure 2.2, p.58 A coverage plot Measure Definition Equal to Estimates x Te true positive rate, tpr = I [ĉ(x)=c(x)= ] x Te I [c(x)= ] TP/Pos P(ĉ(x) = c(x) = ) sensitivity, recall x Te true negative rate, tnr = I [ĉ(x)=c(x)=] x Te I [c(x)=] TN/Neg P(ĉ(x) = c(x) = ) specificity x Te false positive rate, fpr = I [ĉ(x)=,c(x)=] x Te I [c(x)=] FP/Neg = 1 tnr P(ĉ(x) = c(x) = ) false alarm rate x Te I [ĉ(x)=,c(x)= ] false negative rate fnr = x Te I [c(x)= ] FN/Pos = 1 tpr P(ĉ(x) = c(x) = ) x Te precision, confidence prec = I [ĉ(x)=c(x)= ] x Te I [ĉ(x)= ] TP/(TP + FP) P(c(x) = ĉ(x) = ) Positives 0 TP2 TP1 Pos C1 C2 0 FP1 FP2 Neg Negatives Positives 0 TP3 Pos C3 0 FP3 Neg Negatives Table: A summary of different quantities and evaluation measures for classifiers on a test set Te. Symbols starting with a capital letter denote absolute frequencies (counts), while lower-case symbols denote relative frequencies or ratios. All except those indicated with (*) are defined only for binary classification. (left) A coverage plot depicting the two contingency tables in Table 2.2. The plot is square because the class distribution is uniform. (right) Coverage plot for Example 2.1, with a class ratio clr = 3. COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
12 Figure 2.3, p.59 An ROC plot Important points to remember Positives 0 TP2 TP1 TP3 Pos C3 C1 C2 0 FP1 FP2-3 Neg Negatives True positive rate 0 tpr2 tpr1 tpr3 1 C3 C1 C2 0 fpr1 fpr2-3 1 False positive rate In a coverage plot, classifiers with the same accuracy are connected by line segments with slope 1. In a ROC plot (i.e., a normalised coverage plot), line segments with slope 1 connect classifiers with the same average recall. (left) C1 and C3 both dominate C2, but neither dominates the other. The diagonal line indicates that C1 and C3 achieve equal accuracy. (right) The same plot with normalised axes. We can interpret this plot as a merger of the two coverage plots in Figure 2.2, employing normalisation to deal with the different class distributions. The diagonal line now indicates that C1 and C3 have the same average recall. COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / Summary Figure 2.4, p.61 Comparing coverage and ROC plots Summary Positives 0 TP1 TP2-3 Pos C2 C3 C1 0 FP1 FP2 FP3 Neg Negatives Positives 0 tpr1 tpr2-3 Pos C2 C3 C1 0 fpr1 fpr2 fpr3 Neg Negatives (left) In a coverage plot, accuracy isometrics have a slope of 1, and average recall isometrics are parallel to the ascending diagonal. (right) In the corresponding ROC plot, average recall isometrics have a slope of 1; the accuracy isometric here has a slope of 3, corresponding to the ratio of negatives to positives in the data set. Evaluation for machine learning and data mining is a complex issue Some commonly used methods have been found to work well in practice fold cross-validation corrected resampled t-test (Weka) Issues to be aware of multiple testing: from a large set of hypotheses, some will appear good at random does the off-training distribution match that of the training set? COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49 COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
13 5. Summary Summary A cautionary tale... In a WW2 survey carried out by RAF Bomber Command all bombers returning from bombing raids over Germany over a particular period were inspected. All damage inflicted by German air defences was noted and an initial recommendation was given that armour be added in the most heavily damaged areas. Further analysis instead made the surprising and counter-intuitive recommendation that the armour be placed in the areas which were completely untouched by damage. The reasoning was that the survey was biased, since it only included aircraft that successfully came back from Germany. The untouched areas were probably vital areas, which, if hit, would result in the loss of the aircraft. COMP9417 ML & DM (CSE, UNSW) Evaluation in Machine Learning (1) March 19, / 49
Evaluation & Validation: Credibility: Evaluating what has been learned
Evaluation & Validation: Credibility: Evaluating what has been learned How predictive is a learned model? How can we evaluate a model Test the model Statistical tests Considerations in evaluating a Model
More informationPerformance Measures in Data Mining
Performance Measures in Data Mining Common Performance Measures used in Data Mining and Machine Learning Approaches L. Richter J.M. Cejuela Department of Computer Science Technische Universität München
More informationData Mining Practical Machine Learning Tools and Techniques
Credibility: Evaluating what s been learned Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 5 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Issues: training, testing,
More informationOverview. Evaluation Connectionist and Statistical Language Processing. Test and Validation Set. Training and Test Set
Overview Evaluation Connectionist and Statistical Language Processing Frank Keller keller@coli.uni-sb.de Computerlinguistik Universität des Saarlandes training set, validation set, test set holdout, stratification
More informationPerformance Measures for Machine Learning
Performance Measures for Machine Learning 1 Performance Measures Accuracy Weighted (Cost-Sensitive) Accuracy Lift Precision/Recall F Break Even Point ROC ROC Area 2 Accuracy Target: 0/1, -1/+1, True/False,
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Lecture 15 - ROC, AUC & Lift Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-17-AUC
More informationData Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
More informationPerformance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com
More informationSupervised Learning (Big Data Analytics)
Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationChapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
More informationA General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions Jing Gao Wei Fan Jiawei Han Philip S. Yu University of Illinois at Urbana-Champaign IBM T. J. Watson Research Center
More informationT-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577
T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or
More informationPerformance Metrics. number of mistakes total number of observations. err = p.1/1
p.1/1 Performance Metrics The simplest performance metric is the model error defined as the number of mistakes the model makes on a data set divided by the number of observations in the data set, err =
More informationArtificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier
International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing
More informationData Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
More informationExperiments in Web Page Classification for Semantic Web
Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address
More informationData Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
More informationKeywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.
International Journal of Emerging Research in Management &Technology Research Article October 2015 Comparative Study of Various Decision Tree Classification Algorithm Using WEKA Purva Sewaiwar, Kamal Kant
More information1. Classification problems
Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationData Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
More informationHealth Care and Life Sciences
Sensitivity, Specificity, Accuracy, Associated Confidence Interval and ROC Analysis with Practical SAS Implementations Wen Zhu 1, Nancy Zeng 2, Ning Wang 2 1 K&L consulting services, Inc, Fort Washington,
More informationAnalysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News
Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News Sushilkumar Kalmegh Associate Professor, Department of Computer Science, Sant Gadge Baba Amravati
More informationEnsemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification
More informationMining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
More informationIntroduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk
Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Ensembles 2 Learning Ensembles Learn multiple alternative definitions of a concept using different training
More informationPractical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
More informationROC Curve, Lift Chart and Calibration Plot
Metodološki zvezki, Vol. 3, No. 1, 26, 89-18 ROC Curve, Lift Chart and Calibration Plot Miha Vuk 1, Tomaž Curk 2 Abstract This paper presents ROC curve, lift chart and calibration plot, three well known
More informationPerformance Metrics for Graph Mining Tasks
Performance Metrics for Graph Mining Tasks 1 Outline Introduction to Performance Metrics Supervised Learning Performance Metrics Unsupervised Learning Performance Metrics Optimizing Metrics Statistical
More informationBusiness Case Development for Credit and Debit Card Fraud Re- Scoring Models
Business Case Development for Credit and Debit Card Fraud Re- Scoring Models Kurt Gutzmann Managing Director & Chief ScienAst GCX Advanced Analy.cs LLC www.gcxanalyacs.com October 20, 2011 www.gcxanalyacs.com
More informationSTATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
More informationMACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
More informationDecision Trees from large Databases: SLIQ
Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right
More informationIdentifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100
Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three
More informationCI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.
CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes
More informationBoosting. riedmiller@informatik.uni-freiburg.de
. Machine Learning Boosting Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationBIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an
More informationNon-Parametric Tests (I)
Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent
More informationData Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product
Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago sagarikaprusty@gmail.com Keywords:
More informationMachine Learning. Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos)
Machine Learning Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos) What Is Machine Learning? A computer program is said to learn from experience E with respect to some class of
More informationW6.B.1. FAQs CS535 BIG DATA W6.B.3. 4. If the distance of the point is additionally less than the tight distance T 2, remove it from the original set
http://wwwcscolostateedu/~cs535 W6B W6B2 CS535 BIG DAA FAQs Please prepare for the last minute rush Store your output files safely Partial score will be given for the output from less than 50GB input Computer
More informationProjects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
More informationDiscovering process models from empirical data
Discovering process models from empirical data Laura Măruşter (l.maruster@tm.tue.nl), Ton Weijters (a.j.m.m.weijters@tm.tue.nl) and Wil van der Aalst (w.m.p.aalst@tm.tue.nl) Eindhoven University of Technology,
More informationHow To Solve The Class Imbalance Problem In Data Mining
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART C: APPLICATIONS AND REVIEWS 1 A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches Mikel Galar,
More informationData Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation. Lecture Notes for Chapter 4. Introduction to Data Mining
Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation Lecture Notes for Chapter 4 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data
More informationMachine Learning Final Project Spam Email Filtering
Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE
More informationQuality and Complexity Measures for Data Linkage and Deduplication
Quality and Complexity Measures for Data Linkage and Deduplication Peter Christen and Karl Goiser Department of Computer Science, Australian National University, Canberra ACT 0200, Australia {peter.christen,karl.goiser}@anu.edu.au
More informationIntroduction to Learning & Decision Trees
Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing
More informationALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite
ALGEBRA Pupils should be taught to: Generate and describe sequences As outcomes, Year 7 pupils should, for example: Use, read and write, spelling correctly: sequence, term, nth term, consecutive, rule,
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for
More informationReview of Fundamental Mathematics
Review of Fundamental Mathematics As explained in the Preface and in Chapter 1 of your textbook, managerial economics applies microeconomic theory to business decision making. The decision-making tools
More informationGamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
More informationAn Approach to Detect Spam Emails by Using Majority Voting
An Approach to Detect Spam Emails by Using Majority Voting Roohi Hussain Department of Computer Engineering, National University of Science and Technology, H-12 Islamabad, Pakistan Usman Qamar Faculty,
More informationNonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General
More informationHow To Understand The Impact Of A Computer On Organization
International Journal of Research in Engineering & Technology (IJRET) Vol. 1, Issue 1, June 2013, 1-6 Impact Journals IMPACT OF COMPUTER ON ORGANIZATION A. D. BHOSALE 1 & MARATHE DAGADU MITHARAM 2 1 Department
More informationThe Relationship Between Precision-Recall and ROC Curves
Jesse Davis jdavis@cs.wisc.edu Mark Goadrich richm@cs.wisc.edu Department of Computer Sciences and Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 2 West Dayton Street,
More informationBetter credit models benefit us all
Better credit models benefit us all Agenda Credit Scoring - Overview Random Forest - Overview Random Forest outperform logistic regression for credit scoring out of the box Interaction term hypothesis
More informationUsing Random Forest to Learn Imbalanced Data
Using Random Forest to Learn Imbalanced Data Chao Chen, chenchao@stat.berkeley.edu Department of Statistics,UC Berkeley Andy Liaw, andy liaw@merck.com Biometrics Research,Merck Research Labs Leo Breiman,
More information11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression
Frank C Porter and Ilya Narsky: Statistical Analysis Techniques in Particle Physics Chap. c11 2013/9/9 page 221 le-tex 221 11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial
More informationCOMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationModel Combination. 24 Novembre 2009
Model Combination 24 Novembre 2009 Datamining 1 2009-2010 Plan 1 Principles of model combination 2 Resampling methods Bagging Random Forests Boosting 3 Hybrid methods Stacking Generic algorithm for mulistrategy
More informationSpyware Prevention by Classifying End User License Agreements
Spyware Prevention by Classifying End User License Agreements Niklas Lavesson, Paul Davidsson, Martin Boldt, and Andreas Jacobsson Blekinge Institute of Technology, Dept. of Software and Systems Engineering,
More informationData mining techniques: decision trees
Data mining techniques: decision trees 1/39 Agenda Rule systems Building rule systems vs rule systems Quick reference 2/39 1 Agenda Rule systems Building rule systems vs rule systems Quick reference 3/39
More informationMethods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL
Paper SA01-2012 Methods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL ABSTRACT Analysts typically consider combinations
More information!"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"
!"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"!"#"$%&#'()*+',$$-.&#',/"-0%.12'32./4'5,5'6/%&)$).2&'7./&)8'5,5'9/2%.%3%&8':")08';:
More informationJournal of Engineering Science and Technology Review 7 (4) (2014) 89-96
Jestr Journal of Engineering Science and Technology Review 7 (4) (2014) 89-96 JOURNAL OF Engineering Science and Technology Review www.jestr.org Applying Fuzzy Logic and Data Mining Techniques in Wireless
More informationActive Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
More informationPREDICTING SUCCESS IN THE COMPUTER SCIENCE DEGREE USING ROC ANALYSIS
PREDICTING SUCCESS IN THE COMPUTER SCIENCE DEGREE USING ROC ANALYSIS Arturo Fornés arforser@fiv.upv.es, José A. Conejero aconejero@mat.upv.es 1, Antonio Molina amolina@dsic.upv.es, Antonio Pérez aperez@upvnet.upv.es,
More informationReview Jeopardy. Blue vs. Orange. Review Jeopardy
Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?
More informationDynamic Predictive Modeling in Claims Management - Is it a Game Changer?
Dynamic Predictive Modeling in Claims Management - Is it a Game Changer? Anil Joshi Alan Josefsek Bob Mattison Anil Joshi is the President and CEO of AnalyticsPlus, Inc. (www.analyticsplus.com)- a Chicago
More informationAdvanced analytics at your hands
2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously
More informationSTANDARDISATION AND CLASSIFICATION OF ALERTS GENERATED BY INTRUSION DETECTION SYSTEMS
STANDARDISATION AND CLASSIFICATION OF ALERTS GENERATED BY INTRUSION DETECTION SYSTEMS Athira A B 1 and Vinod Pathari 2 1 Department of Computer Engineering,National Institute Of Technology Calicut, India
More informationData Mining for Knowledge Management. Classification
1 Data Mining for Knowledge Management Classification Themis Palpanas University of Trento http://disi.unitn.eu/~themis Data Mining for Knowledge Management 1 Thanks for slides to: Jiawei Han Eamonn Keogh
More informationMeasurement in ediscovery
Measurement in ediscovery A Technical White Paper Herbert Roitblat, Ph.D. CTO, Chief Scientist Measurement in ediscovery From an information-science perspective, ediscovery is about separating the responsive
More informationMining Life Insurance Data for Customer Attrition Analysis
Mining Life Insurance Data for Customer Attrition Analysis T. L. Oshini Goonetilleke Informatics Institute of Technology/Department of Computing, Colombo, Sri Lanka Email: oshini.g@iit.ac.lk H. A. Caldera
More informationData Mining Methods: Applications for Institutional Research
Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014
More informationMachine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
More informationNonparametric statistics and model selection
Chapter 5 Nonparametric statistics and model selection In Chapter, we learned about the t-test and its variations. These were designed to compare sample means, and relied heavily on assumptions of normality.
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationIntroduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
More informationASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS
DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.
More informationTwo Correlated Proportions (McNemar Test)
Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with
More informationA Two-Pass Statistical Approach for Automatic Personalized Spam Filtering
A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationData Mining Techniques for Prognosis in Pancreatic Cancer
Data Mining Techniques for Prognosis in Pancreatic Cancer by Stuart Floyd A Thesis Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUE In partial fulfillment of the requirements for the Degree
More informationMAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS
MAXIMIZING RETURN ON DIRET MARKETING AMPAIGNS IN OMMERIAL BANKING S 229 Project: Final Report Oleksandra Onosova INTRODUTION Recent innovations in cloud computing and unified communications have made a
More informationData Mining - The Next Mining Boom?
Howard Ong Principal Consultant Aurora Consulting Pty Ltd Abstract This paper introduces Data Mining to its audience by explaining Data Mining in the context of Corporate and Business Intelligence Reporting.
More informationHow To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationQuestion 2 Naïve Bayes (16 points)
Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the
More informationLeveraging Ensemble Models in SAS Enterprise Miner
ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to
More informationROC Graphs: Notes and Practical Considerations for Data Mining Researchers
ROC Graphs: Notes and Practical Considerations for Data Mining Researchers Tom Fawcett Intelligent Enterprise Technologies Laboratory HP Laboratories Palo Alto HPL-23-4 January 7 th, 23* E-mail: tom_fawcett@hp.com
More information2 Decision tree + Cross-validation with R (package rpart)
1 Subject Using cross-validation for the performance evaluation of decision trees with R, KNIME and RAPIDMINER. This paper takes one of our old study on the implementation of cross-validation for assessing
More information