Classification 1. Jun Du The University of Western Ontario


1 Classification 1 Jun Du The University of Western Ontario jdu43@uwo.ca

2 Outline
- Supervised Learning: Classification vs Regression
- Decision Tree (Classification)
- Naïve Bayes (Classification)
- Instance-Based Classifiers --- KNN (Classification)

3 Supervised Learning
- Given a collection of examples/instances (the training set).
- Each example contains a set of attributes/features; one of the attributes is the class/label.
- Find a model for the class attribute as a function of the values of the other attributes.
- Goal: previously unseen examples should be assigned a class as accurately as possible.
- A test set is used to determine the accuracy of the model. The given data set is divided into training and test sets: the training set is used to build the model and the test set to validate it.

4 Illustrating Supervised Learning Task

Training Set:
Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set:
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?

Flow: Training Set -> Learning algorithm -> Learn Model (Induction) -> Model -> Apply Model (Deduction) -> Test Set

5 2 Types of Supervised Learning
There are usually 2 types of supervised learning tasks:
- Classification: to predict a discrete / nominal value
- Regression: to predict a continuous / numeric value
Although the difference may seem minor, the models (and the techniques used to build them) are quite different.

6 Recall Illustration
What makes the difference between classification and regression in the illustration?
(Figure: the training set, test set, and induction/deduction flow from slide 4.)

7 Examples of Different Tasks
- Predict tumor cells as benign or malignant
- Predict tumor size
- Predict credit card transactions as legitimate or fraudulent
- Predict credit score
- Predict whether a cell phone customer will switch to another telecommunications company
- Predict how much profit a cell phone customer can bring

8 Algorithms
Classification:
- Decision Tree
- Naïve Bayes
- K Nearest Neighbor
- Neural Networks
- Support Vector Machines
- Ensemble Methods
Regression:
- Linear Regression


10 Example of Decision Tree

Training Data:
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Model: Decision Tree (splitting attributes: Refund, MarSt, TaxInc)
Refund?
  Yes -> NO
  No  -> MarSt?
           Married          -> NO
           Single, Divorced -> TaxInc?
                                 < 80K -> NO
                                 > 80K -> YES

11 Another Example
A different tree for the same training data (Tid 1-10 on slide 10):
MarSt?
  Married          -> NO
  Single, Divorced -> Refund?
                        Yes -> NO
                        No  -> TaxInc?
                                 < 80K -> NO
                                 > 80K -> YES
There could be more than one tree that fits the same data!

12 Decision Tree Classification Task
(Figure: the same training set, test set, and flow as on slide 4, with a tree induction algorithm as the learning algorithm and a decision tree as the model.)

13-18 Apply Model to Test Data (1)-(6)
Basic idea: start from the root of the tree, follow the corresponding branches, and reach an external (leaf) node.

Test Data:
Refund  Marital Status  Taxable Income  Cheat
No      Married         80K             ?

Model:
Refund?
  Yes -> NO
  No  -> MarSt?
           Married          -> NO
           Single, Divorced -> TaxInc?
                                 < 80K -> NO
                                 > 80K -> YES

Steps: Refund = No, so follow the No branch to MarSt; Marital Status = Married, so follow the Married branch to the leaf NO. Assign Cheat to "No".
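
The root-to-leaf walk described on these slides can be sketched in a few lines of Python. This is only an illustration: the nested-dictionary encoding, the `income_bin` helper, and the choice to send exactly 80K to the upper branch are assumptions, not something the slides specify.

```python
# Hypothetical encoding of the slide's tree: an internal node is a dict with an
# attribute name and its branches; a leaf is just the predicted Cheat label.
TAXINC = {"attribute": "TaxInc", "branches": {"< 80K": "No", ">= 80K": "Yes"}}
TREE = {
    "attribute": "Refund",
    "branches": {
        "Yes": "No",
        "No": {
            "attribute": "MarSt",
            "branches": {"Married": "No", "Single": TAXINC, "Divorced": TAXINC},
        },
    },
}

def income_bin(income_k):
    """Map taxable income (in thousands) to a TaxInc branch label.
    Assumption: a value of exactly 80K goes to the upper branch."""
    return "< 80K" if income_k < 80 else ">= 80K"

def classify(node, record):
    """Start at the root and follow matching branches until a leaf is reached."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node

test_record = {"Refund": "No", "MarSt": "Married", "TaxInc": income_bin(80)}
print(classify(TREE, test_record))  # No
```

For the test record the walk never reaches the income test, so the treatment of the 80K boundary does not affect the result.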


20 Decision Tree Induction
Many algorithms:
- ID3
- C4.5
- C5
- CART

21 Basic Algorithm
The tree is grown step by step on the training data of slide 10 (Y / N = number of Cheat = Yes / Cheat = No examples at a node):
- Step 1: a single leaf, Don't Cheat (Y: 3, N: 7).
- Step 2: split on Refund. Yes -> Don't Cheat (Y: 0, N: 3); No -> Don't Cheat (Y: 3, N: 4).
- Step 3: under Refund = No, split on Marital Status. Single, Divorced -> Cheat (Y: 3, N: 1); Married -> Don't Cheat (Y: 0, N: 3).
- Step 4: under Single, Divorced, split on Taxable Income. < 80K -> Don't Cheat; >= 80K -> Cheat.

22 Tree Induction
- Key issue: determine which feature to select to build the tree.
- Short answer: design a tree splitting criterion, then select the feature that best meets the criterion to expand the tree.
- Greedy strategy: based on a certain criterion, greedily select features to split the examples.

23 Which Attribute to Select -- Intuition
Before splitting: 10 examples of class 0 and 10 examples of class 1.
Which split is better? Why?

24 Splitting Criterion
- Greedy approach: nodes with a homogeneous / pure class distribution are preferred.
- Need a measure of node purity:
  C0: 5, C1: 5 -> non-homogeneous, low degree of purity
  C0: 9, C1: 1 -> homogeneous, high degree of purity

25 Which Attribute to Select --- Intuition (figure)

26 Measures of Node Purity
- Entropy (ID3 / C4.5)
  - Information Gain
  - Gain Ratio
- Gini Index (CART)
- Misclassification Error
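
The slides go on to work through entropy in detail but do not spell out the other two measures. As a rough sketch, the following uses the standard textbook definitions (Gini = 1 - sum of squared class proportions; misclassification error = 1 - largest class proportion) on the two example nodes from the Splitting Criterion slide. The function names are illustrative.

```python
def gini(counts):
    """Gini index of a node, given its per-class example counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def misclassification_error(counts):
    """Fraction of examples that do not belong to the majority class."""
    total = sum(counts)
    return 1.0 - max(counts) / total

# Node class counts taken from the Splitting Criterion slide.
for counts in ([5, 5], [9, 1]):
    print(counts,
          "gini =", round(gini(counts), 3),
          "error =", round(misclassification_error(counts), 3))
```

Both measures are lower for the purer node (C0: 9, C1: 1) than for the evenly mixed one.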

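27 Computing Entropy (1)
Formula for computing the entropy (logarithms are base 2):
$\mathrm{entropy}(p_1, p_2, \ldots, p_n) = -p_1 \log_2 p_1 - p_2 \log_2 p_2 - \cdots - p_n \log_2 p_n$
Example: p(yes) = 0.5; p(no) = 0.5
$\mathrm{entropy}(p(\mathrm{yes}), p(\mathrm{no})) = -0.5 \log_2 0.5 - 0.5 \log_2 0.5 = 1$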

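28 Computing Entropy (2)
Formula for computing the entropy (logarithms are base 2):
$\mathrm{entropy}(p_1, p_2, \ldots, p_n) = -p_1 \log_2 p_1 - p_2 \log_2 p_2 - \cdots - p_n \log_2 p_n$
Example: p(yes) = 1; p(no) = 0
$\mathrm{entropy}(p(\mathrm{yes}), p(\mathrm{no})) = -1 \log_2 1 - 0 \log_2 0 = 0$ (taking $0 \log_2 0 = 0$)
Pure node -> low entropy -> good; impure node -> high entropy -> bad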
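
A minimal Python sketch of the entropy formula, reproducing the two worked cases from these slides. The function name and the handling of zero probabilities (treated as contributing 0) are choices made for this illustration.

```python
import math

def entropy(*probs):
    """Base-2 entropy of a discrete distribution.
    Terms with p = 0 are skipped, i.e. 0 * log(0) is treated as 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy(0.5, 0.5))  # 1.0
print(entropy(1, 0))      # 0.0
```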

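29 Example: Attribute Outlook
Outlook = Sunny: $\mathrm{info}([2,3]) = \mathrm{entropy}(2/5, 3/5) = 0.971$
Outlook = Overcast: $\mathrm{info}([4,0]) = \mathrm{entropy}(1, 0) = 0$
Outlook = Rainy: $\mathrm{info}([3,2]) = \mathrm{entropy}(3/5, 2/5) = 0.971$
Expected info for Outlook (weighted sum):
$\mathrm{info}([2,3],[4,0],[3,2]) = (5/14) \times 0.971 + (4/14) \times 0 + (5/14) \times 0.971 = 0.693$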

30 Computing information gain
Information gain = (information before split) - (information after split)
$\mathrm{gain}(\text{Outlook}) = \mathrm{info}([9,5]) - \mathrm{info}([2,3],[4,0],[3,2]) = 0.940 - 0.693 = 0.247$
Intuitively, information gain measures how much information is gained by selecting the corresponding attribute to split the data.
High information gain -> good; low information gain -> bad

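31 Example: Attribute Humidity
Humidity = High: $\mathrm{info}([3,4]) = \mathrm{entropy}(3/7, 4/7) = 0.985$
Humidity = Normal: $\mathrm{info}([6,1]) = \mathrm{entropy}(6/7, 1/7) = 0.592$
Expected information for Humidity:
$\mathrm{info}([3,4],[6,1]) = (7/14) \times 0.985 + (7/14) \times 0.592 = 0.788$
Information gain:
$\mathrm{info}([9,5]) - \mathrm{info}([3,4],[6,1]) = 0.940 - 0.788 = 0.152$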

32 Select the Attribute
Information gain for all attributes from the weather data:
gain("Outlook") = 0.247
gain("Humidity") = 0.152
gain("Temperature") = 0.029
gain("Windy") = 0.048
Outlook has the highest gain and is selected to build the tree. What's next?
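
The gains can be checked with a short Python sketch. It uses the 14-example weather table as transcribed on the "Not Done Yet" slide; the helper names `info` and `gain` mirror the slide notation but are otherwise illustrative.

```python
import math
from collections import Counter

# Weather data: ID, outlook, temperature, humidity, windy, play
DATA = """
A sunny hot high false No
B sunny hot high true No
C overcast hot high false Yes
D rain mild high false Yes
E rain cool normal false Yes
F rain cool normal true No
G overcast cool normal true Yes
H sunny mild high false No
I sunny cool normal false Yes
J rain mild normal false Yes
K sunny mild normal true Yes
L overcast mild high true Yes
M overcast hot normal false Yes
N rain mild high true No
"""
ROWS = [tuple(line.split()) for line in DATA.strip().splitlines()]
COLUMNS = {"outlook": 1, "temperature": 2, "humidity": 3, "windy": 4}
PLAY = 5  # index of the class column

def info(labels):
    """Entropy (base 2) of the class distribution in a list of labels."""
    n = len(labels)
    return sum(-c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, col):
    """Information before the split minus the weighted information after it."""
    branches = {}
    for row in rows:
        branches.setdefault(row[col], []).append(row[PLAY])
    after = sum(len(b) / len(rows) * info(b) for b in branches.values())
    return info([row[PLAY] for row in rows]) - after

gains = {name: gain(ROWS, col) for name, col in COLUMNS.items()}
for name, value in gains.items():
    print(f"gain({name}) = {value:.3f}")
print("selected:", max(gains, key=gains.get))
```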

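33 Continuing to split
For each branch of the Outlook split, recompute gain("Temperature"), gain("Windy"), and gain("Humidity") on the examples in that branch, and split on the attribute with the highest gain.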

34 Final Decision Tree
Splitting stops when the data can't be split any further:
- All examples in the same node belong to the same class
- No new attribute can be used to split the data (when could this happen?)
- (or early termination; to be discussed later)
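
Putting the greedy selection and the stopping rules together, an ID3-style induction loop can be sketched as below. This is a simplified illustration for nominal attributes only, not the exact algorithm of any particular tool. It uses the weather table from the "Not Done Yet" slide, and the majority-class fallback when no attribute is left is an assumption.

```python
import math
from collections import Counter

# Weather data: ID, outlook, temperature, humidity, windy, play
DATA = """
A sunny hot high false No
B sunny hot high true No
C overcast hot high false Yes
D rain mild high false Yes
E rain cool normal false Yes
F rain cool normal true No
G overcast cool normal true Yes
H sunny mild high false No
I sunny cool normal false Yes
J rain mild normal false Yes
K sunny mild normal true Yes
L overcast mild high true Yes
M overcast hot normal false Yes
N rain mild high true No
"""
ROWS = [tuple(line.split()) for line in DATA.strip().splitlines()]
COLUMNS = {"outlook": 1, "temperature": 2, "humidity": 3, "windy": 4}
PLAY = 5  # index of the class column

def info(labels):
    """Entropy (base 2) of the class distribution in a list of labels."""
    n = len(labels)
    return sum(-c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, col):
    """Information before the split minus the weighted information after it."""
    branches = {}
    for row in rows:
        branches.setdefault(row[col], []).append(row[PLAY])
    after = sum(len(b) / len(rows) * info(b) for b in branches.values())
    return info([row[PLAY] for row in rows]) - after

def build(rows, attrs):
    """Return a class label (leaf) or a pair (attribute, {value: subtree})."""
    labels = [row[PLAY] for row in rows]
    if len(set(labels)) == 1:      # stop: all examples belong to the same class
        return labels[0]
    if not attrs:                  # stop: no attribute left (assumed: majority class)
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, COLUMNS[a]))   # greedy choice
    rest = [a for a in attrs if a != best]
    col = COLUMNS[best]
    return (best, {value: build([r for r in rows if r[col] == value], rest)
                   for value in sorted({r[col] for r in rows})})

tree = build(ROWS, list(COLUMNS))
print(tree)
```

On this data the sketch splits on outlook first, then on humidity in the sunny branch and on windy in the rain branch; the overcast branch is already pure.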

35 Not Done Yet
Example:
ID  Outlook   Temperature  Humidity  Windy  Play?
A   sunny     hot          high      false  No
B   sunny     hot          high      true   No
C   overcast  hot          high      false  Yes
D   rain      mild         high      false  Yes
E   rain      cool         normal    false  Yes
F   rain      cool         normal    true   No
G   overcast  cool         normal    true   Yes
H   sunny     mild         high      false  No
I   sunny     cool         normal    false  Yes
J   rain      mild         normal    false  Yes
K   sunny     mild         normal    true   Yes
L   overcast  mild         high      true   Yes
M   overcast  hot          normal    false  Yes
N   rain      mild         high      true   No

36 Split for ID Code Attribute
- Entropy of each leaf node = 0, since each leaf node is pure (it holds only one case).
- Information gain is therefore maximized for ID Code.
- The final tree is a single split on ID Code with one leaf per example. Is it good? Why?

37 Limitation of Information Gain
- Problematic: attributes with a large number of values (extreme case: ID code)
- Subsets are more likely to be pure if there is a large number of values
- Information gain is biased towards choosing attributes with a large number of values
- Solution: Gain Ratio

38 Gain Ratio
- Gain ratio: a modification of the information gain that reduces its bias towards attributes with many branches
- Gain ratio takes the number and size of branches into account when choosing an attribute
- It corrects the information gain by taking the Split-Info of a split into account

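39 Computing Intrinsic Information
Split-Info: entropy of the distribution of instances into branches.
Example: ID Code. Split_Info for ID code:
$\mathrm{split\_info}(\text{ID\_code}) = \mathrm{entropy}(\tfrac{1}{14}, \tfrac{1}{14}, \ldots, \tfrac{1}{14}) = 14 \times \left(-\tfrac{1}{14} \log_2 \tfrac{1}{14}\right) = \log_2 14 \approx 3.807$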

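40 Computing Gain Ratio
Gain ratio formula:
$\mathrm{gain\_ratio}(\text{Attribute}) = \dfrac{\mathrm{gain}(\text{Attribute})}{\mathrm{split\_info}(\text{Attribute})}$
Example: ID Code. Gain ratio for ID code:
$\mathrm{gain\_ratio}(\text{ID\_code}) = \dfrac{\mathrm{gain}(\text{ID\_code})}{\mathrm{split\_info}(\text{ID\_code})} = \dfrac{0.940}{3.807} \approx 0.247$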

41 Gain Ratios for Weather Data
Attribute    Info   Gain                   Split info              Gain ratio
Outlook      0.693  0.940 - 0.693 = 0.247  info([5,4,5]) = 1.577   0.247/1.577 = 0.157
Temperature  0.911  0.940 - 0.911 = 0.029  info([4,6,4]) = 1.557   0.029/1.557 = 0.019
Humidity     0.788  0.940 - 0.788 = 0.152  info([7,7]) = 1.000     0.152/1.000 = 0.152
Windy        0.892  0.940 - 0.892 = 0.048  info([8,6]) = 0.985     0.048/0.985 = 0.049
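
The gain ratios, including the ID code case, can be recomputed with the sketch below. It again assumes the weather table from the "Not Done Yet" slide; `split_info` and `gain_ratio` follow the slide definitions, and the remaining names are illustrative.

```python
import math
from collections import Counter

# Weather data: ID, outlook, temperature, humidity, windy, play
DATA = """
A sunny hot high false No
B sunny hot high true No
C overcast hot high false Yes
D rain mild high false Yes
E rain cool normal false Yes
F rain cool normal true No
G overcast cool normal true Yes
H sunny mild high false No
I sunny cool normal false Yes
J rain mild normal false Yes
K sunny mild normal true Yes
L overcast mild high true Yes
M overcast hot normal false Yes
N rain mild high true No
"""
ROWS = [tuple(line.split()) for line in DATA.strip().splitlines()]
ATTRS = ["ID", "outlook", "temperature", "humidity", "windy"]  # columns 0..4
PLAY = 5  # index of the class column

def info(values):
    """Entropy (base 2) of the empirical distribution of a list of values."""
    n = len(values)
    return sum(-c / n * math.log2(c / n) for c in Counter(values).values())

def gain(rows, col):
    """Information before the split minus the weighted information after it."""
    branches = {}
    for row in rows:
        branches.setdefault(row[col], []).append(row[PLAY])
    after = sum(len(b) / len(rows) * info(b) for b in branches.values())
    return info([row[PLAY] for row in rows]) - after

def split_info(rows, col):
    """Entropy of the distribution of instances into branches."""
    return info([row[col] for row in rows])

def gain_ratio(rows, col):
    return gain(rows, col) / split_info(rows, col)

ratios = {name: gain_ratio(ROWS, col) for col, name in enumerate(ATTRS)}
for name, value in ratios.items():
    print(f"gain_ratio({name}) = {value:.3f}")
```

Because the code keeps full precision, the last printed digit can differ slightly from hand calculations that divide already-rounded values. The ranking is unaffected: ID code still has the largest gain ratio, followed by Outlook.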

42 More on Gain Ratio
- Outlook comes out top (among the 4 attributes)
- However, ID code still has a greater gain ratio
- Gain ratio alleviates the bias, but would still select ID code
- Standard fix: an ad hoc test to prevent splitting on that type of attribute

43 Recall Measures of Node Purity
- Entropy (ID3 / C4.5)
  - Information Gain
  - Gain Ratio
- Gini Index (CART)
- Misclassification Error

44 Comparison among Splitting Criteria
For a 2-class problem: (figure)

45 Summary So Far
- We know what a decision tree model is.
- We know how to make predictions by using a decision tree model.
- We know how to build a decision tree model (based on different splitting criteria):
  - Information gain
  - Gain ratio

46 Questions
- Is it possible for a decision tree to have 100% accuracy on the training data?
  - If it is possible, in what situations will it happen, and in what situations won't it?
  - If it is not possible, why?
- Is it good for a predictive model?

47 Recall: When to Stop Splitting?
Splitting stops when the data can't be split any further:
- All examples in the same node belong to the same class
- No new attribute can be used to split the data
Overfitting! A tree grown this far overfits the training data (including noise) and won't perform well in predicting new data.

48 Pre-Pruning (Early Stopping)
Stop the algorithm before it becomes a fully-grown tree. More restrictive stopping conditions:
- Stop if the number of instances is less than some user-specified threshold
- Stop if expanding the current node does not improve impurity measures (e.g., Gini or information gain)
- Other statistical methods (e.g., using a χ² test)
In practice, pre-pruning is usually not preferred, because it may stop too early.

49 Post-pruning
- Grow the decision tree to its entirety
- Trim the nodes of the decision tree in a bottom-up fashion
- The class label of a leaf node is determined from the majority class of its instances
- Different strategies exist to prune the tree (not discussed)
- Post-pruning usually has better predictive performance, but is more complex and time-consuming.

50 Summary on Decision Tree
Other practical issues:
- Handle numeric values
- Handle missing values
Advantages:
- Simple
- Inexpensive to construct
- Extremely fast at classifying unknown records
- Easy to interpret for small-sized trees
- Accuracy is comparable to other classification techniques for many simple data sets

51 Demonstration
Id3:
- \Weka\contact_lenses
- Id3 limitations
J48:
- \Weka\contact_lenses (vs Id3)
- \UCI\autos (numeric value, missing value)
- \UCI\kr-vs-kp (vs Id3)
- \UCI\splice (vs Id3)


53 Bayesian Classifier
A probabilistic framework for solving classification problems, based on Bayes theorem.
Recall Bayes theorem:
Conditional probability: $P(C \mid A) = \dfrac{P(A, C)}{P(A)}$; $P(A \mid C) = \dfrac{P(A, C)}{P(C)}$
Bayes theorem: $P(C \mid A) = \dfrac{P(A \mid C)\, P(C)}{P(A)}$

54 Example of Bayes Theorem
Given:
- A doctor knows that meningitis causes stiff neck 50% of the time
- The prior probability of any patient having meningitis is 1/50,000
- The prior probability of any patient having stiff neck is 1/20
If a patient has stiff neck, what's the probability he/she has meningitis?
$P(M \mid S) = \dfrac{P(S \mid M)\, P(M)}{P(S)} = \dfrac{0.5 \times 1/50000}{1/20} = 0.0002$
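
The arithmetic can be checked exactly with Python's `fractions` module; the variable names below are just labels for the three quantities given on the slide.

```python
from fractions import Fraction

p_s_given_m = Fraction(1, 2)   # P(S | M): meningitis causes stiff neck 50% of the time
p_m = Fraction(1, 50000)       # P(M): prior probability of meningitis
p_s = Fraction(1, 20)          # P(S): prior probability of stiff neck

# Bayes theorem: P(M | S) = P(S | M) * P(M) / P(S)
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s, float(p_m_given_s))  # 1/5000 0.0002
```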

55 Bayesian Classifiers (1)
- Given a new example with attribute values $(a_1, a_2, \ldots, a_n)$
- Goal is to predict the class $C$. E.g., $C = \{c_1, c_2\}$
- Estimate $P(C \mid a_1, a_2, \ldots, a_n)$. E.g., $p(c_1 \mid a_1, a_2, \ldots, a_n) = 0.9$; $p(c_2 \mid a_1, a_2, \ldots, a_n) = 0.1$
- Find the value of $C$ that maximizes $P(C \mid a_1, a_2, \ldots, a_n)$: since $P(c_1 \mid a_1, a_2, \ldots, a_n) > P(c_2 \mid a_1, a_2, \ldots, a_n)$, predict $C = c_1$
- Can we estimate $P(C \mid a_1, a_2, \ldots, a_n)$ from training data?

56 Bayesian Classifiers (2)
Approach:
- Compute the posterior probability $P(C \mid a_1, a_2, \ldots, a_n)$ for all values of $C$ using the Bayes theorem:
  $P(C \mid a_1, a_2, \ldots, a_n) = \dfrac{P(a_1, a_2, \ldots, a_n \mid C)\, P(C)}{P(a_1, a_2, \ldots, a_n)}$
- Choose the value of $C$ that maximizes $P(C \mid a_1, a_2, \ldots, a_n)$
- Equivalent to choosing the value of $C$ that maximizes $P(a_1, a_2, \ldots, a_n \mid C)\, P(C)$
- How to estimate $P(a_1, a_2, \ldots, a_n \mid C)$ and $P(C)$?

57 Estimating P(C)
Given n training examples and $C = \{c_1, c_2, \ldots, c_j, \ldots\}$:
$p(c_j) = \dfrac{\#\,\mathrm{examples}(C = c_j)}{n}$
Example: C = {yes, no}; P(C=yes) = 9/14 = 0.643; P(C=no) = 5/14 = 0.357

age    income  student  credit_rating  Class
<=30   high    no       fair           no
<=30   high    no       excellent      no
31-40  high    no       fair           yes
>40    medium  no       fair           yes
>40    low     yes      fair           yes
>40    low     yes      excellent      no
31-40  low     yes      excellent      yes
<=30   medium  no       fair           no
<=30   low     yes      fair           yes
>40    medium  yes      fair           yes
<=30   medium  yes      excellent      yes
31-40  medium  no       excellent      yes
31-40  high    yes      fair           yes
>40    medium  no       excellent      no

58 Estimating $P(a_1, a_2, \ldots, a_n \mid C)$
Assumption: given the class value, attributes are statistically independent, i.e., if the class is known, knowing the value of one attribute says nothing about the value of another.
Mathematically:
$p(a_1, a_2, \ldots, a_n \mid c_j) = p(a_1 \mid c_j)\, p(a_2 \mid c_j) \cdots p(a_n \mid c_j)$
Estimating $p(a_i \mid c_j)$ from the given training data:
$p(a_i \mid c_j) = \dfrac{\#\,\mathrm{examples}(A_i = a_i, C = c_j)}{\#\,\mathrm{examples}(C = c_j)}$

59 Example - Estimating $P(a_1, a_2, \ldots, a_n \mid C)$
Estimating P(age<=30, income=medium, student=yes, credit=fair | class=yes) from the training table on slide 57:
P(age<=30 | class=yes) = 2/9
P(income=medium | class=yes) = 4/9
P(student=yes | class=yes) = 6/9
P(credit=fair | class=yes) = 6/9
P(age<=30, income=medium, student=yes, credit=fair | class=yes) = 2/9 * 4/9 * 6/9 * 6/9 = 0.044

60 Example Naïve Bayes Classifier
New example to be classified (training table on slide 57):
X = (age<=30, income=medium, student=yes, credit=fair)
Estimating P(C):
P(class=yes) = 0.643; P(class=no) = 0.357
Estimating $P(a_1, a_2, \ldots, a_n \mid C)$:
P(X | class=yes) = 2/9 * 4/9 * 6/9 * 6/9 = 0.044
P(X | class=no) = 3/5 * 2/5 * 1/5 * 2/5 = 0.019
Calculating $P(C)\, P(a_1, a_2, \ldots, a_n \mid C)$:
P(X | class=yes) * P(class=yes) = 0.028
P(X | class=no) * P(class=no) = 0.007
Therefore, X belongs to class YES.

61 Exercise Naïve Bayes Classifier

Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals

Example to classify:
Give Birth  Can Fly  Live in Water  Have Legs  Class
yes         no       yes            no         ?

62 Exercise - Answer
Using the training table from slide 61 and the example (Give Birth = yes, Can Fly = no, Live in Water = yes, Have Legs = no).
A: attributes; M: mammals; N: non-mammals
P(M) = 7/20; P(N) = 13/20
P(A | M) = 6/7 * 6/7 * 2/7 * 2/7 = 0.06
P(A | N) = 1/13 * 10/13 * 3/13 * 4/13 = 0.0042
P(A | M) P(M) = 0.06 * 7/20 = 0.021
P(A | N) P(N) = 0.0042 * 13/20 = 0.0027
P(A | M) P(M) > P(A | N) P(N) => Mammals
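
The exercise can be reproduced with a small naive Bayes sketch that estimates P(C) and each p(a_i | c_j) by counting, as described on the earlier slides. The comma-separated encoding of the table and the function name `score` are illustrative choices. No smoothing is applied, so a zero count would zero out the whole product.

```python
# Training table from the exercise: name, give birth, can fly, live in water, have legs, class
DATA = """
human,yes,no,no,yes,mammals
python,no,no,no,no,non-mammals
salmon,no,no,yes,no,non-mammals
whale,yes,no,yes,no,mammals
frog,no,no,sometimes,yes,non-mammals
komodo,no,no,no,yes,non-mammals
bat,yes,yes,no,yes,mammals
pigeon,no,yes,no,yes,non-mammals
cat,yes,no,no,yes,mammals
leopard shark,yes,no,yes,no,non-mammals
turtle,no,no,sometimes,yes,non-mammals
penguin,no,no,sometimes,yes,non-mammals
porcupine,yes,no,no,yes,mammals
eel,no,no,yes,no,non-mammals
salamander,no,no,sometimes,yes,non-mammals
gila monster,no,no,no,yes,non-mammals
platypus,no,no,no,yes,mammals
owl,no,yes,no,yes,non-mammals
dolphin,yes,no,yes,no,mammals
eagle,no,yes,no,yes,non-mammals
"""
# Drop the name column; keep the four attributes and the class.
ROWS = [line.split(",")[1:] for line in DATA.strip().splitlines()]

def score(rows, example, label):
    """P(C = label) times the product of P(a_i | C = label), estimated by counting."""
    in_class = [r for r in rows if r[-1] == label]
    result = len(in_class) / len(rows)                  # P(C)
    for i, value in enumerate(example):
        matches = sum(1 for r in in_class if r[i] == value)
        result *= matches / len(in_class)               # p(a_i | c_j)
    return result

example = ["yes", "no", "yes", "no"]  # give birth, can fly, live in water, have legs
scores = {label: score(ROWS, example, label) for label in ("mammals", "non-mammals")}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```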

63 Summary - Naïve Bayes
- It is called naive because of the strong independence assumption (which makes the estimation much easier).
- The independence assumption is almost never correct, but the scheme works well in practice.
- Extensions: handling missing values; handling numeric values.

64 Demonstration
Naïve Bayes:
- \Weka\contact_lenses (vs Id3)
- \UCI\autos (vs J48)
- \UCI\kr-vs-kp (vs Id3, J48)
- \UCI\splice (vs Id3, J48)


66 Instance-Based Classifiers
- Store the training records (a set of stored cases, each with attributes Atr1 ... AtrN and a class label).
- Use the training records to predict the class labels of unseen cases (without building an explicit model; called a "lazy learner").

67 Examples
- Rote-learner: memorizes the entire training data and performs classification only if the new example is identical to one of the training examples.
- Nearest neighbor: identifies the k closest points (nearest neighbors) and performs classification based on them (majority vote, etc.).

68 Intuition
If it looks like a duck, walks like a duck, and quacks like a duck, then it's probably a duck.
(Figure: compute the distance from the test record to the training records, then choose the k nearest records.)

69 Basic Process
- Building the model: no explicit model
- Set-up:
  - Distance metric (to identify the nearest neighbors)
  - The value of k
- Making a prediction for an unknown record:
  - Compute the distance to the training examples
  - Identify the k nearest neighbors
  - Take a majority vote to determine the class label
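
The three prediction steps can be sketched as a tiny k-nearest-neighbor function. The training points and labels below are made up purely for illustration, and Euclidean distance via `math.dist` (Python 3.8+) is one possible choice of metric.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train: list of (feature_vector, label) pairs.
    Returns the majority label among the k training examples nearest to query."""
    # Steps 1 and 2: compute distances and keep the k nearest neighbors.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Step 3: majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data, not taken from the slides.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.3), "A"),
         ((3.0, 3.2), "B"), ((3.1, 2.9), "B"), ((2.8, 3.0), "B")]

print(knn_predict(train, (1.1, 1.0), k=3))  # A
print(knn_predict(train, (2.9, 3.1), k=3))  # B
```

There is no training step: all of the work happens at prediction time, and the cost of each prediction grows with the size of the stored training set.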

70 Distance Metric
Most instance-based algorithms use Euclidean distance. For two instances $a^{(1)}$ and $a^{(2)}$ with n attributes:
$\mathrm{Distance}(a^{(1)}, a^{(2)}) = \sqrt{(a_1^{(1)} - a_1^{(2)})^2 + (a_2^{(1)} - a_2^{(2)})^2 + \cdots + (a_n^{(1)} - a_n^{(2)})^2}$
Example: $a^{(1)} = [2, 9, 7, \ldots]$; $a^{(2)} = [3, 2, 6, \ldots]$
$\mathrm{Distance}(a^{(1)}, a^{(2)}) = \sqrt{(2 - 3)^2 + (9 - 2)^2 + (7 - 6)^2 + \cdots}$
Other popular metric: Manhattan metric
Distance for nominal features: 1 if the values are different, 0 if they are equal
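
A short sketch of the three distance notions mentioned on this slide. It uses only the three attribute values that are visible in the example; any further attributes elided on the slide are ignored here, and the function names are illustrative.

```python
import math

def euclidean(a, b):
    """Square root of the summed squared attribute differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of the absolute attribute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nominal_distance(x, y):
    """Distance for a nominal feature: 1 if the values differ, 0 if equal."""
    return 0 if x == y else 1

a1 = [2, 9, 7]  # first three attribute values of a(1) from the slide
a2 = [3, 2, 6]  # first three attribute values of a(2) from the slide

print(round(euclidean(a1, a2), 3))  # 7.141
print(manhattan(a1, a2))            # 9
```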

71 Distance Metric --- Normalization
Example: 3 features: height, weight, income
x1 = [1.6 (m), 110 (lb), 30,000 ($)]; x2 = [1.9 (m), 300 (lb), 35,000 ($)]
Distance(x1, x2) = ?
Normalization: attributes may have to be scaled (normalized) to prevent distance measures from being dominated by one of the attributes:
$a_i = \dfrac{v_i - \min v_i}{\max v_i - \min v_i}$ ($v_i$: the actual value of attribute $i$)
E.g., suppose height ranges over [1.5, 2]:
Height(x1) = (1.6 - 1.5) / (2 - 1.5) = 0.2
Height(x2) = (1.9 - 1.5) / (2 - 1.5) = 0.8
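
The min-max scaling can be written as a one-line helper. Only the height range [1.5, 2] is given on the slide, so the sketch normalizes heights only; ranges for weight and income would have to be supplied before the full distance could be computed.

```python
def min_max(value, lo, hi):
    """Scale value into [0, 1] given the attribute's minimum (lo) and maximum (hi)."""
    return (value - lo) / (hi - lo)

# Height range [1.5, 2] as assumed on the slide.
print(round(min_max(1.6, 1.5, 2.0), 2))  # 0.2
print(round(min_max(1.9, 1.5, 2.0), 2))  # 0.8
```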

72 Choosing K
(Figure: (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor of a record X)
- K is usually chosen from odd numbers (1, 3, 5, 7, 9, ...), so that ties don't have to be broken
- K is usually set manually, according to the performance of the algorithm on a separate validation set

73 Voronoi Diagram of 1NN (figure)

74 Summary
- K-NN classifiers are lazy learners: they do not build models explicitly, unlike eager learners such as decision trees.
- Lazy learners vs eager learners:
  - Lazy learners (such as KNN): no explicit model; no training; slow testing / predicting
  - Eager learners (such as decision trees): explicit model; (relatively) slow training; fast testing / predicting

75 Demonstration
IB1, IBk:
- \Weka\contact_lenses (vs J48)
- \UCI\autos (vs J48)
- \UCI\breast-w (vs J48)


Gerry Hobbs, Department of Statistics, West Virginia University Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

More information

Data Mining with Weka

Data Mining with Weka Data Mining with Weka Class 1 Lesson 1 Introduction Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz Data Mining with Weka a practical course on how to

More information

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4. Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

More information

Data Mining: Foundation, Techniques and Applications

Data Mining: Foundation, Techniques and Applications Data Mining: Foundation, Techniques and Applications Lesson 1b :A Quick Overview of Data Mining Li Cuiping( 李 翠 平 ) School of Information Renmin University of China Anthony Tung( 鄧 锦 浩 ) School of Computing

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information
