# More Data Mining with Weka


1 More Data Mining with Weka Class 5 Lesson 1 Simple neural networks Ian H. Witten Department of Computer Science University of Waikato New Zealand weka.waikato.ac.nz

2 Lesson 5.1: Simple neural networks

Course outline:
- Class 1: Exploring Weka's interfaces; working with big data
- Class 2: Discretization and text classification
- Class 3: Classification rules, association rules, and clustering
- Class 4: Selecting attributes and counting the cost
- Class 5: Neural networks, learning curves, and performance optimization
  - Lesson 5.1: Simple neural networks
  - Lesson 5.2: Multilayer Perceptrons
  - Lesson 5.3: Learning curves
  - Lesson 5.4: Performance optimization
  - Lesson 5.5: ARFF and XRFF
  - Lesson 5.6: Summary

3 Lesson 5.1: Simple neural networks

Many people love neural networks (not me): the very name is suggestive of intelligence!

4 Lesson 5.1: Simple neural networks

Perceptron: simplest form. Determine the class using a linear combination of attributes: for test instance a,

x = w_0 + w_1 a_1 + w_2 a_2 + ... + w_k a_k = sum from j = 0 to k of w_j a_j (with a_0 = 1)

If x > 0 then class 1; if x < 0 then class 2. Works most naturally with numeric attributes.

Training:
- Set all weights to zero
- Until all instances in the training data are classified correctly:
  - For each instance i in the training data, if i is classified incorrectly: if i belongs to the first class, add it to the weight vector; otherwise subtract it from the weight vector

Perceptron convergence theorem: this converges if you cycle repeatedly through the training data, provided the problem is linearly separable.
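The training procedure above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions (a bias input a_0 = 1 so that w_0 acts as the threshold, and classes coded as +1/-1), not Weka's code:

```python
# Minimal sketch of the basic perceptron training rule described above.
# A bias term a0 = 1 is prepended so w0 plays the role of the threshold.

def train_perceptron(instances, labels, max_epochs=100):
    """instances: attribute vectors; labels: +1 (class 1) or -1 (class 2)."""
    k = len(instances[0])
    w = [0.0] * (k + 1)          # weights, including bias w0
    for _ in range(max_epochs):
        mistakes = 0
        for a, y in zip(instances, labels):
            x = [1.0] + list(a)  # prepend bias input a0 = 1
            s = sum(wj * aj for wj, aj in zip(w, x))
            if y * s <= 0:       # misclassified (or on the boundary)
                # add the instance for class 1, subtract it for class 2
                w = [wj + y * aj for wj, aj in zip(w, x)]
                mistakes += 1
        if mistakes == 0:        # converged: every instance correct
            break
    return w

def predict(w, a):
    s = sum(wj * aj for wj, aj in zip(w, [1.0] + list(a)))
    return 1 if s > 0 else -1
```

On linearly separable data the loop terminates with every training instance classified correctly, exactly as the convergence theorem promises.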

5 Lesson 5.1: Simple neural networks

Linear decision boundaries
- Recall Support Vector Machines (Data Mining with Weka, Lesson 4.5): also restricted to linear decision boundaries, but they can get more complex boundaries with the "kernel trick" (not explained here)
- The perceptron can use the same trick to get non-linear boundaries

Voted perceptron (in Weka)
- Store all weight vectors and let them vote on test examples, weighted according to their survival time
- Claimed to have many of the advantages of Support Vector Machines: faster, simpler, and nearly as good
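The voted perceptron just described can be sketched in Python. This is an illustrative reimplementation, not Weka's VotedPerceptron code: each intermediate weight vector votes with a weight equal to the number of instances it survived.

```python
# Illustrative sketch of the voted perceptron: keep every intermediate
# weight vector with its survival count, and let them all vote.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_voted(instances, labels, epochs=100):
    k = len(instances[0])
    w, count, vectors = [0.0] * (k + 1), 0, []
    for _ in range(epochs):
        for a, y in zip(instances, labels):
            x = [1.0] + list(a)          # bias input a0 = 1
            if y * dot(w, x) <= 0:       # mistake: retire w with its count
                vectors.append((w, count))
                w = [wi + y * xi for wi, xi in zip(w, x)]
                count = 0
            else:
                count += 1               # w survives this instance
    vectors.append((w, count))
    return vectors

def predict_voted(vectors, a):
    x = [1.0] + list(a)
    vote = sum(c * (1 if dot(w, x) > 0 else -1) for w, c in vectors)
    return 1 if vote > 0 else -1
```

On separable data the last weight vector accumulates a large survival count and dominates the vote, so the ensemble behaves like the converged perceptron while being more robust earlier in training.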

6 Lesson 5.1: Simple neural networks

How good is VotedPerceptron?

| Dataset | VotedPerceptron | SMO |
|---|---|---|
| Ionosphere (ionosphere.arff) | 86% | 89% |
| German credit (credit-g.arff) | 70% | 75% |
| Breast cancer (breast-cancer.arff) | 71% | 70% |
| Diabetes (diabetes.arff) | 67% | 77% |

Is it faster? Yes.

7 Lesson 5.1: Simple neural networks

History of the perceptron
- 1957: basic perceptron algorithm, derived from theories about how the brain works; "a perceiving and recognizing automaton" (Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms)
- 1970: suddenly went out of fashion (Minsky and Papert, Perceptrons)
- 1986: returned, rebranded "connectionism" (Rumelhart and McClelland, Parallel Distributed Processing); some claim that artificial neural networks mirror brain function
- Multilayer perceptrons: nonlinear decision boundaries, trained with the backpropagation algorithm

8 Lesson 5.1: Simple neural networks

Basic perceptron algorithm: linear decision boundary
- Like classification-by-regression: works with numeric attributes
- Iterative algorithm, order dependent
- My MSc thesis (1971) describes a simple improvement! Still not impressed, sorry
- Modern improvements (1999): get more complex boundaries using the kernel trick; a more sophisticated strategy with multiple weight vectors and voting

Course text: Section 4.6 Linear classification using the Perceptron; Section 6.4 Kernel Perceptron

9 More Data Mining with Weka, Class 5 Lesson 2: Multilayer Perceptrons

10 Lesson 5.2: Multilayer Perceptrons (course outline slide, as above)

11 Lesson 5.2: Multilayer Perceptrons

Network of perceptrons
- Input layer, hidden layer(s), and output layer
- Each connection has a weight (a number)
- Each node performs a weighted sum of its inputs and thresholds the result, usually with a sigmoid function; nodes are often called "neurons"

[diagram: networks with sigmoid nodes between input and output layers, including one with 3 hidden layers]

12 Lesson 5.2: Multilayer Perceptrons

How many layers, and how many nodes in each?
- Input layer: one node for each attribute (attributes are numeric, or binary)
- Output layer: one node for each class (or just one if the class is numeric)
- How many hidden layers? Big Question #1
  - Zero hidden layers: standard perceptron algorithm; suitable if the data is linearly separable
  - One hidden layer: suitable for a single convex region of the decision space
  - Two hidden layers: can generate arbitrary decision boundaries
- How big are the hidden layers? Big Question #2
  - usually chosen somewhere between the input and output layer sizes
  - common heuristic: mean value of the input and output layer sizes (Weka's default)
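As a tiny illustration of the sizing heuristic above (Weka's hidden-layer wildcard 'a' is the mean of the input and output layer sizes):

```python
# Sketch of the common layer-sizing heuristic: hidden-layer size is the
# mean of the number of attributes (input nodes) and classes (output nodes).
def default_hidden_size(num_attributes, num_classes):
    return (num_attributes + num_classes) // 2

# e.g. a dataset with 8 attributes and 2 classes gets a 5-node hidden layer
```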

13 Lesson 5.2: Multilayer Perceptrons

What are the weights?
- They're learned from the training set, by iteratively minimizing the error using steepest descent
- The gradient is determined using the backpropagation algorithm
- The change in weight is computed by multiplying the gradient by the learning rate and adding the previous change in weight multiplied by the momentum:

  W_next = W + ΔW
  ΔW = learning_rate × gradient + momentum × ΔW_previous

- Can get excellent results, but often involves (much) experimentation: number and size of hidden layers; value of learning rate and momentum
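A minimal sketch of this update rule (the default values 0.3 and 0.2 below are Weka's MultilayerPerceptron defaults for learning rate and momentum; the scalar "gradient" is an illustrative stand-in for one component of the backpropagated gradient):

```python
# Weight update with momentum: the new change in weight is the gradient
# step plus a momentum fraction of the previous change in weight.
def update_weight(w, gradient, prev_delta, learning_rate=0.3, momentum=0.2):
    delta = learning_rate * gradient + momentum * prev_delta
    return w + delta, delta

# repeated steps in a consistent gradient direction accumulate momentum
w, d = 0.0, 0.0
for _ in range(3):
    w, d = update_weight(w, 1.0, d)
```

After three unit-gradient steps the per-step change has grown from 0.3 to 0.372, which is the whole point of momentum: it accelerates steady descent directions and damps oscillation.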

14 Lesson 5.2: Multilayer Perceptrons

MultilayerPerceptron performance
- Numeric weather data: 79%! (J48 and NaiveBayes both 64%, SMO 57%, IBk 79%)
- On real problems it does quite well, but it is slow
- Parameters: hiddenLayers (set GUI to true and try 5, 10, 20), learningRate, momentum
- Makes multiple passes ("epochs") through the data; training continues until the error on the validation set consistently increases, or the training time is exceeded

15 Lesson 5.2: Multilayer Perceptrons

Create your own network structure!
- Selecting nodes: click to select; right-click in empty space to deselect
- Creating/deleting nodes: click in empty space to create; right-click (with no node selected) to delete
- Creating/deleting connections: with a node selected, click on another to connect to it (and another, and another); right-click to delete a connection
- Can set parameters here too

16 Lesson 5.2: Multilayer Perceptrons

Are they any good? Experimenter with 6 datasets (iris, breast-cancer, credit-g, diabetes, glass, ionosphere) and 9 algorithms (MultilayerPerceptron, ZeroR, OneR, J48, NaiveBayes, IBk, SMO, AdaBoostM1, VotedPerceptron)
- MultilayerPerceptron wins on 2 datasets
- Other wins: SMO on 2 datasets, J48 on 1 dataset, IBk on 1 dataset
- But it is much slower than the other methods

17 Lesson 5.2: Multilayer Perceptrons

- Multilayer perceptrons implement arbitrary decision boundaries, given two (or more) hidden layers that are large enough and are trained properly
- Training by backpropagation: an iterative algorithm based on gradient descent
- In practice? Quite good performance, but extremely slow
- Still not impressed, sorry; it might be a lot more impressive on more complex datasets

Course text: Section 4.6 Linear classification using the Perceptron; Section 6.4 Kernel Perceptron

18 More Data Mining with Weka, Class 5 Lesson 3: Learning curves

19 Lesson 5.3: Learning curves (course outline slide, as above)

20 Lesson 5.3: Learning curves

The advice on evaluation (from Data Mining with Weka):
- Large separate test set? Use it
- Lots of data? Use holdout
- Otherwise, use 10-fold cross-validation, repeated 10 times, as the Experimenter does

But how much is "a lot"? It depends on the number of classes, the number of attributes, the structure of the domain, and the kind of model. A learning curve plots performance against the amount of training data.

21 Lesson 5.3: Learning curves

Plotting a learning curve
- Resample filter: sampling with replacement vs. without replacement (is each instance copied, or moved, from the original dataset to the sampled dataset?)
- Sample the training set, but not the test set: use meta > FilteredClassifier
- Example: Resample (no replacement), 50% sample, J48, 10-fold cross-validation, on the glass dataset (214 instances, 6 classes)
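The key point, sample the training portion but never the test portion, can be sketched like this. The ZeroR-like majority classifier and the data layout are illustrative stand-ins, not Weka's FilteredClassifier:

```python
# One point on a learning curve: subsample the *training* data only,
# then evaluate on the untouched test data.
import random
from collections import Counter

def zero_r(train_labels):
    """Majority-class (ZeroR-like) classifier built from training labels."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda instance: majority

def learning_curve_point(train, test, pct, seed=1):
    """train/test: lists of (instance, label); pct: training sample %."""
    rng = random.Random(seed)
    n = max(1, int(len(train) * pct / 100))
    sample = rng.sample(train, n)        # sample without replacement
    clf = zero_r([lab for _, lab in sample])
    correct = sum(clf(x) == lab for x, lab in test)
    return correct / len(test)
```

Calling this for a range of percentages (1%, 2%, 5%, ..., 100%) and averaging over repetitions gives exactly the kind of curve shown in the next slides.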

22 Lesson 5.3: Learning curves

An empirical learning curve (glass dataset, J48, compared against the ZeroR baseline):

| training data | performance |
|---|---|
| 100% | 66.8% |
| 90% | 68.7% |
| 80% | 68.2% |
| 70% | 66.4% |
| 60% | 66.4% |
| 50% | 65.0% |
| 45% | 62.1% |
| 40% | 57.0% |
| 35% | 56.5% |
| 30% | 59.3% |
| 25% | 57.0% |
| 20% | 44.9% |
| 10% | 43.5% |
| 5% | 41.1% |
| 2% | 33.6% |
| 1% | 27.6% |

23 Lesson 5.3: Learning curves

[plot: empirical learning curve, performance (%) against training data (%), J48 (10 repetitions) vs. ZeroR]

24 Lesson 5.3: Learning curves

[plot: the same learning curve with 1000 repetitions, which smooths the J48 curve]

25 Lesson 5.3: Learning curves

How much data is enough? Hard to say!
- Plot a learning curve
- Resample (with or without replacement), but don't sample the test set: use meta > FilteredClassifier
- Note: the performance figure is only an estimate

26 More Data Mining with Weka, Class 5 Lesson 4: Meta-learners for performance optimization

27 Lesson 5.4: Meta-learners for performance optimization (course outline slide, as above)

28 Lesson 5.4: Meta-learners for performance optimization

Wrapper meta-learners in Weka. Recall AttributeSelectedClassifier with WrapperSubsetEval: it selects an attribute subset based on how well a classifier performs, using cross-validation to assess performance.
1. CVParameterSelection: selects the best value for a parameter; optimizes performance using cross-validation (accuracy for classification, root mean-squared error for regression)
2. GridSearch: optimizes two parameters by searching a 2D grid
3. ThresholdSelector: selects a probability threshold on the classifier's output; can optimize accuracy, true positive rate, precision, recall, or F-measure

29 Lesson 5.4: Meta-learners for performance optimization

Try CVParameterSelection
- J48 has two parameters: confidenceFactor (C) and minNumObj (M); in Data Mining with Weka, I advised not to play with confidenceFactor
- Load diabetes.arff, select J48: 73.8%
- CVParameterSelection with J48, varying confidenceFactor from 0.1 to 1.0 in 10 steps (check the More button for the syntax of the C specification): achieves 73.4% with C = 0.1
- Add minNumObj from 1 to 10 in 10 steps (the M specification, listed first): achieves 74.3% with C = 0.2 and M = 10, and a simpler tree; takes a while!
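What CVParameterSelection does can be sketched as a simple loop over candidate values. Here `evaluate` is a hypothetical stand-in for building the classifier with that parameter value and measuring cross-validated accuracy:

```python
# Sketch of parameter selection by cross-validation: try each candidate
# value, score it (e.g. 10-fold CV accuracy), and keep the best.
def select_parameter(candidates, evaluate):
    best_value, best_score = None, float('-inf')
    for value in candidates:
        score = evaluate(value)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score

# toy evaluation surface peaking at 0.2, loosely echoing the C example above
scores = {0.1: 73.4, 0.2: 74.3, 0.3: 73.8}
best, score = select_parameter(scores, scores.get)
```

The crucial detail is that the cross-validation is *internal* to the wrapper, so the final reported accuracy is still estimated on data the parameter choice never saw.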

30 Lesson 5.4: Meta-learners for performance optimization

GridSearch
- CVParameterSelection with multiple parameters optimizes them one after the other; GridSearch optimizes two parameters together
- Can explore the best parameter combinations for a filter and a classifier
- Can optimize accuracy (classification) or various measures (regression)
- Very flexible, but fairly complicated to set up; take a quick look

32 Lesson 5.4: Meta-learners for performance optimization

Try ThresholdSelector
- Credit dataset credit-g.arff, NaiveBayes: 75.4%
- ThresholdSelector with NaiveBayes, optimizing Accuracy: 75.4% (NB: designatedClass should be the first class value)
- But you can optimize other things: FMEASURE, ACCURACY, TRUE_POS, TRUE_NEG, TP_RATE, PRECISION, RECALL

Confusion matrix:

|  | classified as good (a) | classified as bad (b) |
|---|---|---|
| actual good (a) | TP | FN |
| actual bad (b) | FP | TN |

- Precision = TP / (TP + FP): of the instances classified as good, the fraction correctly classified
- Recall = TP / (TP + FN): of the actual good instances, the fraction correctly classified
- F-measure = 2 × Precision × Recall / (Precision + Recall)
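The three measures can be computed directly from the confusion-matrix counts; a minimal illustration, not Weka's evaluation code:

```python
# Precision, recall, and F-measure from confusion-matrix counts.
def precision(tp, fp):
    return tp / (tp + fp)        # fraction of predicted-good that are good

def recall(tp, fn):
    return tp / (tp + fn)        # fraction of actual-good that are found

def f_measure(p, r):
    return 2 * p * r / (p + r)   # harmonic mean of precision and recall
```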

33 Lesson 5.4: Meta-learners for performance optimization

Don't optimize parameters manually: you'll overfit! The wrapper method uses internal cross-validation to optimize.
1. CVParameterSelection: optimize parameters individually
2. GridSearch: optimize two parameters together
3. ThresholdSelector: select a probability threshold

Course text: Section 11.5 Optimizing performance; Section 5.7 Recall Precision curves

34 More Data Mining with Weka, Class 5 Lesson 5: ARFF and XRFF

35 Lesson 5.5: ARFF and XRFF (course outline slide, as above)

36 Lesson 5.5: ARFF and XRFF

ARFF: nominal and numeric (integer or real) attributes; data lines (? for a missing value); % for a comment. For the weather data:

```
% ARFF file for the weather data
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny, 85, 85, FALSE, no
sunny, 80, 90, TRUE, no
...
rainy, 71, 91, TRUE, no
```

37 Lesson 5.5: ARFF and XRFF

ARFF format: more
- Sparse format (NonSparseToSparse, SparseToNonSparse filters): all classifiers accept sparse data as input, but some expand the data internally while others use sparsity to speed up computation (e.g. NaiveBayesMultinomial, SMO); StringToWordVector produces sparse output
- Weighted instances: the weight follows the data line in curly braces, e.g. `sunny, 85, 85, FALSE, no, {0.5}`; missing weights are assumed to be 1
- Date attributes
- Relational attributes (multi-instance learning)

Ordinary and sparse versions of the nominal weather data (attributes: outlook {sunny, overcast, rainy}, temperature {hot, mild, cool}, humidity {high, normal}, windy {TRUE, FALSE}, play {yes, no}; sparse lines omit values equal to each attribute's first label):

```
sunny, hot, high, FALSE, no         {3 FALSE, 4 no}
sunny, hot, high, TRUE, no          {4 no}
overcast, hot, high, FALSE, yes     {0 overcast, 3 FALSE}
rainy, mild, high, FALSE, yes       {0 rainy, 1 mild, 3 FALSE}
rainy, cool, normal, FALSE, yes     {0 rainy, 1 cool, 2 normal, 3 FALSE}
rainy, cool, normal, TRUE, no       {0 rainy, 1 cool, 2 normal, 4 no}
overcast, cool, normal, TRUE, yes   {0 overcast, 1 cool, 2 normal}
```
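The sparse idea above can be sketched as: store only (index, value) pairs for attribute values that differ from the implicit default, which for nominal attributes is the first label. An illustrative helper, not Weka's NonSparseToSparse filter:

```python
# Convert a dense row to sparse (index, value) pairs, omitting entries
# equal to the attribute's implicit default (its first nominal label).
def to_sparse(row, defaults):
    return [(i, v) for i, (v, d) in enumerate(zip(row, defaults)) if v != d]

# first labels of outlook, temperature, humidity, windy, play
defaults = ["sunny", "hot", "high", "TRUE", "yes"]
sparse = to_sparse(["sunny", "hot", "high", "FALSE", "no"], defaults)
```

This reproduces the `{3 FALSE, 4 no}` form shown in the sparse data above.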

38 Lesson 5.5: ARFF and XRFF

XML file format: XRFF
- The Explorer can read and write XRFF files
- Verbose (compressed version: .xrff.gz)
- Same information as ARFF files, including the sparse option and instance weights, plus a little more: you can specify which attribute is the class, and give attribute weights

```xml
<dataset name="weather.symbolic" version="3.6.10">
  <header>
    <attributes>
      <attribute name="outlook" type="nominal">
        <labels>
          <label>sunny</label>
          <label>overcast</label>
          <label>rainy</label>
        </labels>
      </attribute>
      ...
    </attributes>
  </header>
  <body>
    <instances>
      <instance>
        <value>sunny</value>
        <value>hot</value>
        <value>high</value>
        <value>false</value>
        <value>no</value>
      </instance>
      ...
    </instances>
  </body>
</dataset>
```

39 Lesson 5.5: ARFF and XRFF

- ARFF has extra features: sparse format, instance weights, date attributes, relational attributes
- Some filters and classifiers take advantage of sparsity
- XRFF is the XML equivalent of ARFF, plus some additional features

Course text: Section 2.4 ARFF format

40 More Data Mining with Weka, Class 5 Lesson 6: Summary

41 Lesson 5.6: Summary (course outline slide, as above)

42 Lesson 5.6: Summary

From Data Mining with Weka:
- There's no magic in data mining; instead, a huge array of alternative techniques
- There's no single universal best method: it's an experimental science! What works best on your problem?
- Weka makes it easy (maybe too easy?); there are many pitfalls, and you need to understand what you're doing!
- Focus on evaluation and significance: different algorithms differ in performance, but is the difference significant?

43 Lesson 5.6: Summary

What did we miss in Data Mining with Weka?
- Filtered classifiers: filter training data but not test data during cross-validation
- Cost-sensitive evaluation and classification: evaluate and minimize cost, not error rate
- Attribute selection: select a subset of attributes to use when learning
- Clustering: learn something even when there's no class value
- Association rules: find associations between attributes, when no class is specified
- Text classification: handling textual data as words, characters, n-grams
- Weka Experimenter: calculating means and standard deviations automatically, and more

44 Lesson 5.6: Summary

What did we do in More Data Mining with Weka?
- Filtered classifiers: filter training data but not test data during cross-validation
- Cost-sensitive evaluation and classification: evaluate and minimize cost, not error rate
- Attribute selection: select a subset of attributes to use when learning
- Clustering: learn something even when there's no class value
- Association rules: find associations between attributes, when no class is specified
- Text classification: handling textual data as words, characters, n-grams
- Weka Experimenter: calculating means and standard deviations automatically, and more

Plus: big data, the CLI, the Knowledge Flow interface, streaming, discretization, rules vs. trees, multinomial Naive Bayes, neural nets, ROC curves, learning curves, and ARFF/XRFF.

45 Lesson 5.6: Summary

What have we missed?
- Time series analysis: an environment for time series forecasting
- Stream-oriented algorithms: the MOA system for massive online analysis
- Multi-instance learning: bags of instances labeled positive or negative, not single instances
- One-class classification
- Interfaces to other data mining packages: accessing from Weka the excellent resources provided by the R data mining system; wrapper classes for popular packages like LibSVM and LibLinear
- Distributed Weka with Hadoop
- Latent Semantic Analysis

These are available as Weka packages.


47 Lesson 5.6: Summary

- "Data is the new oil": the economic and social importance of data mining will rival that of the oil economy (by 2020?)
- Personal data is becoming a new economic asset class; we need trust between individuals, government, and the private sector
- Ethics: "a person without ethics is a wild beast loosed upon this world" (Albert Camus)
- Wisdom, the value attached to knowledge: "knowledge speaks, but wisdom listens" (attributed to Jimi Hendrix)

48 More Data Mining with Weka Department of Computer Science University of Waikato New Zealand Creative Commons Attribution 3.0 Unported License creativecommons.org/licenses/by/3.0/ weka.waikato.ac.nz


### Getting Even More Out of Ensemble Selection

Getting Even More Out of Ensemble Selection Quan Sun Department of Computer Science The University of Waikato Hamilton, New Zealand qs12@cs.waikato.ac.nz ABSTRACT Ensemble Selection uses forward stepwise

### Roulette Sampling for Cost-Sensitive Learning

Roulette Sampling for Cost-Sensitive Learning Victor S. Sheng and Charles X. Ling Department of Computer Science, University of Western Ontario, London, Ontario, Canada N6A 5B7 {ssheng,cling}@csd.uwo.ca

### Feed-Forward mapping networks KAIST 바이오및뇌공학과 정재승

Feed-Forward mapping networks KAIST 바이오및뇌공학과 정재승 How much energy do we need for brain functions? Information processing: Trade-off between energy consumption and wiring cost Trade-off between energy consumption

### Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

### Making Sense of the Mayhem: Machine Learning and March Madness

Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research

### Data Mining Analysis (breast-cancer data)

Data Mining Analysis (breast-cancer data) Jung-Ying Wang Register number: D9115007, May, 2003 Abstract In this AI term project, we compare some world renowned machine learning tools. Including WEKA data

### Comparative Analysis of Classification Algorithms on Different Datasets using WEKA

Volume 54 No13, September 2012 Comparative Analysis of Classification Algorithms on Different Datasets using WEKA Rohit Arora MTech CSE Deptt Hindu College of Engineering Sonepat, Haryana, India Suman

### Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum

Statistical Validation and Data Analytics in ediscovery Jesse Kornblum Administrivia Silence your mobile Interactive talk Please ask questions 2 Outline Introduction Big Questions What Makes Things Similar?

### W6.B.1. FAQs CS535 BIG DATA W6.B.3. 4. If the distance of the point is additionally less than the tight distance T 2, remove it from the original set

http://wwwcscolostateedu/~cs535 W6B W6B2 CS535 BIG DAA FAQs Please prepare for the last minute rush Store your output files safely Partial score will be given for the output from less than 50GB input Computer

### Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

### Neural network software tool development: exploring programming language options

INEB- PSI Technical Report 2006-1 Neural network software tool development: exploring programming language options Alexandra Oliveira aao@fe.up.pt Supervisor: Professor Joaquim Marques de Sá June 2006

### Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

### BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

### Data Mining Practical Machine Learning Tools and Techniques

Credibility: Evaluating what s been learned Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 5 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Issues: training, testing,

### Lecture 10: Regression Trees

Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

### NEURAL NETWORKS A Comprehensive Foundation

NEURAL NETWORKS A Comprehensive Foundation Second Edition Simon Haykin McMaster University Hamilton, Ontario, Canada Prentice Hall Prentice Hall Upper Saddle River; New Jersey 07458 Preface xii Acknowledgments

### Feature Subset Selection in E-mail Spam Detection

Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature

2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously

### How To Predict Web Site Visits

Web Site Visit Forecasting Using Data Mining Techniques Chandana Napagoda Abstract: Data mining is a technique which is used for identifying relationships between various large amounts of data in many

### AN APPLICATION OF TIME SERIES ANALYSIS FOR WEATHER FORECASTING

AN APPLICATION OF TIME SERIES ANALYSIS FOR WEATHER FORECASTING Abhishek Agrawal*, Vikas Kumar** 1,Ashish Pandey** 2,Imran Khan** 3 *(M. Tech Scholar, Department of Computer Science, Bhagwant University,

### Role of Neural network in data mining

Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)

### HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

Neural Network Add-in Version 1.5 Software User s Guide Contents Overview... 2 Getting Started... 2 Working with Datasets... 2 Open a Dataset... 3 Save a Dataset... 3 Data Pre-processing... 3 Lagging...

### Machine Learning and Data Mining -

Machine Learning and Data Mining - Perceptron Neural Networks Nuno Cavalheiro Marques (nmm@di.fct.unl.pt) Spring Semester 2010/2011 MSc in Computer Science Multi Layer Perceptron Neurons and the Perceptron

### Monday Morning Data Mining

Monday Morning Data Mining Tim Ruhe Statistische Methoden der Datenanalyse Outline: - data mining - IceCube - Data mining in IceCube Computer Scientists are different... Fakultät Physik Fakultät Physik

### PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

### New Ensemble Combination Scheme

New Ensemble Combination Scheme Namhyoung Kim, Youngdoo Son, and Jaewook Lee, Member, IEEE Abstract Recently many statistical learning techniques are successfully developed and used in several areas However,

### Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems

### SVM Ensemble Model for Investment Prediction

19 SVM Ensemble Model for Investment Prediction Chandra J, Assistant Professor, Department of Computer Science, Christ University, Bangalore Siji T. Mathew, Research Scholar, Christ University, Dept of

### A semi-supervised Spam mail detector

A semi-supervised Spam mail detector Bernhard Pfahringer Department of Computer Science, University of Waikato, Hamilton, New Zealand Abstract. This document describes a novel semi-supervised approach

### Learning is a very general term denoting the way in which agents:

What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);

### What is Data Mining? Data Mining (Knowledge discovery in database) Data mining: Basic steps. Mining tasks. Classification: YES, NO

What is Data Mining? Data Mining (Knowledge discovery in database) Data Mining: "The non trivial extraction of implicit, previously unknown, and potentially useful information from data" William J Frawley,

International Journal of Computer Trends and Technology- volume4issue2-2013 ABSTRACT: Neural Network Design in Cloud Computing B.Rajkumar #1,T.Gopikiran #2,S.Satyanarayana *3 #1,#2Department of Computer

### AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

### Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA

Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/

### SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK

SUCCESSFUL PREDICTION OF HORSE RACING RESULTS USING A NEURAL NETWORK N M Allinson and D Merritt 1 Introduction This contribution has two main sections. The first discusses some aspects of multilayer perceptrons,

### Analysis Tools and Libraries for BigData

+ Analysis Tools and Libraries for BigData Lecture 02 Abhijit Bendale + Office Hours 2 n Terry Boult (Waiting to Confirm) n Abhijit Bendale (Tue 2:45 to 4:45 pm). Best if you email me in advance, but I

### Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms

Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms Johan Perols Assistant Professor University of San Diego, San Diego, CA 92110 jperols@sandiego.edu April

### Spyware Prevention by Classifying End User License Agreements

Spyware Prevention by Classifying End User License Agreements Niklas Lavesson, Paul Davidsson, Martin Boldt, and Andreas Jacobsson Blekinge Institute of Technology, Dept. of Software and Systems Engineering,

### Analecta Vol. 8, No. 2 ISSN 2064-7964

EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

### T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577

T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or

### Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition Data

Proceedings of Student-Faculty Research Day, CSIS, Pace University, May 2 nd, 2014 Classification of Titanic Passenger Data and Chances of Surviving the Disaster Data Mining with Weka and Kaggle Competition

### Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

### IBM SPSS Neural Networks 19

IBM SPSS Neural Networks 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 95. This document contains proprietary information of SPSS

### Data Mining Techniques Chapter 7: Artificial Neural Networks

Data Mining Techniques Chapter 7: Artificial Neural Networks Artificial Neural Networks.................................................. 2 Neural network example...................................................