# Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004). Roman Kern. KTI, TU Graz. 2015-03-05


## Transcription

### Ensemble Methods

Knowledge Discovery and Data Mining 2 (VU) (707.004)

Roman Kern, KTI, TU Graz, 2015-03-05

### Outline

1. Introduction
2. Classification
3. Clustering

### Introduction to Ensemble Methods

Motivation & Basics

### Quick Facts

- Basic idea: train multiple models and use a method to combine them into a single one
- Predominantly used in classification and prediction
- Sometimes called: combined models, meta learning, committee machines, multiple classifier systems
- Ensemble methods have a long history and have been used in statistics for more than 200 years

### Types of Ensembles

- different hypotheses
- different algorithms
- different parts of the data set

### Motivation

Every model has its limitations; the goal is to combine the strengths of all models:

- Improve the accuracy by using an ensemble
- Be more robust with regard to noise

Basic approaches: averaging, voting, probabilistic methods.

### Combination of Models

A function is needed to combine the results of the models:

- For real-valued output: linear combination, product rule
- For categorical output, e.g. class labels: majority vote

### Linear Combination

A simple form of combining the output of an ensemble. Given T models f_t(y|x):

g(y|x) = Σ_{t=1}^T w_t f_t(y|x)

Estimating the optimal weights w_t is a problem in itself; a simple solution is to use the uniform distribution, w_t = 1/T.
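A minimal sketch of this combination rule (the function name and example probabilities are illustrative, not from the slides):

```python
# Combine the probability estimates f_t(y|x) of T models into g(y|x)
# via a weighted linear combination; uniform weights w_t = 1/T by default.

def linear_combination(model_outputs, weights=None):
    T = len(model_outputs)
    if weights is None:
        weights = [1.0 / T] * T          # uniform distribution over models
    n_classes = len(model_outputs[0])
    return [sum(w * f[c] for w, f in zip(weights, model_outputs))
            for c in range(n_classes)]

# Three models, each giving P(y=0|x) and P(y=1|x)
outputs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
print(linear_combination(outputs))   # close to [0.6, 0.4]
```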

### Product Rule

An alternative form of combining the output of an ensemble:

g(y|x) = (1/Z) Π_{t=1}^T f_t(y|x)^{w_t}

where Z is a normalisation factor. Again, estimating the weights is non-trivial.

### Majority Vote

Combining the output if it is categorical. The models produce a label as output, e.g. h_t(x) ∈ {+1, −1}:

H(x) = sign(Σ_{t=1}^T w_t h_t(x))

If the weights are non-uniform, it is a weighted vote.
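A sketch of the (weighted) majority vote (the names and example votes are illustrative):

```python
# Weighted majority vote for binary labels h_t(x) in {+1, -1}:
# H(x) = sign(sum_t w_t * h_t(x)).

def majority_vote(labels, weights=None):
    if weights is None:
        weights = [1.0] * len(labels)    # plain, unweighted vote
    s = sum(w * h for w, h in zip(weights, labels))
    return 1 if s >= 0 else -1

print(majority_vote([1, 1, -1]))                    # +1: two of three vote +1
print(majority_vote([1, 1, -1], [0.2, 0.2, 1.0]))   # -1: the heavy model wins
```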

### Selection of Models

The models should not be identical, i.e. they should not produce identical results; an ensemble should therefore represent a degree of diversity. Two basic ways of achieving this diversity:

- Implicitly, e.g. by integrating randomness (bagging)
- Explicitly, e.g. by integrating variance into the process (boosting)

Most of the methods integrate diversity implicitly.

### Motivation for Ensemble Methods

- Statistical: there is a large number of hypotheses (in relation to the size of the training data set), and it is not clear which hypothesis is the best; using an ensemble reduces the risk of picking a bad model
- Computational: avoid local minima, which are only partially addressed by heuristics
- Representational: a single model/hypothesis might not be able to represent the data

Dietterich, T. G. (2000). Ensemble methods in machine learning. In Multiple Classifier Systems (pp. 1-15).

### Ensemble Methods for Classification

### Diversity

Underlying question: how much of the ensemble prediction is due to the accuracies of the individual models, and how much due to their combination?

The ensemble error can be expressed as two terms:

- the error of the individual models
- the impact of their interactions, the diversity

Note: whether one can separate the two terms depends on the combination.

### Regression Error for the Linear Combination

Squared error of the ensemble regression:

(g(x) − d)^2 = (1/T) Σ_{t=1}^T (g_t(x) − d)^2 − (1/T) Σ_{t=1}^T (g_t(x) − g(x))^2

- First term: error of the individual models
- Second term: interactions between the predictions, the ambiguity, which is always ≥ 0

Therefore it is preferable to increase the ambiguity (diversity). Small print: actually there is a trade-off between bias, variance and covariance, known as the accuracy-diversity dilemma.

Krogh, A., & Vedelsby, J. (1995). Neural network ensembles, cross-validation and active learning. In Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press. See also Kuncheva.
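The decomposition can be checked numerically for the uniformly weighted ensemble g(x) = (1/T) Σ g_t(x); the values below are illustrative:

```python
# Ambiguity decomposition (Krogh & Vedelsby):
# (g - d)^2 = (1/T) sum_t (g_t - d)^2 - (1/T) sum_t (g_t - g)^2

preds = [2.0, 3.5, 4.1]            # individual predictions g_t(x)
d = 3.0                            # target value
T = len(preds)
g = sum(preds) / T                 # ensemble prediction

ensemble_err = (g - d) ** 2
avg_err = sum((p - d) ** 2 for p in preds) / T        # first term
ambiguity = sum((p - g) ** 2 for p in preds) / T      # second term, always >= 0

assert abs(ensemble_err - (avg_err - ambiguity)) < 1e-9
```

Because the ambiguity is subtracted, the ensemble error is never worse than the average error of its members.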

### Classification Error for the Linear Combination

For a simple averaging ensemble (and some assumptions):

e_ave = e_add · (1 + δ(T − 1)) / T

where e_add is the error of the individual model and δ is the correlation between the models.

Tumer, K., & Ghosh, J. (1996). Error correlation and error reduction in ensemble classifiers. Connection Science, 8(3-4).
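A one-line sketch of this formula with its two limiting cases (the function name is illustrative):

```python
# Added error of an averaging ensemble (Tumer & Ghosh, 1996):
# e_ave = e_add * (1 + delta * (T - 1)) / T

def ensemble_added_error(e_add, delta, T):
    return e_add * (1 + delta * (T - 1)) / T

# Uncorrelated models (delta = 0): the error shrinks by a factor of T.
assert ensemble_added_error(0.1, 0.0, 10) == 0.1 / 10
# Fully correlated models (delta = 1): no improvement at all.
assert ensemble_added_error(0.1, 1.0, 10) == 0.1
```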

### Basic Approaches

- Bagging: combines strong learners to reduce variance
- Boosting: combines weak learners to reduce bias
- Many more: mixture of experts, cascades, ...

### Bootstrap Sampling

Create a distribution of data sets from a single data set. If used within ensemble methods, this is typically called bagging. A simple approach, but one that has proven to increase performance.

Davison, A. C., & Hinkley, D. (2006). Bootstrap Methods and Their Applications (8th ed.). Cambridge: Cambridge Series in Statistical and Probabilistic Mathematics.

### Bagging

Each member of the ensemble is generated from a different data set. Good for unstable models, where small differences in the input data set yield big differences in the output (also known as high-variance models); not so good for simple models.

Note: bagging is an abbreviation of bootstrap aggregating.

Breiman, L. (1998). Arcing classifiers. Annals of Statistics, 26(3).

### Bagging Algorithm

Training:

1. Input: ensemble size T, training set D = {(x_1, y_1), ..., (x_n, y_n)}
2. For each model M_t:
   1. Sample n′ times (n′ ≤ n) at random from D with replacement
   2. Train model M_t on this subset

Classification: typically a majority vote for classification, a linear combination for regression.

Note: the subset may contain duplicates, i.e. even if n′ = n.
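The train/classify steps above can be sketched as follows (the stand-in "majority class" base learner and the toy data are illustrative, not from the slides):

```python
import random
from collections import Counter

def train_bagging(data, base_learner, T, rng):
    models = []
    for _ in range(T):
        # Bootstrap sample: n draws from D with replacement
        sample = [rng.choice(data) for _ in range(len(data))]
        models.append(base_learner(sample))
    return models

def classify_bagging(models, x):
    votes = Counter(model(x) for model in models)   # majority vote
    return votes.most_common(1)[0][0]

# Stand-in base learner: always predicts the majority class of its sample.
def majority_class_learner(sample):
    label = Counter(y for _, y in sample).most_common(1)[0][0]
    return lambda x: label

data = [(i, 1) for i in range(8)] + [(i, -1) for i in range(8, 10)]
models = train_bagging(data, majority_class_learner, T=11, rng=random.Random(42))
print(classify_bagging(models, 0))   # 1, since 8 of 10 training labels are +1
```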

### Boosting

A family of ensemble learners that boost weak learners into a strong learner. AdaBoost is the most prominent one. The weak learners only need to be better than random guessing.

### AdaBoost

Basic idea:

- Weight the individual instances of the data set
- Iteratively learn models and record their errors
- Concentrate the effort of the next round on the misclassified examples

### AdaBoost Algorithm

Training:

1. Input: ensemble size T, training set D = {(x_1, y_1), ..., (x_n, y_n)}
2. Define a uniform distribution W_1 over the elements of D
3. For each model M_t:
   1. Train model M_t using distribution W_t
   2. Calculate the error ϵ_t of the model and its weight α_t = (1/2) ln((1 − ϵ_t)/ϵ_t)
   3. If ϵ_t > 0.5, break (and discard the model)
   4. Else update the distribution W_{t+1} according to ϵ_t

Classification: linear combination, H(x) = sign(Σ_{t=1}^T α_t h_t(x))
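A compact sketch of AdaBoost with one-dimensional threshold stumps as weak learners (the stump search and the toy data are illustrative):

```python
import math

def stump_learner(xs, ys, w):
    """Return (weighted error, stump) for the best threshold/polarity stump."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            pred = [pol if x < thr else -pol for x in xs]
            err = sum(wi for wi, p, y in zip(w, pred, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    err, thr, pol = best
    return err, (lambda x: pol if x < thr else -pol)

def adaboost_train(xs, ys, T):
    n = len(xs)
    w = [1.0 / n] * n                    # uniform initial distribution W_1
    ensemble = []
    for _ in range(T):
        eps, h = stump_learner(xs, ys, w)
        if eps > 0.5:                    # no better than chance: discard, stop
            break
        eps = max(eps, 1e-12)            # guard for perfectly separable data
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, h))
        # Increase the weight of misclassified instances, decrease the rest
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, xs, ys)]
        z = sum(w)                       # normalise back to a distribution
        w = [wi / z for wi in w]
    return ensemble

def adaboost_classify(ensemble, x):
    s = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if s >= 0 else -1

xs = [0, 1, 2]
ys = [1, -1, 1]                          # pattern no single stump can fit
model = adaboost_train(xs, ys, T=3)
print([adaboost_classify(model, x) for x in xs])   # [1, -1, 1]
```

After three rounds the weighted combination of three individually imperfect stumps classifies all training points correctly.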

### Stacked Generalisation

Basic idea: use the output of one layer of classifiers as input to the next layer. For two layers:

1. Split the training data set into two parts
2. Learn the first layer using the first part
3. Classify the second part with the first layer
4. Use these decisions as input for training the second layer

Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2).
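The four steps can be sketched as follows (the threshold base learners and the trivial voting meta learner are illustrative stand-ins):

```python
# Two-layer stacking: first-layer outputs on the held-out second part
# become the training features of the second-layer (meta) model.

def train_stacking(part1, part2, base_learners, meta_learner):
    layer1 = [learn(part1) for learn in base_learners]    # step 2
    meta_X = [[m(x) for m in layer1] for x, _ in part2]   # step 3
    meta_y = [y for _, y in part2]
    return layer1, meta_learner(meta_X, meta_y)           # step 4

def classify_stacking(layer1, meta, x):
    return meta([m(x) for m in layer1])

def make_threshold_learner(thr):
    return lambda data: (lambda x: 1 if x >= thr else -1)

def vote_meta_learner(meta_X, meta_y):
    # Trivial meta model: an unweighted vote over the first-layer outputs.
    return lambda feats: 1 if sum(feats) >= 0 else -1

part1 = [(0, -1), (5, 1)]
part2 = [(1, -1), (6, 1)]
layer1, meta = train_stacking(
    part1, part2,
    [make_threshold_learner(2), make_threshold_learner(4)],
    vote_meta_learner)
print(classify_stacking(layer1, meta, 7))   # 1
```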

### Mixture of Experts

Basic idea: some models should specialise on parts of the input space.

Ingredients:

- Base models (e.g. specialised models, the so-called experts)
- A component to estimate probabilities, often called a gating network

The gating network learns to select the appropriate expert for each part of the input space.

### Mixture of Experts: Examples

Example #1: an ensemble of base learners combined via a weighted linear combination, where the weights are found via a neural network; the neural network is learnt on the same input data set.

Example #2: mixtures of expert models are called mixture models, e.g. those fit by the expectation-maximisation algorithm.
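A minimal sketch of the expert/gate structure (in a real mixture of experts the gating network is learnt; all functions below are hand-written illustrations):

```python
import math

def expert_low(x):
    return 2 * x            # imagined specialist for x < 0

def expert_high(x):
    return x + 5            # imagined specialist for x >= 0

def gate(x):
    # Soft assignment of responsibility to each expert (logistic in x).
    p_high = 1.0 / (1.0 + math.exp(-x))
    return [1.0 - p_high, p_high]

def mixture_predict(x):
    w = gate(x)
    return w[0] * expert_low(x) + w[1] * expert_high(x)

print(mixture_predict(0))   # 2.5: both experts weighted equally at x = 0
```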

### Cascade of Classifiers

Setting:

- A sequence of models of increasing complexity, each with a high hit rate (≥ h) and a low false alarm rate (< f)
- In the data set the negative examples are more common
- The cascade is learnt via boosting

For example, for h = 0.99 and f = 0.3 and a cascade of size 10, one gets a hit rate of about 0.99^10 ≈ 0.9 and a false alarm rate of about 0.3^10 ≈ 6 × 10^-6.

Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Computer Vision and Pattern Recognition, 2001.
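Since an input must pass every stage to be accepted, the overall rates are simply the per-stage rates raised to the number of stages:

```python
# Overall hit rate and false alarm rate of a cascade: both multiply
# across stages, because acceptance requires passing every stage.

def cascade_rates(hit_rate, false_alarm_rate, stages):
    return hit_rate ** stages, false_alarm_rate ** stages

h, f = cascade_rates(0.99, 0.3, 10)
print(round(h, 3))   # 0.904
print(f)             # about 5.9e-06
```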

### Decision Stump

Decision stumps are a popular choice for (some) ensemble learning, as they are fast and less prone to overfitting. A decision stump is a decision tree that only uses a single feature (attribute).

Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11.

### Random Subset Method

Basic idea: instead of taking a subset of the data set, use a subset of the feature set. Works best if there are many features; will not work as well if most of the features are just noise.

### Random Forest

Combines two randomisation strategies:

- Select a random subset of the data set to learn each decision tree (bagging), e.g. grow n = 100 random trees
- Select a random subset of the features, e.g. m features

Random forests can also be used to estimate the importance of features (by comparing the error when using a feature vs. not using it). Typically good performance.

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
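The two randomisation strategies can be sketched as follows, using a one-feature lookup table as a stand-in for a full decision tree (the data and the stand-in learner are illustrative):

```python
import random
from collections import Counter

def train_random_forest(data, n_trees, m_features, rng):
    n_total = len(data[0][0])                        # number of features
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]    # bagging: bootstrap sample
        f = rng.sample(range(n_total), m_features)[0]  # random feature subset
        # Toy "tree": majority label per value of the chosen feature.
        by_value = {}
        for x, y in sample:
            by_value.setdefault(x[f], []).append(y)
        table = {v: Counter(ys).most_common(1)[0][0] for v, ys in by_value.items()}
        default = Counter(y for _, y in sample).most_common(1)[0][0]
        forest.append((f, table, default))
    return forest

def forest_classify(forest, x):
    votes = Counter(table.get(x[f], default) for f, table, default in forest)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
data = [((0, 0), -1), ((1, 1), 1)] * 10              # both features predictive
forest = train_random_forest(data, n_trees=25, m_features=1, rng=rng)
print(forest_classify(forest, (1, 1)))   # 1
```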

### Multiclass Classification

Basic idea: split a multi-class problem into a set of binary classification problems, e.g. via error-correcting output codes.

Kong, E. B., & Dietterich, T. G. (1995). Error-correcting output coding corrects bias and variance. In International Conference on Machine Learning.
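A sketch of error-correcting output codes: each class gets a binary code word, one binary classifier is trained per bit, and a prediction is decoded to the class with the nearest code word (the 3-class code matrix below is illustrative):

```python
# Code words with pairwise Hamming distance >= 3, so any single
# faulty bit classifier can be corrected at decoding time.
codes = {"a": (0, 0, 1, 1, 0), "b": (0, 1, 0, 1, 1), "c": (1, 0, 0, 0, 1)}

def decode(bits):
    def hamming(u, v):
        return sum(x != y for x, y in zip(u, v))
    # Nearest code word wins.
    return min(codes, key=lambda c: hamming(codes[c], bits))

# Even with one bit flipped by a faulty binary classifier,
# the prediction still decodes to the right class:
print(decode((0, 0, 1, 1, 1)))   # "a" (one bit away from a's code word)
```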

### Vote/Veto Classification

Ensemble classification for multi-class problems:

- Have different base classifiers for different parts of the feature set
- Train all base classifiers on the training data set and record their performance per class via cross-validation
- For each class keep two thresholds, min-precision and min-recall
- If the precision of a model for a certain class is ≥ min-precision, the model is allowed to vote
- If the recall of a model for a certain class is ≥ min-recall, the model is allowed to vote against (veto)
- At classification time use a weighted vote, where a veto is a negative vote and the weight is the respective measure (precision or recall)

Kern, R., Seifert, C., Zechner, M., & Granitzer, M. (2011). Vote/Veto Meta-Classifier for Authorship Identification. Notebook for PAN at CLEF 2011.

### Active Learning

Active learning is a form of semi-supervised learning. The basic idea is to present the human with the instances to label that carry the most information (for updating the model). Query by committee uses an ensemble, i.e. the disagreement of multiple classifiers, to pick these instances.

### Clustering and Other Approaches

### Cluster Ensembles

Basic idea: have multiple clustering algorithms group a data set and combine all results into a single clustering result. Motivation: a more reliable result than the individual cluster solutions.

### Consensus Clustering

- Have a set of clusterings {C_1, ..., C_m}
- Find an overall clustering solution C
- Minimise the disagreement using a metric: D(C) = Σ_i d(C, C_i)
- Also known as clustering aggregation

Mirkin metric: the metric counts the pairs of instances that are together in the overall clustering but separate in C_i, and vice versa.
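A sketch of this pair-counting disagreement (some definitions of the Mirkin metric scale the count by a factor of 2; the labels below are illustrative):

```python
from itertools import combinations

# Pair-counting distance between two clusterings given as label lists
# over the same instances: count the pairs that are together in one
# clustering but separate in the other.

def mirkin_distance(c1, c2):
    dist = 0
    for i, j in combinations(range(len(c1)), 2):
        together1 = c1[i] == c1[j]
        together2 = c2[i] == c2[j]
        if together1 != together2:
            dist += 1
    return dist

# Identical clusterings (up to label renaming) have distance 0:
assert mirkin_distance([0, 0, 1, 1], [5, 5, 9, 9]) == 0
# Moving one instance to another cluster changes three pair relations:
print(mirkin_distance([0, 0, 1, 1], [0, 0, 0, 1]))   # 3
```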

### Other Ensembles

Ensemble methods are not limited to machine learning tasks alone. For example, in the field of recommender systems they are known as hybrid recommender systems, e.g. combining a content-based recommender with a collaborative filtering one.

### The End

Next: Time Series


### Ensembles and PMML in KNIME

Ensembles and PMML in KNIME Alexander Fillbrunn 1, Iris Adä 1, Thomas R. Gabriel 2 and Michael R. Berthold 1,2 1 Department of Computer and Information Science Universität Konstanz Konstanz, Germany First.Last@Uni-Konstanz.De

### Supervised Learning with Unsupervised Output Separation

Supervised Learning with Unsupervised Output Separation Nathalie Japkowicz School of Information Technology and Engineering University of Ottawa 150 Louis Pasteur, P.O. Box 450 Stn. A Ottawa, Ontario,

### Presentation by: Ahmad Alsahaf. Research collaborator at the Hydroinformatics lab - Politecnico di Milano MSc in Automation and Control Engineering

Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen 9-October 2015 Presentation by: Ahmad Alsahaf Research collaborator at the Hydroinformatics lab - Politecnico di

### Predicting Flight Delays

Predicting Flight Delays Dieterich Lawson jdlawson@stanford.edu William Castillo will.castillo@stanford.edu Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing

### TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

### UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee

UNDERSTANDING THE EFFECTIVENESS OF BANK DIRECT MARKETING Tarun Gupta, Tong Xia and Diana Lee 1. Introduction There are two main approaches for companies to promote their products / services: through mass

### Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

### ENSEMBLE METHODS FOR CLASSIFIERS

Chapter 45 ENSEMBLE METHODS FOR CLASSIFIERS Lior Rokach Department of Industrial Engineering Tel-Aviv University liorr@eng.tau.ac.il Abstract Keywords: The idea of ensemble methodology is to build a predictive

### Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC

Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain

### C19 Machine Learning

C9 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks

### An Ensemble Method for Large Scale Machine Learning with Hadoop MapReduce

An Ensemble Method for Large Scale Machine Learning with Hadoop MapReduce by Xuan Liu Thesis submitted to the Faculty of Graduate and Postdoctoral Studies In partial fulfillment of the requirements For

### A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication

2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management

### Knowledge Discovery and Data Mining 1 (VO) (707.003)

Knowledge Discovery and Data Mining 1 (VO) (707.003) Denis Helic KTI, TU Graz Oct 1, 2015 Denis Helic (KTI, TU Graz) KDDM1 Oct 1, 2015 1 / 55 Lecturer Name: Denis Helic Office: IWT, Inffeldgasse 13, 5th

### HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

### ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

### Distributed forests for MapReduce-based machine learning

Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

### Machine Learning. Term 2012/2013 LSI - FIB. Javier Béjar cbea (LSI - FIB) Machine Learning Term 2012/2013 1 / 34

Machine Learning Javier Béjar cbea LSI - FIB Term 2012/2013 Javier Béjar cbea (LSI - FIB) Machine Learning Term 2012/2013 1 / 34 Outline 1 Introduction to Inductive learning 2 Search and inductive learning

### DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

### Maschinelles Lernen mit MATLAB

Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical

### Microsoft Azure Machine learning Algorithms

Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql Tomaz.kastrun@gmail.com http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation

### Classification and Regression by randomforest

Vol. 2/3, December 02 18 Classification and Regression by randomforest Andy Liaw and Matthew Wiener Introduction Recently there has been a lot of interest in ensemble learning methods that generate many

### Machine Learning. Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos)

Machine Learning Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos) What Is Machine Learning? A computer program is said to learn from experience E with respect to some class of

### Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News

Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News Sushilkumar Kalmegh Associate Professor, Department of Computer Science, Sant Gadge Baba Amravati