# Boosting.

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 . Machine Learning Boosting Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg Acknowledgment Slides are based on tutorials from Robert Schapire and Gunnar Raetsch Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 1

2 Overview of Today s Lecture: Boosting Motivation AdaBoost examples Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 2

3 Motivation often, simple rules already yield a reasonable classification performance e.g. spam filter: buy now would already be a good indicator for spam, doing better than random idea: use several of these simple rules ( decisions stumps ) and combine them to do classification ( committee appraoch) Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 3

4 Boosting Boosting is a general method to combine simple, rough rules into a highly accurate decision model. assumes base learners (or weak learners ) that do at least slightly better than random, e.g. 53% for a two-class classification. In other words the error is strictly smaller than 0.5. given sufficient data, a boosting algorithm can provably construct a single classifier with very high accuracy (say, e.g. 99%) Algorithm: AdaBoost (Freund and Schapire, 1995) Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 4

5 General Framework for Boosting training set (x 1, y 1 ),...,(x p, y p ), where y i { 1,1} and x i X For t = 1... T construct a distribution D t = (d t 1,..., d t p) on the training set, where d t i is the probability of occurence of training pattern i (also called weight of i) train a base learner h t : X { 1,+1} according to the training set distribution D t with a small error ǫ t, ǫ t = Pr Dt [h t (x i ) y i ] output final classifier H final as a combination of the T hypothesis h 1,..., h T Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 5

6 AdaBoost - Details (Freund and Schapire, 1995) the final hypothesis is a weighted sum of the individual hypothesis: H final (x) = sign( t α t h t (x)) Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 6

7 AdaBoost - Details (Freund and Schapire, 1995) the final hypothesis is a weighted sum of the individual hypothesis: H final (x) = sign( t α t h t (x)) hypothesis are weighted depending on the error they produce: α t := 1 2 ln 1 ǫ t ǫ t e.g: ǫ t = 0.1 α t = 2.197; ǫ t = 0.4 α t = 0.41; ǫ t = 0.5 α t = 0; the above choice of α can be shown to minimize an upper bound of the final hypothesis error (see e.g. (Schapire, 2003) for details) Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 6

8 AdaBoost - Details (cont d) (Freund and Schapire, 1995) training pattern probabilities are recomputed, such that wrongly classified patterns gets a higher probability in the next round: d t+1 i := dt i Z t { exp(αt ), if y i h t (x i ) exp( α t ), else = dt i Z t exp( α t y i h t (x i )) Z t is a normalisation factor, such that i dt i = 1 (probability distribution), i.e. Z t = d t i exp( α t y i h t (x i )) i Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 7

9 AdaBoost (Freund and Schapire, 1995) Initialize D 1 = (1/p,...,1/p), p is the number of training patterns For t = 1... T: train base learner h t according to distribution D t determine expected error: ǫ t := p d t i I(h(x i ) y i ), i=1 where I(A) = 1, if A is true, 0, else. compute hypothesis weight and new pattern distribution: α t := 1 2 ln 1 ǫ t ǫ t d t+1 i := dt i Z t exp( α t y i h t (x i )), (Z t normalisation factor) Final hypothesis: H final (x) = sign( t α t h t (x)) Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 8

10 Example (1) Example taken from (R. Schapire) Hypothesis: simple splits parallel to one axis Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 9

11 Example (2) Example taken from (R. Schapire) ǫ 1 = 0.3, α 1 = 0.42, wrong pattern: d 2 i = 0.166, correct: d2 i = 0.07 computation see next slide Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 10

12 ǫ 1 = p i=1 d 1 i I(h(x i ) y i ) = = 0.3 α 1 = 1 2 ln 1 ǫ 1 ǫ 1 = 1 2 ln = 0.42 wrong patterns: d 1 i exp(α 1) = 0.1 exp(0.42) = correct patterns: d 1 i exp( α 1) = 0.1 exp( 0.42) = Z 1 = = wrong patterns: d 2 i = 0.152/0.918 = correct patterns: d 2 i = 0.066/0.918 = 0.07 Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 11

13 Example (3) Example taken from (R. Schapire) ǫ 2 = 0.21, α 2 = 0.65 computation see next slide Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 12

14 ǫ 2 = p d 2 i I(h(x i ) y i ) = = 0.21 i=1 α 1 = 1 2 ln 1 ǫ 2 ǫ 2 = 1 2 ln = 0.65 Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 13

15 Example (4) Example taken from (R. Schapire) ǫ 2 = 0.14, α 2 = 0.92 Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 14

16 Example (5) Example taken from (R. Schapire) Final classifier: Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 15

17 Further examples (1) Taken from (Meir, Raetsch, 2003): AdaBoost on a 2D toy data set: color indicates the label and the diameter is proportional to the weight of the example. Dashed lines show decision boundaries of the single classifiers (up to 5th iteration). Solid line shows the decision line of the combined classifier. In the last two plots the decision line of Bagging is plotted for a comparison. Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 16

18 Performance of AdaBoost In case of low noise in the training data, AdaBoost can reasonbly learn even complex decision boundaries without overfitting Figure taken from G. Raetsch, Tutorial MLSS 03 Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 17

19 Practical Application: Face Recognition (Viola and Jones, 2004) problem: find faces in photographs and movies weak classifiers: detec light/ dark rectangles in images many tricks to make it fast and accurate Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 18

20 Further aspects Boosting can be applied to any ML classifier: MLPs, Decision Trees,... related approach: Bagging (Breiman): training sample distribution remains unchanged. For each training of a hypotheses, another set of training samples is drawn with putting back. boosting performs usually very well with respect to non overfitting the data, if the noise in the data is low. While this is astonishing at first glance, it can be theortically explained by analyzing the margins of the classifiers. if the data has a higher noise level, boosting runs into the problem of overfitting. Regularisation methods exist, that try to avoid this (e.g. by restricting the weight of notorious outliers). extension to multi-class problems and regression problems exist. Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 19

21 References R. Schapire: A boosting tutorial. schapire Meir and G. Raetsch, 2003: An introduction to boosting and leveraging 2003 G. Raetsch: Tutorial at MLSS 2003 Viola and Jones, 2004: Fast Face Recognition Martin Riedmiller, Albert-Ludwigs-Universität Freiburg, Machine Learning 20

### Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Ensembles 2 Learning Ensembles Learn multiple alternative definitions of a concept using different training

### Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

### Active Learning with Boosting for Spam Detection

Active Learning with Boosting for Spam Detection Nikhila Arkalgud Last update: March 22, 2008 Active Learning with Boosting for Spam Detection Last update: March 22, 2008 1 / 38 Outline 1 Spam Filters

### Model Combination. 24 Novembre 2009

Model Combination 24 Novembre 2009 Datamining 1 2009-2010 Plan 1 Principles of model combination 2 Resampling methods Bagging Random Forests Boosting 3 Hybrid methods Stacking Generic algorithm for mulistrategy

### CS570 Data Mining Classification: Ensemble Methods

CS570 Data Mining Classification: Ensemble Methods Cengiz Günay Dept. Math & CS, Emory University Fall 2013 Some slides courtesy of Han-Kamber-Pei, Tan et al., and Li Xiong Günay (Emory) Classification:

### Version Spaces. riedmiller@informatik.uni-freiburg.de

. Machine Learning Version Spaces Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de

### Motivation. Motivation. Can a software agent learn to play Backgammon by itself? Machine Learning. Reinforcement Learning

Motivation Machine Learning Can a software agent learn to play Backgammon by itself? Reinforcement Learning Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut

### AdaBoost. Jiri Matas and Jan Šochman. Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz

AdaBoost Jiri Matas and Jan Šochman Centre for Machine Perception Czech Technical University, Prague http://cmp.felk.cvut.cz Presentation Outline: AdaBoost algorithm Why is of interest? How it works? Why

### CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes

### Introduction to Machine Learning. Rob Schapire Princeton University. schapire

Introduction to Machine Learning Rob Schapire Princeton University www.cs.princeton.edu/ schapire Machine Learning studies how to automatically learn to make accurate predictions based on past observations

### L25: Ensemble learning

L25: Ensemble learning Introduction Methods for constructing ensembles Combination strategies Stacked generalization Mixtures of experts Bagging Boosting CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna

### Source. The Boosting Approach. Example: Spam Filtering. The Boosting Approach to Machine Learning

Source The Boosting Approach to Machine Learning Notes adapted from Rob Schapire www.cs.princeton.edu/~schapire CS 536: Machine Learning Littman (Wu, TA) Example: Spam Filtering problem: filter out spam

### Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

### Ensemble Data Mining Methods

Ensemble Data Mining Methods Nikunj C. Oza, Ph.D., NASA Ames Research Center, USA INTRODUCTION Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods

### Data Mining. Nonlinear Classification

Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

### Online Algorithms: Learning & Optimization with No Regret.

Online Algorithms: Learning & Optimization with No Regret. Daniel Golovin 1 The Setup Optimization: Model the problem (objective, constraints) Pick best decision from a feasible set. Learning: Model the

### How Boosting the Margin Can Also Boost Classifier Complexity

Lev Reyzin lev.reyzin@yale.edu Yale University, Department of Computer Science, 51 Prospect Street, New Haven, CT 652, USA Robert E. Schapire schapire@cs.princeton.edu Princeton University, Department

### REVIEW OF ENSEMBLE CLASSIFICATION

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IJCSMC, Vol. 2, Issue.

### Data Mining Methods: Applications for Institutional Research

Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014

### A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

### FilterBoost: Regression and Classification on Large Datasets

FilterBoost: Regression and Classification on Large Datasets Joseph K. Bradley Machine Learning Department Carnegie Mellon University Pittsburgh, PA 523 jkbradle@cs.cmu.edu Robert E. Schapire Department

### Local features and matching. Image classification & object localization

Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

### Reinforcement Learning

Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Joschka Bödecker AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg jboedeck@informatik.uni-freiburg.de

### Reinforcement Learning

Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Martin Lauer AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg martin.lauer@kit.edu

### Evaluation in Machine Learning (2) Aims. Introduction

Acknowledgements Evaluation in Machine Learning (2) 15s1: COMP9417 Machine Learning and Data Mining School of Computer Science and Engineering, University of New South Wales Material derived from slides

### An Introduction to Ensemble Learning in Credit Risk Modelling

An Introduction to Ensemble Learning in Credit Risk Modelling October 15, 2014 Han Sheng Sun, BMO Zi Jin, Wells Fargo Disclaimer The opinions expressed in this presentation and on the following slides

### MHI3000 Big Data Analytics for Health Care Final Project Report

MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given

### Chapter 11 Boosting. Xiaogang Su Department of Statistics University of Central Florida - 1 -

Chapter 11 Boosting Xiaogang Su Department of Statistics University of Central Florida - 1 - Perturb and Combine (P&C) Methods have been devised to take advantage of the instability of trees to create

### Machine Learning Algorithms for Classification. Rob Schapire Princeton University

Machine Learning Algorithms for Classification Rob Schapire Princeton University Machine Learning studies how to automatically learn to make accurate predictions based on past observations classification

### Decision Trees from large Databases: SLIQ

Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values

### Maschinelles Lernen mit MATLAB

Maschinelles Lernen mit MATLAB Jérémy Huard Applikationsingenieur The MathWorks GmbH 2015 The MathWorks, Inc. 1 Machine Learning is Everywhere Image Recognition Speech Recognition Stock Prediction Medical

### Lecture Slides for INTRODUCTION TO. Machine Learning. By: Postedited by: R.

Lecture Slides for INTRODUCTION TO Machine Learning By: alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml Postedited by: R. Basili Learning a Class from Examples Class C of a family car Prediction:

### Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods Jerzy B laszczyński 1, Krzysztof Dembczyński 1, Wojciech Kot lowski 1, and Mariusz Paw lowski 2 1 Institute of Computing

### Ensembles and PMML in KNIME

Ensembles and PMML in KNIME Alexander Fillbrunn 1, Iris Adä 1, Thomas R. Gabriel 2 and Michael R. Berthold 1,2 1 Department of Computer and Information Science Universität Konstanz Konstanz, Germany First.Last@Uni-Konstanz.De

### Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

### Introduction to Machine Learning

Introduction to Machine Learning Prof. Alexander Ihler Prof. Max Welling icamp Tutorial July 22 What is machine learning? The ability of a machine to improve its performance based on previous results:

### Classification and Regression Trees

Classification and Regression Trees Bob Stine Dept of Statistics, School University of Pennsylvania Trees Familiar metaphor Biology Decision tree Medical diagnosis Org chart Properties Recursive, partitioning

### CPSC 340: Machine Learning and Data Mining. K-Means Clustering Fall 2015

CPSC 340: Machine Learning and Data Mining K-Means Clustering Fall 2015 Admin Assignment 1 solutions posted after class. Tutorials for Assignment 2 on Monday. Random Forests Random forests are one of the

### Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

### Application of Face Recognition to Person Matching in Trains

Application of Face Recognition to Person Matching in Trains May 2008 Objective Matching of person Context : in trains Using face recognition and face detection algorithms With a video-surveillance camera

### DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

### Training Methods for Adaptive Boosting of Neural Networks for Character Recognition

Submission to NIPS*97, Category: Algorithms & Architectures, Preferred: Oral Training Methods for Adaptive Boosting of Neural Networks for Character Recognition Holger Schwenk Dept. IRO Université de Montréal

### Data Mining Classification: Alternative Techniques. Instance-Based Classifiers. Lecture Notes for Chapter 5. Introduction to Data Mining

Data Mining Classification: Alternative Techniques Instance-Based Classifiers Lecture Notes for Chapter 5 Introduction to Data Mining by Tan, Steinbach, Kumar Set of Stored Cases Atr1... AtrN Class A B

### Ensemble Methods for Noise Elimination in Classification Problems

Ensemble Methods for Noise Elimination in Classification Problems Sofie Verbaeten and Anneleen Van Assche Department of Computer Science Katholieke Universiteit Leuven Celestijnenlaan 200 A, B-3001 Heverlee,

### Decompose Error Rate into components, some of which can be measured on unlabeled data

Bias-Variance Theory Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Decomposition for Regression Bias-Variance Decomposition for Classification Bias-Variance

### Case Study Report: Building and analyzing SVM ensembles with Bagging and AdaBoost on big data sets

Case Study Report: Building and analyzing SVM ensembles with Bagging and AdaBoost on big data sets Ricardo Ramos Guerra Jörg Stork Master in Automation and IT Faculty of Computer Science and Engineering

### Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right

### Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

### Beating the NCAA Football Point Spread

Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over

### Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

### 1 The Consistency Model

COS 511: Foundations of Machine Learning Rob Schapire Lecture #2 Scribe: Tamara Broderick February 9, 2006 1 The Consistency Model c : X {0,1} is a concept. C is a concept class. C is learnable (in the

### Supervised Learning (Big Data Analytics)

Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used

### Chapter 6. The stacking ensemble approach

82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

### CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

### On the effect of data set size on bias and variance in classification learning

On the effect of data set size on bias and variance in classification learning Abstract Damien Brain Geoffrey I Webb School of Computing and Mathematics Deakin University Geelong Vic 3217 With the advent

### Simple and efficient online algorithms for real world applications

Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,

### Asymmetric Gradient Boosting with Application to Spam Filtering

Asymmetric Gradient Boosting with Application to Spam Filtering Jingrui He Carnegie Mellon University 5 Forbes Avenue Pittsburgh, PA 523 USA jingruih@cs.cmu.edu ABSTRACT In this paper, we propose a new

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### Ensemble Methods. Adapted from slides by Todd Holloway h8p://abeau<fulwww.com/2007/11/23/ ensemble- machine- learning- tutorial/

Ensemble Methods Adapted from slides by Todd Holloway h8p://abeau

### Predicting borrowers chance of defaulting on credit loans

Predicting borrowers chance of defaulting on credit loans Junjie Liang (junjie87@stanford.edu) Abstract Credit score prediction is of great interests to banks as the outcome of the prediction algorithm

### Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

### BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

### Introduction to Learning & Decision Trees

Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing

### Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

### II. RELATED WORK. Sentiment Mining

Sentiment Mining Using Ensemble Classification Models Matthew Whitehead and Larry Yaeger Indiana University School of Informatics 901 E. 10th St. Bloomington, IN 47408 {mewhiteh, larryy}@indiana.edu Abstract

### Solving Regression Problems Using Competitive Ensemble Models

Solving Regression Problems Using Competitive Ensemble Models Yakov Frayman, Bernard F. Rolfe, and Geoffrey I. Webb School of Information Technology Deakin University Geelong, VIC, Australia {yfraym,brolfe,webb}@deakin.edu.au

### Incremental SampleBoost for Efficient Learning from Multi-Class Data Sets

Incremental SampleBoost for Efficient Learning from Multi-Class Data Sets Mohamed Abouelenien Xiaohui Yuan Abstract Ensemble methods have been used for incremental learning. Yet, there are several issues

### A Simple Introduction to Support Vector Machines

A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear

### Monday Morning Data Mining

Monday Morning Data Mining Tim Ruhe Statistische Methoden der Datenanalyse Outline: - data mining - IceCube - Data mining in IceCube Computer Scientists are different... Fakultät Physik Fakultät Physik

### COMP 551 Applied Machine Learning Lecture 6: Performance evaluation. Model assessment and selection.

COMP 551 Applied Machine Learning Lecture 6: Performance evaluation. Model assessment and selection. Instructor: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise

### Introduction.

. Machine Learning Introduction Prof. Dr. Martin Riedmiller AG Maschinelles Lernen und Natürlichsprachliche Systeme Institut für Informatik Technische Fakultät Albert-Ludwigs-Universität Freiburg riedmiller@informatik.uni-freiburg.de

### Classification using Logistic Regression

Classification using Logistic Regression Ingmar Schuster Patrick Jähnichen using slides by Andrew Ng Institut für Informatik This lecture covers Logistic regression hypothesis Decision Boundary Cost function

### C19 Machine Learning

C9 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks

### COS 511: Foundations of Machine Learning. Rob Schapire Lecture #14 Scribe: Qian Xi March 30, 2006

COS 511: Foundations of Machine Learning Rob Schapire Lecture #14 Scribe: Qian Xi March 30, 006 In the previous lecture, we introduced a new learning model, the Online Learning Model. After seeing a concrete

### Machine Learning Final Project Spam Email Filtering

Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE

### 203.4770: Introduction to Machine Learning Dr. Rita Osadchy

203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:

### Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

### Ensemble Approach for the Classification of Imbalanced Data

Ensemble Approach for the Classification of Imbalanced Data Vladimir Nikulin 1, Geoffrey J. McLachlan 1, and Shu Kay Ng 2 1 Department of Mathematics, University of Queensland v.nikulin@uq.edu.au, gjm@maths.uq.edu.au

### INTRODUCTION TO NEURAL NETWORKS

INTRODUCTION TO NEURAL NETWORKS Pictures are taken from http://www.cs.cmu.edu/~tom/mlbook-chapter-slides.html http://research.microsoft.com/~cmbishop/prml/index.htm By Nobel Khandaker Neural Networks An

### Data Analytics and Business Intelligence (8696/8697)

http: // togaware. com Copyright 2014, Graham.Williams@togaware.com 1/36 Data Analytics and Business Intelligence (8696/8697) Ensemble Decision Trees Graham.Williams@togaware.com Data Scientist Australian

### COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised

### Evaluation in Machine Learning (1) Evaluation in machine learning. Aims. 14s1: COMP9417 Machine Learning and Data Mining.

Acknowledgements Evaluation in Machine Learning (1) 14s1: COMP9417 Machine Learning and Data Mining School of Computer Science and Engineering, University of New South Wales March 19, 2014 Material derived

### The Artificial Prediction Market

The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

### Advice for applying Machine Learning

Advice for applying Machine Learning Andrew Ng Stanford University Today s Lecture Advice on how getting learning algorithms to different applications. Most of today s material is not very mathematical.

### COMP9417: Machine Learning and Data Mining

COMP9417: Machine Learning and Data Mining A note on the two-class confusion matrix, lift charts and ROC curves Session 1, 2003 Acknowledgement The material in this supplementary note is based in part

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 7 Multiple Linear Regression (Contd.) This is my second lecture on Multiple Linear Regression

### Journal of Asian Scientific Research COMPARISON OF THREE CLASSIFICATION ALGORITHMS FOR PREDICTING PM2.5 IN HONG KONG RURAL AREA.

Journal of Asian Scientific Research journal homepage: http://aesswebcom/journal-detailphp?id=5003 COMPARISON OF THREE CLASSIFICATION ALGORITHMS FOR PREDICTING PM25 IN HONG KONG RURAL AREA Yin Zhao School

### The More Trees, the Better! Scaling Up Performance Using Random Forest in SAS Enterprise Miner

Paper 3361-2015 The More Trees, the Better! Scaling Up Performance Using Random Forest in SAS Enterprise Miner Narmada Deve Panneerselvam, Spears School of Business, Oklahoma State University, Stillwater,

### Learning. Artificial Intelligence. Learning. Types of Learning. Inductive Learning Method. Inductive Learning. Learning.

Learning Learning is essential for unknown environments, i.e., when designer lacks omniscience Artificial Intelligence Learning Chapter 8 Learning is useful as a system construction method, i.e., expose

### Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal

Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether

### Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms

Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms Yin Zhao School of Mathematical Sciences Universiti Sains Malaysia (USM) Penang, Malaysia Yahya

### Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC

Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain

### Robust Real-Time Face Detection

Robust Real-Time Face Detection International Journal of Computer Vision 57(2), 137 154, 2004 Paul Viola, Michael Jones 授 課 教 授 : 林 信 志 博 士 報 告 者 : 林 宸 宇 報 告 日 期 :96.12.18 Outline Introduction The Boost

### Ensemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008

Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles

### CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x-5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

### Data Mining for Model Creation. Presentation by Paul Below, EDS 2500 NE Plunkett Lane Poulsbo, WA USA 98370 paul.below@eds.

Sept 03-23-05 22 2005 Data Mining for Model Creation Presentation by Paul Below, EDS 2500 NE Plunkett Lane Poulsbo, WA USA 98370 paul.below@eds.com page 1 Agenda Data Mining and Estimating Model Creation

### Chapter 7. AdaBoost. 7.1 Introduction. Zhi-Hua Zhou and Yang Yu. Contents

Chapter 7 AdaBoost Zhi-Hua Zhou and Yang Yu Contents 7.1 Introduction... 127 7.2 The Algorithm... 128 7.2.1 Notations... 128 7.2.2 A General Boosting Procedure... 129 7.2.3 The AdaBoost Algorithm... 130

### Getting Even More Out of Ensemble Selection

Getting Even More Out of Ensemble Selection Quan Sun Department of Computer Science The University of Waikato Hamilton, New Zealand qs12@cs.waikato.ac.nz ABSTRACT Ensemble Selection uses forward stepwise