# Introduction To Ensemble Learning


Educational Series: Introduction To Ensemble Learning

Dr. Oliver Steinki, CFA, FRM, and Ziad Mohammad

July 2015

### What Is Ensemble Learning?

In broad terms, ensemble learning is a procedure in which multiple learner modules are applied to a dataset to extract multiple predictions, which are then combined into one composite prediction. The process is commonly broken down into two tasks: first, constructing a set of base learners from the training data; second, combining some or all of these models to form a unified prediction model. "Ensemble learning is a process that uses a set of models, each of them obtained by applying a learning process to a given problem. This set of models (ensemble) is integrated in some way to obtain the final prediction." (Moreira, et al. 2012, 3)

Ensemble methods are mathematical procedures that start with a set of base learner models. Multiple forecasts based on the different base learners are constructed and combined into an enhanced composite model. The composite model generally delivers higher prediction accuracy than the average of the individual base models' predictions, providing a critical boost to forecasting ability and decision-making accuracy. Ensemble methods attempt to reduce forecasting bias while simultaneously increasing robustness and reducing variance. They are expected to be useful when there is uncertainty in choosing the best prediction model and when it is critical to avoid large prediction errors. Both criteria clearly apply to our context of predicting returns of financial securities.

### The Rationale Behind Ensemble Methods

Dietterich (2000b) lists three fundamental reasons why ensembles are successful in machine learning applications. The first one is statistical.
Models can be seen as searching a hypothesis space H to identify the best hypothesis. The statistical problem arises because in practice we often have only limited datasets: many different hypotheses in H fit the data reasonably well, and we do not know which of them has the best generalization performance, which makes it difficult to choose among them. Ensemble methods help avoid this issue by averaging over several models to obtain a good approximation of the unknown true hypothesis. The second reason is computational. Many models work by performing some form of local search, such as gradient descent, to minimize an error function, and these searches can get stuck in local optima. An ensemble constructed by starting the local search from many different points may provide a better approximation to the true unknown function. The third argument presented by Dietterich (2000b) is representational. In many situations, the unknown function we are looking for is not included in H. However, a combination of several hypotheses drawn from H can enlarge the space of representable functions, which may then include the unknown true function.

### Common Approaches To Ensemble Methods

The ensemble learning process can be broken into different stages depending on the application and the approach implemented. Following Roli et al. (2001), we categorize the learning process into three steps: ensemble generation, ensemble pruning and ensemble integration (Moreira, et al. 2012). In the ensemble generation phase, a number of base learner models are generated according to a chosen learning procedure. In the ensemble pruning step, some of these base models are filtered out based on various mathematical procedures to improve overall ensemble accuracy. In the ensemble integration phase, the remaining learner models are combined intelligently into one unified prediction that is more accurate than the average of the individual base models.

### Ensemble Generation

Ensemble generation is the first step in the application of ensemble methods. Its goal is to obtain a set of calibrated models, each producing an individual prediction of the analyzed outcome. An ensemble is called homogeneous if all base models belong to the same class of models in terms of their predictive function; if the base models are drawn from different model classes, the ensemble is called heterogeneous (Mendes-Moreira et al. 2012). The heterogeneous approach is expected to yield a more diverse ensemble with generally better performance (Wichard et al., 2003).
Next to the accuracy of the base models, diversity is considered one of the key success factors of ensembles (Perrone and Cooper, 1993). However, we have limited control over diversity in the ensemble generation phase, since the forecasting models used may have correlated forecasting errors. By calibrating a larger number of models from different classes of forecasting models, we increase the likelihood of obtaining an accurate and diverse subset of base models, albeit at the expense of higher computational requirements. This increased probability is the rationale for introducing a diverse range of base learner models. Ensemble generation methods can be classified according to how they attempt to generate different base models: either through manipulating the data or through manipulating the modeling process (Mendes-Moreira et al., 2012).
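As an illustration, a heterogeneous ensemble can be generated by calibrating base learners from different model classes on the same training sample. The sketch below uses toy data and simple illustrative model classes, not the paper's model set:

```python
import numpy as np

# Toy training data (assumption): a noisy sine wave.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 50)

# Heterogeneous generation: base models drawn from different model classes.
linear = np.polynomial.Polynomial.fit(x, y, deg=1)   # linear regression
cubic = np.polynomial.Polynomial.fit(x, y, deg=3)    # cubic polynomial

def knn_mean(x0, k=5):
    """k-nearest-neighbour regressor: mean target of the k closest points."""
    nearest = np.argsort(np.abs(x - x0))[:k]
    return y[nearest].mean()

# Each calibrated model delivers its own prediction of the analyzed outcome.
x_new = 0.25
base_predictions = [linear(x_new), cubic(x_new), knn_mean(x_new)]
print(base_predictions)
```

Each element of `base_predictions` is one base learner's forecast; the pruning and integration steps described next decide which of them to keep and how to combine them.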

Data manipulation can be further broken down into subsampling from the training data and manipulating input features or output variables. Manipulation of the modeling process can likewise be subdivided: it can be achieved by using different parameter sets, by manipulating the induction algorithm, or by manipulating the resulting model.

### Ensemble Pruning

The methods introduced for ensemble generation create a diverse set of models. However, the resulting set of predictor models does not ensure the best possible accuracy. Ensemble pruning describes the process of choosing the appropriate subset from the candidate pool of base models, with the aim of improving ensemble accuracy and/or reducing computational cost. Pruning methods can be divided into partitioning-based and search-based methods. Partitioning-based approaches split the base models into subgroups based on a predetermined criterion. Search-based approaches try to find a subset of models with improved ensemble accuracy by either adding or removing models from the initial candidate pool. Furthermore, the different pruning approaches can be classified according to their stopping criterion: direct ensemble pruning methods determine the number of models ex ante, whereas evaluation ensemble pruning methods determine the number of models according to the ensemble accuracy (Mendes-Moreira et al., 2012).

### Ensemble Integration

Following the ensemble generation and ensemble pruning steps, the last step of the ensemble learning process is ensemble integration. It describes how the remaining calibrated models are combined into a single composite model. Integration approaches can be broadly classified as combination or selection: in the combination approach, all learner models are combined into one composite model; in the selection approach, only the most promising model(s) are used to construct the final composite model.
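A minimal sketch of a search-based pruning pass with a direct (ex-ante) stopping criterion, followed by equal-weight combination. The forecasts and the helper name `forward_prune` are illustrative assumptions, not a method from the paper:

```python
import numpy as np

def forward_prune(predictions, y, max_size=3):
    """Search-based pruning: greedily add the base model that most reduces
    the MSE of the equal-weight average, stopping at a subset size fixed ex ante."""
    chosen, pool = [], list(range(len(predictions)))
    while pool and len(chosen) < max_size:
        def ensemble_mse(i):
            composite = np.mean([predictions[j] for j in chosen + [i]], axis=0)
            return np.mean((composite - y) ** 2)
        best = min(pool, key=ensemble_mse)
        chosen.append(best)
        pool.remove(best)
    return chosen

# Toy candidate pool: five base forecasts with increasing error (assumption).
rng = np.random.default_rng(1)
y = rng.normal(size=200)
predictions = [y + rng.normal(0.0, s, 200) for s in (0.2, 0.4, 0.6, 0.8, 1.0)]

kept = forward_prune(predictions, y)
composite = np.mean([predictions[j] for j in kept], axis=0)  # integration step
print(kept, np.mean((composite - y) ** 2))
```

The greedy search keeps the most accurate model first and then adds whichever candidate best complements the current subset, which is exactly the accuracy-plus-diversity trade-off the pruning step aims at.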
A common challenge in the integration phase is multicollinearity: correlation between the base learner models' predictions, which can lower the accuracy of the final ensemble prediction. Suggested remedies include several methods applied during the ensemble generation or ensemble pruning step to guarantee an accurate, yet diverse (and hence not perfectly correlated) set of base models (Steinki 2014, 109). A detailed review of ensemble methods can be found in chapter 4 of Steinki's doctoral thesis.
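The cost of multicollinearity can be illustrated numerically: if n base models have forecast errors with variance σ² and pairwise correlation ρ, the error variance of their equal-weight average is σ²(1/n + (1 − 1/n)ρ), so correlated errors largely survive averaging. A simulated sketch with hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
n_models, n_obs, sigma = 10, 100_000, 1.0

def averaged_error_variance(rho):
    """Variance of the mean of n_models error series with pairwise correlation rho."""
    common = rng.normal(0.0, sigma, n_obs)  # shared error component
    errors = [np.sqrt(rho) * common + np.sqrt(1.0 - rho) * rng.normal(0.0, sigma, n_obs)
              for _ in range(n_models)]
    return np.mean(errors, axis=0).var()

print(averaged_error_variance(0.0))  # near sigma**2 / n_models = 0.1
print(averaged_error_variance(0.9))  # near 0.1 + 0.9 * 0.9 = 0.91
```

With uncorrelated errors the ensemble diversifies away about 90% of the error variance; with ρ = 0.9 almost none of it, which is why generation and pruning aim for decorrelated base models.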

### Success Factors Of Ensemble Methods

A successful ensemble can be described as one whose predictors are individually accurate yet commit their errors in different parts of the input space. An important factor in measuring the performance of an ensemble is the generalization error, which measures how a learning module performs on out-of-sample data: the difference between the module's predictions and the actual results. Analyzing the generalization error allows us to understand its source and the correct technique to minimize it, and to probe the characteristics of the base predictors that cause it. To improve the forecasting accuracy of an ensemble, the generalization error should be minimized by increasing the ambiguity without increasing the bias; in practice, such an approach can be challenging to achieve. Ambiguity is improved by increasing the diversity of the base learners, for example by using a more diverse set of parameters to induce the learning process. As diversity increases, the space of representable prediction functions also grows, which can improve accuracy but at the cost of a larger variance contribution to the generalization error. Brown (2004) provides a good discussion of the relation between ambiguity and covariance. An important result from the study of this relation is the confirmation that "it is not possible to maximize the ensemble ambiguity without affecting the ensemble bias component as well, i.e., it is not possible to maximize the ambiguity component and minimize the bias component simultaneously" (Moreira, et al. 2012, 8).
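The role of ambiguity can be made precise for squared error and an equal-weight ensemble: the ensemble's error equals the average error of the base learners minus their average ambiguity (their spread around the ensemble forecast). A numerical check of this identity on toy data:

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(size=500)                                   # targets (toy data)
preds = [y + rng.normal(0.0, 0.5, 500) for _ in range(7)]  # base forecasts
f_bar = np.mean(preds, axis=0)                             # equal-weight ensemble

avg_error = np.mean([(p - y) ** 2 for p in preds])         # mean individual error
avg_ambiguity = np.mean([(p - f_bar) ** 2 for p in preds]) # mean ambiguity
ensemble_error = np.mean((f_bar - y) ** 2)

# Identity: ensemble error = average error - average ambiguity (holds exactly)
print(ensemble_error, avg_error - avg_ambiguity)
```

Since ambiguity is subtracted, raising it lowers the ensemble error as long as the base learners' own errors do not rise by more, which is exactly the trade-off discussed above.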
Dietterich (2000b) states that an important criterion for successful ensemble methods is to construct individual learning algorithms with prediction accuracy above 50% whose errors are at least somewhat uncorrelated.

### Proven Applications of Ensemble Methods

Numerous academic studies have analyzed the success of ensemble methods in diverse application fields such as medicine (Polikar et al., 2008), climate forecasting (Stott and Forest, 2007), image retrieval (Tsoumakas et al., 2005) and astrophysics (Bazell and Aha, 2001). Several studies have shown that ensemble predictions can often be much more accurate than the forecasts of the base learners (Freund and Schapire, 1996; Bauer, 1999; Dietterich, 2000a), reduce variance (Breiman, 1996; Lam and Suen, 1997), or reduce both bias and variance (Breiman, 1998). Ensemble methods have been successful in solving numerous statistical problems.
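Dietterich's condition (accuracy above 50%, errors not perfectly correlated) can be illustrated with a simple binomial computation: if n independent classifiers are each correct with probability p > 0.5, the accuracy of their majority vote rises toward 1 as n grows, while for p < 0.5 it collapses toward 0.

```python
from math import comb

def majority_vote_accuracy(p, n):
    """Probability that the majority of n independent classifiers (n odd),
    each correct with probability p, casts the correct vote."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

for n in (1, 11, 101):
    print(n, round(majority_vote_accuracy(0.6, n), 4))   # improves with n
print(round(majority_vote_accuracy(0.4, 101), 4))        # below-chance learners hurt
```

The independence assumption is the idealized best case; correlated errors weaken the effect, which is why the criterion also demands decorrelation.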

Applications of ensemble methods span a broad range of industries: air traffic controllers use ensembles to minimize airplane arrival delays, and numerous weather forecasting agencies implement ensemble learning to improve forecasting accuracy. A public competition by Netflix offered a monetary reward to any contestant who could improve its film-rating prediction algorithm; the winning team's approach was based on ensemble methods. EVOLUTIQ's systematic multi-asset class strategy, the Pred X Model, is based on the application of ensemble methods using Lévy-based market models to predict daily market moves. The investment strategy is built upon scholarly research on the applicability of ensemble methods to enhance option pricing models based on Lévy processes, conducted by Dr. Oliver Steinki.

### The Netflix Competition

The Netflix Competition of 2009 was a public competition with a grand prize of US$1,000,000 for any contestant who could develop a collaborative filtering algorithm that predicted user ratings for films with an RMSE (root-mean-squared error) below the competition's target. Contestants were given a dataset consisting of seven years of past film-rating data, without any further information on the users or the films. The winning team's approach was based on gradient boosted decision trees, a technique applied to regression problems to produce predictions; the prediction was based on an ensemble of 500 decision trees, which were used as base learners and combined to formulate the final prediction of film ratings. In 2009, BellKor's Pragmatic Chaos won the competition, providing the solution with the lowest RMSE among the contestants and better prediction capabilities than the prevailing Netflix algorithm.
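Gradient boosted decision trees build an additive ensemble in which each new tree fits the residuals of the current composite prediction. A self-contained sketch with depth-1 trees (stumps) on toy one-dimensional data, as an illustration of the technique rather than the winning team's implementation:

```python
import numpy as np

def fit_stump(x, residual):
    """Fit a depth-1 regression tree (stump) minimizing squared error."""
    best = None
    for split in np.unique(x)[:-1]:
        left, right = residual[x <= split], residual[x > split]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, split, left.mean(), right.mean())
    _, split, left_val, right_val = best
    return lambda z: np.where(z <= split, left_val, right_val)

def gradient_boost(x, y, n_rounds=100, lr=0.1):
    """Least-squares gradient boosting: each stump fits the current residuals
    (the negative gradient of the squared loss)."""
    base = y.mean()
    pred = np.full_like(y, base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - pred)
        pred = pred + lr * stump(x)
        stumps.append(stump)
    return lambda z: base + lr * sum(s(z) for s in stumps)

rng = np.random.default_rng(4)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 200)
model = gradient_boost(x, y)
print(np.mean((model(x) - y) ** 2))  # in-sample MSE well below var(y)
```

Each weak learner is deliberately simple; the accuracy comes from the staged combination, which is the same base-learner-plus-integration pattern described throughout this paper.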

### Ziad Mohammad, Sales & Research Analyst

As part of his role, Ziad splits his time between the research and sales departments. On one hand, he focuses on researching fundamental market strategies and portfolio optimization techniques; on the other, he participates in the fundraising and marketing efforts for EVOLUTIQ's recently launched multi-asset class strategy. In his previous role as a Financial Analyst at McKinsey & Company, he applied statistical and data mining techniques to data pools to extract intelligence that aided the decision-making process. Ziad recently completed his Master's degree in Advanced Finance at IE Business School, where he focused his research on emerging markets and wrote his master's thesis on bubble formations in frontier markets. He completed his bachelor's degree in Industrial Engineering at Purdue University and a diploma in Investment Banking at the Swiss Finance Academy.

### References

Allwein, E. L., R. E. Schapire, and Y. Singer. 2000. "Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers." Journal of Machine Learning Research 1.

Bauer, E. 1999. "An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants." Machine Learning 36.

Bazell, D., and D. W. Aha. 2001. "Ensembles of Classifiers for Morphological Galaxy Classification." The Astrophysical Journal.

Breiman, L. 1996. "Bagging Predictors." Machine Learning 24.

Breiman, L. 1998. "Arcing Classifiers." The Annals of Statistics 26.

Brown, G. 2004. "Diversity in Neural Network Ensembles." Ph.D. thesis, University of Birmingham.

Dietterich, T. G. 2000a. "An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization." Machine Learning 40.

Dietterich, T. G. 2000b. "Ensemble Methods in Machine Learning." In J. Kittler and F. Roli (Eds.), Multiple Classifier Systems. Springer-Verlag.

Freund, Y., and R. E. Schapire. 1996. "Experiments with a New Boosting Algorithm." Morgan Kaufmann.

Kittler, J., M. Hatef, R. P. W. Duin, and J. Matas. 1998. "On Combining Classifiers." IEEE Transactions on Pattern Analysis and Machine Intelligence 20.

Kleinberg, E. M. 1996. "An Overtraining-Resistant Stochastic Modeling Method for Pattern Recognition." The Annals of Statistics 24.

Kleinberg, E. M. 2000. "A Mathematically Rigorous Foundation for Supervised Learning." In F. Roli and J. Kittler (Eds.), Multiple Classifier Systems. Springer.

Kong, E., and T. Dietterich. 1995. "Error-Correcting Output Coding Corrects Bias and Variance." In Proceedings of the Twelfth International Conference on Machine Learning. San Francisco: Morgan Kaufmann.

Lam, L., and C. Y. Suen. 1997. "Optimal Combinations of Pattern Classifiers." Pattern Recognition Letters.

Mendes-Moreira, J., C. Soares, A. M. Jorge, and J. F. de Sousa. 2012. "Ensemble Approaches for Regression: A Survey." ACM Computing Surveys 45.

Perrone, M. P., and L. N. Cooper. 1993. "When Networks Disagree: Ensemble Methods for Hybrid Neural Networks." Brown University.

Polikar, R., A. Topalis, D. Parikh, D. Green, J. Frymiare, J. Kounios, and C. Clark. 2008. "An Ensemble Based Data Fusion Approach for Early Diagnosis of Alzheimer's Disease." Information Fusion 9.

Roli, F. 2001. "Methods for Designing Multiple Classifier Systems." In F. Roli and J. Kittler (Eds.), Multiple Classifier Systems. Springer.

Steinki, O. 2014. "An Investigation of Ensemble Methods to Improve the Bias and/or Variance of Option Pricing Models Based on Lévy Processes." Doctoral thesis, University of Manchester.

Stott, P. A., and C. E. Forest. 2007. "Ensemble Climate Predictions Using Climate Models and Observational Constraints." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 365 (1857).

Tsoumakas, G., L. Angelis, and I. Vlahavas. 2005. "Selective Fusion of Heterogeneous Classifiers." Intelligent Data Analysis 9.

Wichard, J., C. Merkwirth, and M. Ogorzalek. 2003. "Building Ensembles with Heterogeneous Models." Lecture notes, AGH University of Science and Technology.

EVOLUTIQ GmbH is issuing a series of white papers on the subject of systematic trading. These papers discuss different approaches to systematic trading and present specific trading strategies and associated risk management techniques. This is the second paper of the EVOLUTIQ educational series.

EVOLUTIQ GmbH, Schwerzistr., Freienbach, Switzerland


### Data, Measurements, Features

Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

### T3: A Classification Algorithm for Data Mining

T3: A Classification Algorithm for Data Mining Christos Tjortjis and John Keane Department of Computation, UMIST, P.O. Box 88, Manchester, M60 1QD, UK {christos, jak}@co.umist.ac.uk Abstract. This paper

### ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

### Machine Learning using MapReduce

Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

### OPTIMIZATION AND FORECASTING WITH FINANCIAL TIME SERIES

OPTIMIZATION AND FORECASTING WITH FINANCIAL TIME SERIES Allan Din Geneva Research Collaboration Notes from seminar at CERN, June 25, 2002 General scope of GRC research activities Econophysics paradigm

### Resampling for Face Recognition

Resampling for Face Recognition Xiaoguang Lu and Anil K. Jain Dept. of Computer Science & Engineering, Michigan State University East Lansing, MI 48824 {lvxiaogu,jain}@cse.msu.edu Abstract. A number of

### EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

### Reducing multiclass to binary by coupling probability estimates

Reducing multiclass to inary y coupling proaility estimates Bianca Zadrozny Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92093-0114 zadrozny@cs.ucsd.edu

### Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Unit # 10 Sajjad Haider Fall 2012 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right

### Data Mining Applications in Higher Education

Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

### Generalizing Random Forests Principles to other Methods: Random MultiNomial Logit, Random Naive Bayes, Anita Prinzie & Dirk Van den Poel

Generalizing Random Forests Principles to other Methods: Random MultiNomial Logit, Random Naive Bayes, Anita Prinzie & Dirk Van den Poel Copyright 2008 All rights reserved. Random Forests Forest of decision

### A Hybrid Approach to Learn with Imbalanced Classes using Evolutionary Algorithms

Proceedings of the International Conference on Computational and Mathematical Methods in Science and Engineering, CMMSE 2009 30 June, 1 3 July 2009. A Hybrid Approach to Learn with Imbalanced Classes using

### The More Trees, the Better! Scaling Up Performance Using Random Forest in SAS Enterprise Miner

Paper 3361-2015 The More Trees, the Better! Scaling Up Performance Using Random Forest in SAS Enterprise Miner Narmada Deve Panneerselvam, Spears School of Business, Oklahoma State University, Stillwater,

### Predicting borrowers chance of defaulting on credit loans

Predicting borrowers chance of defaulting on credit loans Junjie Liang (junjie87@stanford.edu) Abstract Credit score prediction is of great interests to banks as the outcome of the prediction algorithm

### Package acrm. R topics documented: February 19, 2015

Package acrm February 19, 2015 Type Package Title Convenience functions for analytical Customer Relationship Management Version 0.1.1 Date 2014-03-28 Imports dummies, randomforest, kernelfactory, ada Author

### Distributed forests for MapReduce-based machine learning

Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

### Data mining knowledge representation

Data mining knowledge representation 1 What Defines a Data Mining Task? Task relevant data: where and how to retrieve the data to be used for mining Background knowledge: Concept hierarchies Interestingness

### HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

### MS1b Statistical Data Mining

MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

### CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

### Random forest algorithm in big data environment

Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

### Beating the MLB Moneyline

Beating the MLB Moneyline Leland Chen llxchen@stanford.edu Andrew He andu@stanford.edu 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series

### Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

### A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data

### International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013

A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:

### Learning outcomes. Knowledge and understanding. Competence and skills

Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

### BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

### REPORT DOCUMENTATION PAGE

REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-0188 Public Reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,

### Chapter 12 Bagging and Random Forests

Chapter 12 Bagging and Random Forests Xiaogang Su Department of Statistics and Actuarial Science University of Central Florida - 1 - Outline A brief introduction to the bootstrap Bagging: basic concepts

### A New Approach for Evaluation of Data Mining Techniques

181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty

### Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors

Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann

### Data Mining Classification: Decision Trees

Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous

### Data Mining & Data Stream Mining Open Source Tools

Data Mining & Data Stream Mining Open Source Tools Darshana Parikh, Priyanka Tirkha Student M.Tech, Dept. of CSE, Sri Balaji College Of Engg. & Tech, Jaipur, Rajasthan, India Assistant Professor, Dept.

### AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

### Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning