# Why Ensembles Win Data Mining Competitions

## Transcription

1 Why Ensembles Win Data Mining Competitions

A Predictive Analytics Center of Excellence (PACE) Tech Talk, November 14, 2012. Dean Abbott, Abbott Analytics, Inc.

2 Outline

- Motivation for Ensembles
- How Ensembles are Built
- Do Ensembles Violate Occam's Razor?
- Why Do Ensembles Win?

4 Netflix Prize

- 2006: Netflix state-of-the-art (Cinematch) RMSE = …
- Prize: reduce this RMSE by 10%
- Korbell team: Progress Prize winner, with a 107-algorithm ensemble
- Top algorithm: SVD, with RMSE = …
- 2nd algorithm: Restricted Boltzmann Machine, with RMSE = …
- Mini-ensemble (SVD+RBM) has RMSE = …

5 Common Kinds of Ensembles vs. Single Models

[Figure: ensembles compared with single classifiers. From Zhuowen Tu, Ensemble Classification Methods: Bagging, Boosting, and Random Forests]

6 What are Model Ensembles?

- Combining outputs from multiple models into a single decision
- Models can be created using the same algorithm, or several different algorithms

Diagram labels: Decision Logic; Ensemble Prediction

7 Creating Model Ensembles, Step 1: Generate Component Models

Can vary data or model parameters:

- Case (record) weights: bootstrapping, sampling
- Data values: add noise, recode data
- Learning parameters: vary learning rates, pruning severity, random seeds
- Variable subsets: vary candidate inputs, features

Diagram labels: Single data set; Multiple models and predictions
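As a rough illustration of Step 1, the sketch below prepares the inputs for several component models by varying the records (bootstrapping), the variable subset, and the random seed. The function names and the dictionary layout are hypothetical and chosen only for this example; the talk does not prescribe an implementation.

```python
import random


def bootstrap_sample(rows, rng):
    """Case weights: draw len(rows) records with replacement."""
    return [rng.choice(rows) for _ in rows]


def random_variable_subset(variables, k, rng):
    """Variable subsets: pick k distinct candidate inputs."""
    return rng.sample(variables, k)


def component_model_inputs(rows, variables, n_models, k, seed=0):
    """Build one (records, variables, seed) recipe per component model."""
    rng = random.Random(seed)
    return [
        {
            "rows": bootstrap_sample(rows, rng),
            "variables": random_variable_subset(variables, k, rng),
            "random_seed": rng.randrange(10**6),  # learning-parameter variation
        }
        for _ in range(n_models)
    ]
```

Each recipe would then be handed to whatever learning algorithm is in use, giving one model and one set of predictions per recipe.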

8 Creating Model Ensembles, Step 2: Combining Models

Combining methods:

- Estimation: average outputs
- Classification: average probabilities or vote (best M of N)

Variance reduction:

- Build complex, overfit models
- All models built in the same manner

Bias reduction:

- Build simple models
- Subsequent models weight records with errors more (or model the actual errors)

Diagram labels: Multiple models and predictions; Combine; Decision or Prediction Value
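The combining methods named on this slide are simple enough to sketch directly. The helper names below are hypothetical, and binary decisions are assumed to be coded as 0/1.

```python
from statistics import mean


def average_outputs(predictions):
    """Estimation: average the numeric outputs of the component models."""
    return mean(predictions)


def average_probability_decision(probabilities, threshold=0.5):
    """Classification: average the class-1 probabilities, then threshold."""
    return mean(probabilities) > threshold


def vote_decision(decisions, m):
    """Classification: predict 1 when at least m of the N models vote 1."""
    return sum(decisions) >= m
```

For example, `vote_decision([1, 0, 1], m=2)` is a "best 2 of 3" vote over three component decisions.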

9 How Model Complexity Affects Errors

[Figure from Giovanni Seni, John Elder, Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, Morgan and Claypool Publishers, 2010]

10 Commonly Used Information-Theoretic Complexity Penalties

- BIC: Bayesian Information Criterion
- AIC: Akaike Information Criterion
- MDL: Minimum Description Length
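For reference, the standard textbook forms of two of these penalties are shown below. They are not taken from the slide. The symbols are assumptions of this note: $k$ is the number of fitted parameters, $n$ the number of records, and $\hat{L}$ the maximized likelihood. MDL has several formulations and is left out here.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Both criteria add a term that grows with the parameter count $k$, which is why they penalize models that look more complex by that measure.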

11 Four Keys to Effective Ensembling

- Diversity of opinion
- Independence
- Decentralization
- Aggregation

From The Wisdom of Crowds, James Surowiecki

12 Bagging

Bagging method:

- Create many data sets by bootstrapping (can also do this with cross-validation)
- Create one decision tree for each data set
- Combine decision trees by averaging (or voting) final decisions
- Primarily reduces model variance rather than bias

Results:

- On average, better than any individual tree

Diagram label: Final Answer (average)
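The bagging recipe can be sketched in a few lines of plain Python. As a simplifying assumption, the base learner here is a one-split regression tree (a stump) on a single numeric input rather than a full decision tree, and all function names are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one numeric input."""
    overall = mean(ys)
    best = (float("inf"), None, overall, overall)  # (sse, threshold, left, right)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if t is None or x <= t else rm


def bagged_predictor(xs, ys, n_models=25, seed=0):
    """Bootstrap the data, fit one stump per sample, average the predictions."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean([s(x) for s in stumps])
```

Because each stump sees a different bootstrap sample, the split points differ from model to model, and averaging them is what smooths out the variance of any single tree.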

13 Boosting (AdaBoost)

Boosting method:

- Create a tree using the training data set
- Score each data point, indicating where each incorrect decision is made (errors)
- Retrain, giving rows with incorrect decisions more weight; repeat
- Final prediction is a weighted average of all models (model regularization)
- Best to create weak, simple models (just a few splits for a decision tree) and let the boosting iterations find the complexity
- Often used with trees or Naïve Bayes

Results:

- Usually better than an individual tree or Bagging

Diagram labels: Reweight examples where classification incorrect; Combine models via weighted sum
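A minimal sketch of the reweighting loop follows, using the standard discrete AdaBoost update with labels coded as +1/-1. The weak model is assumed to be a one-split stump on a single numeric input, and the function names are hypothetical; the slide itself does not give formulas.

```python
import math


def stump_predict(x, t, p):
    """Predict p for x <= t and -p otherwise (p is +1 or -1)."""
    return p if x <= t else -p


def fit_weighted_stump(xs, ys, w):
    """Return (weighted error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    thresholds = [values[0] - 1] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for p in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w) if stump_predict(x, t, p) != y)
            if best is None or err < best[0]:
                best = (err, t, p)
    return best


def adaboost(xs, ys, n_rounds):
    """Train n_rounds stumps, upweighting misclassified rows each round."""
    n = len(xs)
    w = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        err, t, p = fit_weighted_stump(xs, ys, w)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, p))
        # Misclassified rows (y * prediction = -1) gain weight; correct rows lose it.
        w = [wi * math.exp(-alpha * y * stump_predict(x, t, p))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, x):
    """Combine the stumps via a weighted sum and take the sign."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in model)
    return 1 if score >= 0 else -1
```

On a toy pattern such as labels `[1, 1, -1, -1, 1, 1]` over inputs `0..5`, no single stump can separate the classes, but a weighted sum of a few stumps can.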

14 Random Forest Ensembles

Random Forest (RF) method:

- Exact same methodology as Bagging, but with a twist
- At each split, rather than using the entire set of candidate inputs, use a random subset of candidate inputs
- Generates diversity of samples and inputs (splits)

Results:

- On average, better than any individual tree, Bagging, or even Boosting

Diagram label: Final Answer (average)
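The "twist" can be illustrated at the level of a single split. The sketch below assumes 0/1 class labels, numeric inputs stored as tuples, and Gini impurity as the split criterion; these choices and the function names are assumptions made for the example, not details from the slide.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(rows, labels, features):
    """Return (weighted impurity, feature index, threshold) over `features`."""
    best = None
    n = len(rows)
    for f in features:
        values = sorted({r[f] for r in rows})
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def random_forest_split(rows, labels, n_candidates, rng):
    """RF twist: search only a random subset of the candidate inputs."""
    all_features = list(range(len(rows[0])))
    subset = rng.sample(all_features, n_candidates)
    return best_split(rows, labels, subset), subset
```

Calling `best_split` with every feature corresponds to an ordinary bagged tree, while `random_forest_split` may be forced to use a weaker input, which is where the extra diversity among trees comes from.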

15 Stochastic Gradient Boosting

Implemented in MART (Jerry Friedman) and TreeNet (Salford Systems).

Algorithm:

- Begin with a simple model: a constant value
- Build a simple tree (perhaps 6 terminal nodes); now there are 6 possible levels, whereas before there was one level
- Score the model and compute errors. The score is the sum of all previous trees, weighted by a learning rate
- Build a new tree with the errors as the target variable

Results:

- TreeNet has won 2 KDD-Cup competitions and numerous others
- It is less sensitive to outliers and less prone to overfit than AdaBoost

Diagram labels: Predict errors in ensemble tree so far; Combine models via weighted sum; Build; Final Answer (additive model)
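The loop described above can be sketched for a regression target as follows. This is not MART or TreeNet: as simplifying assumptions, each tree is a two-leaf stump on one numeric input (rather than about six terminal nodes), the "stochastic" part is a random subsample of rows drawn without replacement each round, and all names and default values are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one numeric input."""
    overall = mean(ys)
    best = (float("inf"), None, overall, overall)  # (sse, threshold, left, right)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if t is None or x <= t else rm


def stochastic_gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1,
                              subsample=0.5, seed=0):
    """Start from a constant, then repeatedly fit a stump to the current errors."""
    rng = random.Random(seed)
    n = len(xs)
    base = float(mean(ys))               # the initial constant model
    scores = [base] * n
    trees = []
    k = min(n, max(2, int(subsample * n)))
    for _ in range(n_rounds):
        errors = [y - s for y, s in zip(ys, scores)]
        idx = rng.sample(range(n), k)    # random subsample of rows
        tree = fit_stump([xs[i] for i in idx], [errors[i] for i in idx])
        trees.append(tree)
        # The score is the sum of all trees so far, weighted by the learning rate.
        scores = [s + learning_rate * tree(x) for s, x in zip(scores, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)
```

A small learning rate means each tree corrects only a fraction of the remaining error, so the additive model is built up gradually over many rounds.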

16 Ensembles of Trees: Smoothers

Ensembles smooth jagged decision boundaries.

[Pictures from T.G. Dietterich, Ensemble methods in machine learning, in Multiple Classifier Systems, Cagliari, Italy]

17 Heterogeneous Model Ensembles on Glass Data

[Chart: percent classification error (0% to 40%) versus number of models combined, showing max error, min error, and average error]

- Model prediction diversity obtained by using different algorithms: tree, NN, RBF, Gaussian, regression, k-NN
- Combining 3-5 models is on average better than the best single model
- Combining all 6 models is not best (best is the 3- and 4-model combinations), but is close
- This is an example of reducing model variance through ensembles, but not model bias

18 Direct Marketing Example: Considerations or I-Miner

Steps:

1. Join by record: all models applied to the same data in the same row order
2. Change probability names
3. Average probabilities (decision is avg_prob > threshold)
4. Decile probability ranks

From Abbott, D.W., "How to Improve Customer Acquisition Models with Ensembles", presented at Predictive Analytics World Conference, Washington, D.C., October 20
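The four steps translate fairly directly into code. The sketch below is plain Python rather than the tool used in the talk; the function name, the dictionary-of-lists input, and the convention that decile 1 holds the highest averaged probabilities are assumptions made for illustration.

```python
from statistics import mean


def ensemble_scores(model_probs, threshold=0.5):
    """model_probs maps a model name to its probabilities, in the same row order.

    Returns (average probability, decision, decile rank) lists, one entry per record.
    """
    # Steps 1-2: join by record; distinct dict keys keep the probability names apart.
    rows = list(zip(*model_probs.values()))
    # Step 3: average the probabilities and apply the decision rule.
    avg = [mean(r) for r in rows]
    decisions = [p > threshold for p in avg]
    # Step 4: rank records by averaged probability and cut into deciles (1 = top).
    order = sorted(range(len(avg)), key=lambda i: avg[i], reverse=True)
    deciles = [0] * len(avg)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(avg) + 1
    return avg, decisions, deciles
```

The join in step 1 only works if every model scored the same records in the same order, which is why the slide calls that out explicitly.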

19 Direct Marketing Example: Variable Inclusion in Model Ensembles

- Twenty-five different variables were represented in the ten models
- Only five were represented in seven or more models
- Twelve were represented in one or two models

[Chart: # Models with Common Variables; axes: # Models, # Variables]

From Abbott, D.W., "How to Improve Customer Acquisition Models with Ensembles", presented at Predictive Analytics World Conference, Washington, D.C., October 20

20 Fraud Detection Example: Deployment Stream

Model scoring picks up scores from each model, combines them in an ensemble, and pushes the scores back to the database.

21 Fraud Detection Example: Overall Model Score on Validation Data

[Chart: normalized total score (from validation population) by model: Ensemble, Average, Average 5 Best, Average 5 Worst, Best Testing, Worst Testing]

- Score weights false alarms and sensitivity
- Overall, the ensemble is clearly best, and much better than the model that was best on testing data

From Abbott, D., and Tom Konchan, Advanced Fraud Detection Techniques for Vendor Payments, Predictive Analytics Summit, San Diego, CA, February 24

22 Are Ensembles Better?

- Accuracy? Yes
- Interpretability? No

Do ensembles contradict Occam's Razor?

- Principle: simpler models generalize better; avoid overfit!
- They are more complex than single models (RF may have hundreds of trees in the ensemble)
- Yet these more complex models perform better on held-out data
- But are they really more complex?

23 Generalized Degrees of Freedom

- Linear regression: a degree of freedom in the model is simply a parameter
- This does not extrapolate to non-linear methods: the number of parameters in non-linear methods can produce more complexity or less
- Enter Generalized Degrees of Freedom (GDF)
- GDF (Ye 1998) randomly perturbs (adds noise to) the output variable, re-runs the modeling procedure, and measures the changes to the estimates (for the same number of parameters)
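The perturbation idea can be sketched as a small Monte Carlo routine. This is one plausible reading of the procedure, not a transcription of Ye's paper: for each record, the fitted value is regressed on the noise added to that record's target across many re-runs, and the slopes are summed. The function name, the `fit(xs, ys)` interface (returning fitted values for the training rows), and the default settings are all assumptions.

```python
import random
from statistics import mean


def estimate_gdf(fit, xs, ys, n_reps=400, noise_sd=0.1, seed=0):
    """Estimate GDF as the summed sensitivity of each fit to its own target."""
    rng = random.Random(seed)
    n = len(ys)
    deltas, fits = [], []
    for _ in range(n_reps):
        d = [rng.gauss(0.0, noise_sd) for _ in range(n)]     # perturb the output
        deltas.append(d)
        fits.append(fit(xs, [y + di for y, di in zip(ys, d)]))  # re-run the modeling
    gdf = 0.0
    for i in range(n):
        d_i = [d[i] for d in deltas]
        f_i = [f[i] for f in fits]
        d_bar, f_bar = mean(d_i), mean(f_i)
        cov = sum((a - d_bar) * (b - f_bar) for a, b in zip(d_i, f_i))
        var = sum((a - d_bar) ** 2 for a in d_i)
        gdf += cov / var          # slope of fitted value i on perturbation i
    return gdf
```

Two sanity checks follow from the definition: a procedure that always predicts the training mean should score close to 1 (one effective parameter), while a procedure that simply memorizes the targets should score close to the number of records.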

24 The Math of GDF

[Figure from Giovanni Seni, John Elder, Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, Morgan and Claypool Publishers, 2010]

25 The Effect of GDF

[Figure from Elder, J.F. IV, The Generalization Paradox of Ensembles, Journal of Computational and Graphical Statistics, Volume 12, Number 4]

26 Why Ensembles Win

- Performance, performance, performance
- Most competitions care only about performance, not about interpretation or ability to deploy
- Single models sometimes provide insufficient accuracy
  - Neural networks become stuck in local minima
  - Decision trees run out of data and are greedy, so they can get fooled early
  - Single algorithms keep pushing performance using the same ideas (basis function / algorithm), and are incapable of thinking outside of their box
- Different algorithms, or algorithms built using resampled data, achieve the same level of accuracy but on different cases: they identify different ways to get the same level of accuracy

27 Conclusion

- Ensembles can achieve significant model performance improvements
- The key to good ensembles is diversity in sampling and variable selection
- Can be applied to a single algorithm, or across multiple algorithms
- Just do it!

28 References

- Giovanni Seni, John Elder, Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, Morgan and Claypool Publishers, 2010
- Elder, J.F. IV, The Generalization Paradox of Ensembles, Journal of Computational and Graphical Statistics, Volume 12, Number 4
- Abbott, D.W., The Benefits of Creating Ensembles of Classifiers, Abbott Analytics, Inc.
- Abbott, D.W., A Comparison of Algorithms at PAKDD2007, blog post
- Tu, Zhuowen, Ensemble Classification Methods: Bagging, Boosting, and Random Forests, _CS_spring/cs269_2010_ensemble.pdf
- Ye, J. (1998), On Measuring and Correcting the Effects of Data Mining and Model Selection, Journal of the American Statistical Association, 93

### Data Mining. Nonlinear Classification

Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

### Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

### Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

### Ensemble Methods. Adapted from slides by Todd Holloway http://abeautifulwww.com/2007/11/23/ensemble-machine-learning-tutorial/

Ensemble Methods Adapted from slides by Todd Holloway http://abeau

### Ensemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008

Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles

### CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes

### The Predictive Data Mining Revolution in Scorecards:

January 13, 2013 StatSoft White Paper The Predictive Data Mining Revolution in Scorecards: Accurate Risk Scoring via Ensemble Models Summary Predictive modeling methods, based on machine learning algorithms

### Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unsupervised Identification of right

### Model Combination. 24 Novembre 2009

Model Combination 24 Novembre 2009 Datamining 1 2009-2010 Plan 1 Principles of model combination 2 Resampling methods Bagging Random Forests Boosting 3 Hybrid methods Stacking Generic algorithm for multistrategy

### Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

### Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes

Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-19-B &

### Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning

### Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk

Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trajkovski trajkovski@nyus.edu.mk Ensembles 2 Learning Ensembles Learn multiple alternative definitions of a concept using different training

### CS570 Data Mining Classification: Ensemble Methods

CS570 Data Mining Classification: Ensemble Methods Cengiz Günay Dept. Math & CS, Emory University Fall 2013 Some slides courtesy of Han-Kamber-Pei, Tan et al., and Li Xiong Günay (Emory) Classification:

### TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP

TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

### Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

### Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

### Decision Trees from large Databases: SLIQ

Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 impractical! SLIQ: Sort the values

### Improper Payment Detection in Department of Defense Financial Transactions 1

Improper Payment Detection in Department of Defense Financial Transactions 1 Dean Abbott Abbott Consulting San Diego, CA dean@abbottconsulting.com Haleh Vafaie, PhD. Federal Data Corporation Bethesda,

### Gerry Hobbs, Department of Statistics, West Virginia University

Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

### Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

### Leveraging Ensemble Models in SAS Enterprise Miner

ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to

### Classification: Basic Concepts, Decision Trees, and Model Evaluation. General Approach for Building Classification Model

10 10 Classification: Basic Concepts, Decision Trees, and Model Evaluation Dr. Hui Xiong Rutgers University Introduction to Data Mining 1//009 1 General Approach for Building Classification Model Tid Attrib1

### Chapter 6. The stacking ensemble approach

82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

### An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO

### Data Analytics and Business Intelligence (8696/8697)

http: // togaware. com Copyright 2014, Graham.Williams@togaware.com 1/36 Data Analytics and Business Intelligence (8696/8697) Ensemble Decision Trees Graham.Williams@togaware.com Data Scientist Australian

### Data Mining Methods: Applications for Institutional Research

Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014

### The Generalization Paradox of Ensembles

The Generalization Paradox of Ensembles John F. ELDER IV Ensemble models built by methods such as bagging, boosting, and Bayesian model averaging appear dauntingly complex, yet tend to strongly outperform

### Chapter 11 Boosting. Xiaogang Su Department of Statistics University of Central Florida - 1 -

Chapter 11 Boosting Xiaogang Su Department of Statistics University of Central Florida - 1 - Perturb and Combine (P&C) Methods have been devised to take advantage of the instability of trees to create

### Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for

### Combining Multiple Models Across Algorithms and Samples for Improved Results

Combining Multiple Models Across Algorithms and Samples for Improved Results Haleh Vafaie, PhD. Federal Data Corporation Bethesda, MD Hvafaie@feddata.com Dean Abbott Abbott Consulting San Diego, CA dean@abbottconsulting.com

### A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

### BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING

BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING Xavier Conort xavier.conort@gear-analytics.com Session Number: TBR14 Insurance has always been a data business The industry has successfully

### On the effect of data set size on bias and variance in classification learning

On the effect of data set size on bias and variance in classification learning Abstract Damien Brain Geoffrey I Webb School of Computing and Mathematics Deakin University Geelong Vic 3217 With the advent

### Strategies for Building Predictive Models

Strategies for Building Predictive Models Dean Abbott Abbott Analytics, Inc. KNIME User Group Meeting February 14, 2014 Email: dean@abbottanalytics.com Blog: http://abbottanalytics.blogspot.com Twitter:

### Random forest algorithm in big data environment

Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

### Tree Ensembles: The Power of Post- Processing. December 2012 Dan Steinberg Mikhail Golovnya Salford Systems

Tree Ensembles: The Power of Post- Processing December 2012 Dan Steinberg Mikhail Golovnya Salford Systems Course Outline Salford Systems quick overview Treenet an ensemble of boosted trees GPS modern

### Supervised Learning (Big Data Analytics)

Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used

### Generalizing Random Forests Principles to other Methods: Random MultiNomial Logit, Random Naive Bayes, Anita Prinzie & Dirk Van den Poel

Generalizing Random Forests Principles to other Methods: Random MultiNomial Logit, Random Naive Bayes, Anita Prinzie & Dirk Van den Poel Copyright 2008 All rights reserved. Random Forests Forest of decision

### Benchmarking of different classes of models used for credit scoring

Benchmarking of different classes of models used for credit scoring We use this competition as an opportunity to compare the performance of different classes of predictive models. In particular we want

### L25: Ensemble learning

L25: Ensemble learning Introduction Methods for constructing ensembles Combination strategies Stacked generalization Mixtures of experts Bagging Boosting CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna

### Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is

### The Artificial Prediction Market

The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

### MACHINE LEARNING BRETT WUJEK, SAS INSTITUTE INC.

MACHINE LEARNING BRETT WUJEK, SAS INSTITUTE INC. AGENDA MACHINE LEARNING Background Use cases in healthcare, insurance, retail and banking Examples: Unsupervised Learning Principal Component Analysis Supervised

### Data Mining & Data Stream Mining Open Source Tools

Data Mining & Data Stream Mining Open Source Tools Darshana Parikh, Priyanka Tirkha Student M.Tech, Dept. of CSE, Sri Balaji College Of Engg. & Tech, Jaipur, Rajasthan, India Assistant Professor, Dept.

### Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

### Risk pricing for Australian Motor Insurance

Risk pricing for Australian Motor Insurance Dr Richard Brookes November 2012 Contents 1. Background Scope How many models? 2. Approach Data Variable filtering GLM Interactions Credibility overlay 3. Model

### Why do statisticians "hate" us?

Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data

### DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

### AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

### Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

### Machine Learning. Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos)

Machine Learning Mausam (based on slides by Tom Mitchell, Oren Etzioni and Pedro Domingos) What Is Machine Learning? A computer program is said to learn from experience E with respect to some class of

### Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

### EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyright 2013, SAS Institute Inc. All rights reserved.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

### Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA

Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/

### Distributed forests for MapReduce-based machine learning

Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

### Ensemble Data Mining Methods

Ensemble Data Mining Methods Nikunj C. Oza, Ph.D., NASA Ames Research Center, USA INTRODUCTION Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods

### ICPSR Summer Program

ICPSR Summer Program Data Mining Tools for Exploring Big Data Department of Statistics Wharton School, University of Pennsylvania www-stat.wharton.upenn.edu/~stine Modern data mining combines familiar

### THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING. Dan Steinberg and N. Scott Cardell

THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING Introduction Dan Steinberg and N. Scott Cardell Most data-mining projects involve classification problems assigning objects to classes whether

### Better credit models benefit us all

Better credit models benefit us all Agenda Credit Scoring - Overview Random Forest - Overview Random Forest outperform logistic regression for credit scoring out of the box Interaction term hypothesis

### Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 7 of Data Mining by I. H. Witten and E. Frank

Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 7 of Data Mining by I. H. Witten and E. Frank Engineering the input and output Attribute selection Scheme independent, scheme

### New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

### Neural Networks & Boosting

Neural Networks & Boosting Bob Stine Dept of Statistics, School University of Pennsylvania Questions How is logistic regression different from OLS? Logistic mean function for probabilities Larger weight

### Getting Even More Out of Ensemble Selection

Getting Even More Out of Ensemble Selection Quan Sun Department of Computer Science The University of Waikato Hamilton, New Zealand qs12@cs.waikato.ac.nz ABSTRACT Ensemble Selection uses forward stepwise

### Solving Regression Problems Using Competitive Ensemble Models

Solving Regression Problems Using Competitive Ensemble Models Yakov Frayman, Bernard F. Rolfe, and Geoffrey I. Webb School of Information Technology Deakin University Geelong, VIC, Australia {yfraym,brolfe,webb}@deakin.edu.au

### Using Adaptive Random Trees (ART) for optimal scorecard segmentation

A FAIR ISAAC WHITE PAPER Using Adaptive Random Trees (ART) for optimal scorecard segmentation By Chris Ralph Analytic Science Director April 2006 Summary Segmented systems of models are widely recognized

### Data Mining Techniques Chapter 6: Decision Trees

Data Mining Techniques Chapter 6: Decision Trees What is a classification decision tree?.......................................... 2 Visualizing decision trees...................................................

### Machine Learning using MapReduce

Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

### Monday Morning Data Mining

Monday Morning Data Mining Tim Ruhe Statistische Methoden der Datenanalyse Outline: - data mining - IceCube - Data mining in IceCube Computer Scientists are different... Fakultät Physik Fakultät Physik

### Ensembles and PMML in KNIME

Ensembles and PMML in KNIME Alexander Fillbrunn 1, Iris Adä 1, Thomas R. Gabriel 2 and Michael R. Berthold 1,2 1 Department of Computer and Information Science Universität Konstanz Konstanz, Germany First.Last@Uni-Konstanz.De

### Perspectives on Data Mining

Perspectives on Data Mining Niall Adams Department of Mathematics, Imperial College London n.adams@imperial.ac.uk April 2009 Objectives Give an introductory overview of data mining (DM) (or Knowledge Discovery

### ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

### C19 Machine Learning

C19 Machine Learning 8 Lectures Hilary Term 25 2 Tutorial Sheets A. Zisserman Overview: Supervised classification perceptron, support vector machine, loss functions, kernels, random forests, neural networks

### Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Unit # 10 Sajjad Haider Fall 2012 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unsupervised Identification of right

### PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

### Introducing diversity among the models of multi-label classification ensemble

Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and

### Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

### Introduction to Machine Learning

Introduction to Machine Learning Prof. Alexander Ihler Prof. Max Welling icamp Tutorial July 22 What is machine learning? The ability of a machine to improve its performance based on previous results:

### Neural Networks and Support Vector Machines

INF5390 - Kunstig intelligens Neural Networks and Support Vector Machines Roar Fjellheim INF5390-13 Neural Networks and SVM 1 Outline Neural networks Perceptrons Neural networks Support vector machines

### Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods

Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods Jerzy Błaszczyński 1, Krzysztof Dembczyński 1, Wojciech Kotłowski 1, and Mariusz Pawłowski 2 1 Institute of Computing

### KnowledgeSEEKER POWERFUL SEGMENTATION, STRATEGY DESIGN AND VISUALIZATION SOFTWARE

POWERFUL SEGMENTATION, STRATEGY DESIGN AND VISUALIZATION SOFTWARE Most Effective Modeling Application Designed to Address Business Challenges Applying a predictive strategy to reach a desired business

### Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation. Lecture Notes for Chapter 4. Introduction to Data Mining

Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation Lecture Notes for Chapter 4 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data

### Machine Learning Algorithms and Predictive Models for Undergraduate Student Retention

, 225 October, 2013, San Francisco, USA Machine Learning Algorithms and Predictive Models for Undergraduate Student Retention Ji-Wu Jia, Member IAENG, Manohar Mareboyana Abstract---In this paper, we have

### COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

### MHI3000 Big Data Analytics for Health Care Final Project Report

MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given

### An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

### Making Sense of the Mayhem: Machine Learning and March Madness

Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research

### Assessing classification methods for churn prediction by composite

Assessing classification methods for churn prediction by composite indicators M. Clemente*, V. Giner-Bosch, S. San Matías Department of Applied Statistics, Operations Research and Quality, Universitat

### Data Mining Prediction

Data Mining Prediction Jingpeng Li 1 of 23 What is Prediction? Predicting the identity of one thing based purely on the description of another related thing Not necessarily future events, just unknowns

### Data Mining Classification: Alternative Techniques. Instance-Based Classifiers. Lecture Notes for Chapter 5. Introduction to Data Mining

Data Mining Classification: Alternative Techniques Instance-Based Classifiers Lecture Notes for Chapter 5 Introduction to Data Mining by Tan, Steinbach, Kumar Set of Stored Cases Atr1... AtrN Class A B

### Identifying SPAM with Predictive Models

Identifying SPAM with Predictive Models Dan Steinberg and Mikhaylo Golovnya Salford Systems 1 Introduction The ECML-PKDD 2006 Discovery Challenge posed a topical problem for predictive modelers: how to

### testo dello schema Secondo livello Terzo livello Quarto livello Quinto livello

Extracting Knowledge from Biomedical Data through Logic Learning Machines and Rulex Marco Muselli Institute of Electronics, Computer and Telecommunication Engineering National Research Council of Italy,

### Data Mining in CRM & Direct Marketing. Jun Du The University of Western Ontario jdu43@uwo.ca

Data Mining in CRM & Direct Marketing Jun Du The University of Western Ontario jdu43@uwo.ca Outline Why CRM & Marketing Goals in CRM & Marketing Models and Methodologies Case Study: Response Model Case

### Data Mining Using SAS Enterprise Miner Randall Matignon, Piedmont, CA

Data Mining Using SAS Enterprise Miner Randall Matignon, Piedmont, CA An Overview of SAS Enterprise Miner The following article is in regards to Enterprise Miner v.4.3 that is available in SAS v9.1.3.

### Using Random Forest to Learn Imbalanced Data

Using Random Forest to Learn Imbalanced Data Chao Chen, chenchao@stat.berkeley.edu Department of Statistics,UC Berkeley Andy Liaw, andy liaw@merck.com Biometrics Research,Merck Research Labs Leo Breiman,

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### Trees and Random Forests

Trees and Random Forests Adele Cutler Professor, Mathematics and Statistics Utah State University This research is partially supported by NIH 1R15AG037392-01 Cache Valley, Utah Utah State University Leo