Agenda. Mathias Lanner Sas Institute. Predictive Modeling Applications. Predictive Modeling Training Data. Beslutsträd och andra prediktiva modeller



Similar documents
Data Mining Techniques Chapter 6: Decision Trees

Internet Gambling Behavioral Markers: Using the Power of SAS Enterprise Miner 12.1 to Predict High-Risk Internet Gamblers

STATISTICA Formula Guide: Logistic Regression. Table of Contents

Enhancing Compliance with Predictive Analytics

A Property & Casualty Insurance Predictive Modeling Process in SAS

Data Mining Using SAS Enterprise Miner Randall Matignon, Piedmont, CA

Predictive Modeling of Titanic Survivors: a Learning Competition

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

New Work Item for ISO Predictive Analytics (Initial Notes and Thoughts) Introduction

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

Developing Risk Adjustment Techniques Using the System for Assessing Health Care Quality in the

!"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"

A fast, powerful data mining workbench designed for small to midsize organizations

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node

11. Analysis of Case-control Studies Logistic Regression

A Property and Casualty Insurance Predictive Modeling Process in SAS

Gerry Hobbs, Department of Statistics, West Virginia University

Overview Classes Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Lecture 10: Regression Trees

Chapter 12 Discovering New Knowledge Data Mining

An Overview and Evaluation of Decision Tree Methodology

Chapter 7: Simple linear regression Learning Objectives

Local classification and local likelihoods

Developing Credit Scorecards Using Credit Scoring for SAS Enterprise Miner TM 12.1

A Comparison of Decision Tree and Logistic Regression Model Xianzhe Chen, North Dakota State University, Fargo, ND

Statistics in Retail Finance. Chapter 2: Statistical models of default

What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling

THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING. Dan Steinberg and N. Scott Cardell

Predictive Data Mining in Very Large Data Sets: A Demonstration and Comparison Under Model Ensemble

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

Supervised Learning (Big Data Analytics)

Free Trial - BIRT Analytics - IAAs

Indiana State Core Curriculum Standards updated 2009 Algebra I

Data Mining Classification: Decision Trees

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

DECISION TREE ANALYSIS: PREDICTION OF SERIOUS TRAFFIC OFFENDING

APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING

Leveraging Ensemble Models in SAS Enterprise Miner

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

Predictive Dynamix Inc

Using multiple models: Bagging, Boosting, Ensembles, Forests

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Modeling to improve the customer unit target selection for inspections of Commercial Losses in Brazilian Electric Sector - The case CEMIG

Nominal and ordinal logistic regression

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank

Data Mining for Knowledge Management. Classification

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.

Question 2 Naïve Bayes (16 points)

Modeling Lifetime Value in the Insurance Industry

Predictive Analytics Modeling Methodology Document

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

JetBlue Airways Stock Price Analysis and Prediction

MORTGAGE LENDER PROTECTION UNDER INSURANCE ARRANGEMENTS Irina Genriha Latvian University, Tel ,

Better credit models benefit us all

SAS ENTERPRISE MINER 5.3

Smart Sell Re-quote project for an Insurance company.

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Prediction of Stock Performance Using Analytical Techniques

Logistic Regression.

Methods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL

Data Mining Practical Machine Learning Tools and Techniques

Simple Methods and Procedures Used in Forecasting

S The Difference Between Predictive Modeling and Regression Patricia B. Cerrito, University of Louisville, Louisville, KY

Detecting Spam. MGS 8040, Data Mining. Audrey Gies Matt Labbe Tatiana Restrepo

Weight of Evidence Module

Section 6: Model Selection, Logistic Regression and more...

Logistic Regression (1/24/13)

Data Mining: A Magic Technology for College Recruitment. Tongshan Chang, Ed.D.

Copyright 2006, SAS Institute Inc. All rights reserved. Predictive Modeling using SAS

VI. Introduction to Logistic Regression

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing Classifier

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

College Tuition: Data mining and analysis

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Data Mining mit der JMSL Numerical Library for Java Applications

Stat 5303 (Oehlert): Tukey One Degree of Freedom 1

Knowledge Discovery and Data Mining

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

Data Mining Applications in Fund Raising

Binary Logistic Regression

The Operational Value of Social Media Information. Social Media and Customer Interaction

17. SIMPLE LINEAR REGRESSION II

Linear Classification. Volker Tresp Summer 2015

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

Clovis Community College Core Competencies Assessment Area II: Mathematics Algebra

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Transcription:

Agenda Introduktion till Prediktiva modeller Beslutsträd Beslutsträd och andra prediktiva modeller Mathias Lanner Sas Institute Pruning Regressioner Neurala Nätverk Utvärdering av modeller 2 Predictive Modeling Applications Predictive Modeling Database marketing Financial risk management Fraud detection Process monitoring Pattern detection 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Numeric or categorical values 3 4

Predictive Modeling Score Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Predictions 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Predictions Score Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Only input values known Score Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs 5 6 Predictions Predictive Modeling Essentials 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Score Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Predictions Predict s 7 8

Predictive Modeling Essentials Three Prediction Types Predict s 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Predictions Decisions Rankings Estimates 9 10 Decision Predictions Ranking Predictions 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Decisions primary secondary tertiary primary secondary Trained model uses input measurements to make best decision for each. 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Rankings 720 520 620 580 470 Trained model uses input measurements to optimally rank each. 11 12

Estimate Predictions Model Essentials Predict Review 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Estimates 0.65 0.33 0.54 0.47 0.28 Trained model uses input measurements to optimally estimate value. Predict s Decide, rank, estimate 13 14 Model Essentials Select Review Curse of Dimensionality Predict s 1 D 2 D 3 D 15 16

Input Selection Model Essentials Select Review Redundancy Irrelevancy Predict s Decide, rank, estimate Eradicate redundancies irrelevancies 17 18 Model Essentials Optimize Fool s Gold Predict s My model fits the training data perfectly... I ve struck it rich! 19 20

Model Complexity Data Splitting Too flexible Not flexible enough 21 22 Role Validation Data Role 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Validation Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Training data gives sequence of predictive models with increasing complexity. 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Validation Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs 0.73 0.78 0.49 0.53 0.41 Validation data helps select best model from sequence 23 24

Validation Data Role Model Essentials Optimize 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs Validation Data 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs 0.73 0.78 0.49 0.53 0.41 0.73 0.78 0.49 0.53 0.41 Validation data helps select best model from Sequence. Predict s Decide, rank, estimate Eradicate redundancies irrelevancies Tune models with validation data 25 26 Agenda Predictive Modeling Tools Introduktion till Prediktiva modeller Beslutsträd Primary Decision Tree Regression Neural Network Pruning Regressioner Specialty Dmine Regression MBR AutoNeural Neurala Nätverk Utvärdering av modeller Rule Induction DMNeural Multiple Model Ensemble Two Stage 27 28

Predictive Modeling Tools Predictive Modeling Tools Primary Decision Tree Regression Neural Network Primary Decision Tree Regression Neural Network Specialty Dmine Regression MBR AutoNeural Specialty Dmine Regression MBR AutoNeural Rule Induction DMNeural Rule Induction DMNeural Multiple Model Ensemble Two Stage Multiple Model Ensemble Two Stage 29 30 Model Essentials Decision Trees Simple Prediction Illustration Predict s Prediction rules Split search Pruning Analysis goal: Predict the color of a dot based on its location in a scatter plot. 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 31 32

Model Essentials Decision Trees Decision Tree Prediction Rules Predict s Prediction rules Split search Pruning 40% leaf node root node <0.63 0.63 interior node <0.52 0.52 <0.51 0.51 60% 70% 55% 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 33 34 Decision Tree Prediction Rules Decision Tree Prediction Rules Decision = Estimate = 0.70 40% <0.63 0.63 <0.52 0.52 <0.51 0.51 60% 70% 55% 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 40% <0.63 0.63 <0.52 0.52 <0.51 0.51 60% 70% 55% 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 35 36

Model Essentials Decision Trees Predict s Prediction rules Demo Beslutsträd Split search Pruning 37 38 Agenda Model Essentials Decision Trees Introduktion till Prediktiva modeller Beslutsträd Predict s Prediction rules Pruning Regressioner Split search Neurala Nätverk Utvärdering av modeller Pruning 39 40

Binary Targets Decision Assessment 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs =1 =0 =0 =1 =0 primary outcome secondary outcome 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs =1 =0 =0 =1 =0 Decisions primary secondary primary secondary secondary Accuracy/Profit true positive true negative Focus on correct decisions 41 42 Decision Assessment (for Pessimists) Ranking Assessment 1: inputs 2: inputs 3: inputs 4: inputs 5: inputs =1 =0 =0 =1 =0 Decisions primary secondary primary secondary secondary Misclassification/Loss false positive false negative 1: inputs =1 2: inputs =0 3: inputs =0 4: inputs =1 5: inputs =0 Rankings 720 520 620 580 470 Concordance rank(=1) > rank(=0) Focus on incorrect decisions Focus on correct ordering 43 44

Ranking Assessment (for Pessimists) Estimate Assessment (only Pessimistic!) 1: inputs =1 2: inputs =0 3: inputs =0 4: inputs =1 5: inputs =0 Rankings 720 520 620 580 470 Discordance rank(=1) < rank(=0) 1: inputs =1 2: inputs =0 3: inputs =0 4: inputs =1 5: inputs =0 Estimates 0.65 0.33 0.54 0.47 0.28 Squared Error (-estimate) 2 Focus on incorrect ordering Focus on incorrect estimation 45 46 Predictive Modeling Assessments Optimistic Assessment, Pessimistic Stats 42% Misclassification D 1 D 0 R r Decisions Rankings Accuracy Misclassification Concordance Discordance 40% 38% 45% 40% 35% 0.26 Discordance Squared Error Increasing leaf count improves assessment measures on training data. 0.24 p 1 Estimates Squared Error 0.22 47 48

Unbiased Assessment 42% 40% 38% Misclassification Demo på Pruning 45% 40% 35% 0.26 0.24 0.22 Discordance Squared Error Increasing leaf count might worsen assessment measures on validation data. 49 50 Model Essentials Regressions Model Essentials Regressions Predict s Prediction formula Predict s Prediction formula Sequential selection Sequential selection Optimal sequence model Optimal sequence model 51 52

Linear Regression Prediction Formula Logistic Regression Prediction Formula intercept estimate input measurement y ^ = w^ 0 + w^ 1 + w^ 2 parameter estimate estimate log p^ 1 ^p ( ) = w^ 0 + w^ 1 + w^ 2 logit scores Choose intercept and parameter estimates to minimize. squared error function ( y i y ^ i ) 2 training data Choose intercept and parameter estimates to maximize. log-likelihood function log(p ^ i ) + log(1 ^ p i ) primary outcome training s secondary outcome training s 53 54 Logit Link Function Simple Prediction Illustration Regressions log 5 p^ 1 ^p ( ) logit link function 0 1-5 = w^ 0 + w^ 1 + w^ 2 doubling amount x i logit scores consequence 1 odds exp(w i ) 0.69 odds 2 w i Model interpretation odds ratio ^ p = logit equation logit( p ^) = w^ 0 + w^ 1 + w^ 2 1 1 + e -logit( p ^ ) logistic equation 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 55 56

Beyond the Prediction Formula Missing Values and Regression Modeling Missing values Inputs Extreme or unusual values C A B Non-numeric inputs Cases Nonlinearity and Non-additivity 57 58 Missing Values and Regression Modeling Missing Values and the Prediction Formula Inputs Prediction Formula: logit( p )= -2.1 + 0.072 0.89 1.24 Cases New Case: (,, ) = ( 2,,-1 ) Predicted Value: logit( p )= -2.1 + 0.144 0.8 + 1.24 59 60

Missing Value Causes Missing Value Remedies N/A Not applicable N/A Not applicable Synthetic distribution No match Non-disclosure No match Non-disclosure Estimation x i = f(,,x p ) 61 62 Model Essentials Regressions Sequential Selection Forward Predict s Prediction formula Input p-value Entry Cutoff Sequential selection Optimal sequence model 63 64

Sequential Selection Forward Sequential Selection Backward Input p-value Entry Cutoff Input p-value Stay Cutoff 65 66 Sequential Selection Backward Sequential Selection Stepwise Input p-value Stay Cutoff Input p-value Entry Cutoff Stay Cutoff 67 68

Sequential Selection Stepwise Model Essentials Regressions Input p-value Entry Cutoff Predict s Prediction formula Stay Cutoff Sequential selection Optimal sequence model 69 70 Model Fit versus Complexity Model fit statistic Select Model with Optimal Validation Fit Model fit statistic validation training Evaluate each sequence step. Choose simplest optimal model. Evaluate each sequence step. Choose simplest optimal model. 1 2 3 4 5 6 1 2 3 4 5 6 71 72

Agenda Model Essentials Neural Networks Introduktion till Prediktiva modeller Beslutsträd Predict s Prediction formula Pruning Regressioner None Neurala Nätverk Utvärdering av modeller Stopped training 73 74 Model Essentials Neural Networks Neural Network Prediction Formula Predict s Prediction formula estimate weight estimate hidden unit ^ ^ ^ ^ ^ y = w 00 + w 01 H 1 + w 02 H 2 + w 03 H 3 bias estimate None Stopped training -5 1 0-1 tanh 5 H 1 = tanh(w^ 10 + w^ 11 + w^ 12 ) H 2 = tanh(w^ 20 + w^ 21 + w^ 22 ) H 3 = tanh(w^ 30 + w^ 31 + w^ 32 ) activation function 75 76

Neural Network Binary Prediction Formula Neural Network Diagram ^ p log = w^ 00 + w^ 01 H 1 + w^ 02 H 2 + w^ ( 03 H 1 p ^ ) 3 logit link function H 1 = tanh(w^ 10 + w^ 11 + w^ 12 ) H 2 = tanh(w^ 20 + w^ 21 + w^ 22 ) H 3 = tanh(w^ 30 + w^ 31 + w^ 32 ) input layer ^ p log = w^ 00 + w^ 01 H 1 + w^ 02 H 2 + w^ ( 03 H 1 p ^ ) 3 H 1 H 2 H 3 hidden layer y layer H 1 = tanh(w^ 10 + w^ 11 + w^ 12 ) H 2 = tanh(w^ 20 + w^ 21 + w^ 22 ) H 3 = tanh(w^ 30 + w^ 31 + w^ 32 ) 77 78 Prediction Illustration Neural Networks Agenda logit( p ^) = w^ 00 +w^ 01 H 1 +w^ 02 H 2 +w^ 03 H 3 ^ p = logit equation H 1 = tanh( w^ 10 +w^ 11 +w^ 12 ) H 2 = tanh( w^ 20 +w^ 21 +w^ 22 ) H 3 = tanh( w^ 30 +w^ 31 +w^ 32 ) 1 1 + e -logit( p ^ ) logistic equation 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Introduktion till Prediktiva modeller Beslutsträd Pruning Regressioner Neurala Nätverk Utvärdering av modeller 79 80

Assessment Types Summary Statistics Summary The Model Comparison tool provides KS C ASE Summary statistics Statistical graphics Prediction Type Decisions 1,2,3, Rankings Statistic Accuracy / Misclassification Profit / Loss KS-statistic ROC Index (concordance) Gini coefficient ^ p E(Y) Estimates Average squared error SBC / Likelihood 81 82 Summary Statistics Summary Summary Statistics Summary Prediction Type Statistic Prediction Type Statistic Decisions Accuracy / Misclassification Profit / Loss KS-statistic Decisions Accuracy / Misclassification Profit / Loss KS-statistic 1,2,3, Rankings ROC Index (concordance) Gini coefficient 1,2,3, Rankings ROC Index (concordance) Gini coefficient ^ p E(Y) Estimates Average squared error SBC / Likelihood ^ p E(Y) Estimates Average squared error SBC / Likelihood 83 84

Statistical Graphics Summary Prediction Type Statistic Decisions 1,2,3, ^ p E(Y) Rankings Estimates Sensitivity charts Response rate charts 85