# EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyright 2013, SAS Institute Inc. All rights reserved.


## Transcription

1 EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER

2 ANALYTICS LIFECYCLE Formulate Problem, Data Preparation, Data Exploration, Transform & Select, Develop Models, Validate Models, Deploy Model, Evaluate & Monitor Model

3 ANALYTICS LIFECYCLE DECISION TREES CAN HELP IN VARIOUS STAGES Formulate Problem, Data Preparation, Data Exploration, Transform & Select, Develop Models, Validate Models, Deploy Model, Evaluate & Monitor Model

4 WHY DECISION TREES?

5 DECISION TREES ADVANTAGES
- Decision trees are powerful predictive and explanatory modeling tools.
- They are flexible: they can model targets that are interval (regression trees) or ordinal, nominal, and binary (classification trees).
- Trees can accommodate nonlinearities and interactions.
- Trees are simple to understand and present.

6 DECISION TREE EASY TO VISUALIZE

7 DECISION TREES ENGLISH RULES Node = 10 if Saving Balance >= AND Credit Card Balance < then Tree Node Identifier = 10 Number of Observations = 981 Predicted: INS=1 = 0.68 Predicted: INS=0 = 0.32

8 DECISION TREE BACKGROUND

9 WHAT ARE DECISION TREES?
- Decision trees are statistical models designed for supervised prediction problems.
- The tree is fitted to data by recursive partitioning.
- Partitioning refers to segmenting the data into subgroups that are as homogeneous as possible with respect to the target.
- Many algorithms exist, including CHAID, CART, C4.5, and C5.0.
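The partitioning idea on slide 9 can be illustrated with a minimal Python sketch that searches for the single split making the two subgroups as homogeneous as possible. It assumes Gini impurity as the homogeneity measure (one of several criteria; the slides do not say which one Enterprise Miner applies), and the function names and the toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels, feature):
    """Return (threshold, weighted_impurity) of the best binary split on one numeric feature."""
    best = (None, gini(labels))  # no split yet: impurity of the parent node
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] < threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] >= threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical toy data: a savings balance and a binary target.
rows = [{"savings": 100}, {"savings": 200}, {"savings": 900}, {"savings": 1200}]
labels = [0, 0, 1, 1]
threshold, impurity = best_split(rows, labels, "savings")
```

Recursive partitioning would repeat this search inside each resulting subgroup until a stopping rule is met; the sketch shows only one level.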

10 2 TYPES OF TREES
- Classification tree: target is categorical
- Regression tree: target is continuous
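The difference between the two tree types shows up most clearly in what a leaf predicts. The following is a small sketch under the usual textbook assumption that a classification leaf reports class proportions and a regression leaf reports the mean target; the function names are hypothetical and not part of any SAS API.

```python
from collections import Counter
from statistics import mean

def classification_leaf(targets):
    """Class proportions in a leaf; the predicted class is the most frequent one."""
    n = len(targets)
    proportions = {cls: count / n for cls, count in Counter(targets).items()}
    predicted = max(proportions, key=proportions.get)
    return predicted, proportions

def regression_leaf(targets):
    """A regression leaf predicts the mean target of its training cases."""
    return mean(targets)

predicted, proportions = classification_leaf([1, 1, 1, 0])
leaf_mean = regression_leaf([10.0, 20.0, 30.0])
```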

11 DECISION TREES CLASSIFICATION TREE

12 DECISION TREES MULTI-WAY SPLITS

13 DECISION TREES REGRESSION TREE

14 DECISION TREES PARTITIONED INPUT SPACE

15 DECISION TREES MULTIVARIATE STEP FUNCTION

16 DECISION TREES DECISION REGIONS

17 DECISION TREES LEAVES OF A CLASSIFICATION TREE

18 USING DECISION TREES FOR INITIAL AND EXPLORATORY DATA ANALYSIS

19 DECISION TREES INITIAL DATA ANALYSIS AND EXPLORATORY DATA ANALYSIS
- Interpretability
- No strict assumptions concerning the functional form of the model
- Resistant to the curse of dimensionality
- Robust to outliers in the input space
- No need to create dummy variables for nominal inputs
- Missing values do not need to be imputed
- Computationally fast (usually)

20 USING DECISION TREES TO MODIFY INPUT SPACE

21 DECISION TREES MODIFYING THE INPUT SPACE
- Dimension reduction: input subset selection; collapsing levels of nominal inputs
- Dimension enhancement: discretizing interval inputs; stratified modeling
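Two of the items on slide 21, discretizing interval inputs and collapsing levels of nominal inputs, can be sketched in a few lines of Python. The sketch assumes that the split points and the level-to-leaf assignments have already been read off a fitted tree; the thresholds, level names, and function names below are hypothetical.

```python
from bisect import bisect_right

def discretize(value, thresholds):
    """Map an interval input to a bin index using sorted split thresholds."""
    # A value equal to a threshold falls into the higher bin (the ">=" branch).
    return bisect_right(thresholds, value)

def collapse_levels(level_to_leaf):
    """Group nominal levels that the tree routed to the same leaf."""
    groups = {}
    for level, leaf in level_to_leaf.items():
        groups.setdefault(leaf, []).append(level)
    return groups

# Hypothetical split points and leaf assignments, as if taken from a fitted tree.
thresholds = [25, 40, 60]
bins = [discretize(age, thresholds) for age in (18, 25, 39, 75)]
groups = collapse_levels({"NY": 1, "NJ": 1, "CA": 2, "OR": 2, "TX": 3})
```

The bin index or the group label can then replace the original input in a downstream model.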

22 DECISION TREES INPUT SELECTION

23 DECISION TREES COLLAPSING LEVELS

24 INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER

25 DECISION TREES INTERACTIVE TRAINING
- Force and remove inputs
- Define split values
- Manually prune branches and leaves

26 DECISION TREE INTERACTIVE DECISION TREE TIP: Prior to invoking interactive mode, modify the Decision Tree properties to reflect the type of tree you wish to build.

27 BUILDING SEGMENTATION TREES

28 DECISION TREES SEGMENTATION TREES WITH MULTIPLE TARGETS Interactively build trees while considering more than one target.

29 DEMONSTRATION

31 SAS ENTERPRISE MINER BAGGING/BOOSTING TREES Use Start Groups & End Groups Nodes

32 SAS ENTERPRISE MINER GRADIENT BOOSTING
- Sequential ensemble of many trees
- Extremely good predictions
- Very effective at variable selection
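To make "sequential ensemble of many trees" concrete, here is a minimal sketch of boosting for squared error, where each new one-split tree (a stump) is fitted to the residuals left by the trees before it. This is a generic illustration, not the algorithm of the Enterprise Miner node; the learning rate, tree count, function names, and data are assumptions.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) that minimizes squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean

def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Sequentially add stumps, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the overall mean
    stumps = []
    predictions = [base] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Hypothetical one-input data set with a step near x = 3.5.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = boost(xs, ys)
```

The small learning rate means each tree corrects only part of the remaining error, which is why many trees are combined.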

33 SAS ENTERPRISE MINER RANDOM FOREST
- Predictive model called a forest
- Creates several trees
- Training data sampled without replacement
- Input variables sampled
- Available in EM 13.1
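The two sampling steps named on slide 33 can be sketched as follows. The slide does not say whether inputs are sampled per tree or per split, so this sketch samples them once per tree for simplicity; the row fraction, the number of inputs, the input names, and the function name are all hypothetical.

```python
import random

def draw_tree_sample(n_rows, feature_names, row_fraction=0.6, n_features=2, rng=None):
    """Pick the rows and candidate inputs that one tree in the forest would see."""
    if rng is None:
        rng = random.Random()
    n_sampled = max(1, int(n_rows * row_fraction))
    row_ids = rng.sample(range(n_rows), n_sampled)    # rows drawn without replacement
    features = rng.sample(feature_names, n_features)  # random subset of inputs
    return row_ids, features

rng = random.Random(42)  # fixed seed so the sketch is reproducible
feature_names = ["age", "income", "savings", "cc_balance"]
forest_plan = [draw_tree_sample(10, feature_names, rng=rng) for _ in range(3)]
```

Each entry of `forest_plan` describes the data one tree would be trained on; because every tree sees different rows and inputs, the trees differ and their combined prediction is more stable.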

34 TIPS AND RESOURCES

35 TIP INTERACTIVE DECISION TREE The Interactive Decision Tree may not use all of your data. It uses a sample of at most 20,000 observations to prevent the excessive time and memory consumption that can occur with large data sets. You can control the sample size and the sampling method with Project Start Code.

36 TIP INTERACTIVE DECISION TREE
%let EM_INTERACTIVE_TREE_MAXOBS = <max-number-of-observations-in-sample>;
%let EM_INTERACTIVE_TREE_SAMPLEMETHOD = <RANDOM | FIRSTN | STRATIFY>;

37 TIP INTERACTIVE DECISION TREE %let EM_INTERACTIVE_TREE_MAXOBS = ; %let EM_INTERACTIVE_TREE_SAMPLEMETHOD = RANDOM;
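The three sampling method names in the tip (RANDOM, FIRSTN, STRATIFY) correspond to common sampling ideas. The Python sketch below illustrates those general ideas only; how Enterprise Miner implements each method is not described in the slides, and the function name, seed, and toy data (with a binary target named INS, as on slide 7) are assumptions.

```python
import random
from collections import defaultdict

def sample_rows(rows, max_obs, method="RANDOM", target=None, seed=12345):
    """Return at most max_obs rows using one of three sampling strategies."""
    if len(rows) <= max_obs:
        return list(rows)  # small data sets are used in full
    rng = random.Random(seed)
    if method == "FIRSTN":
        return rows[:max_obs]  # simply the first max_obs rows
    if method == "RANDOM":
        return rng.sample(rows, max_obs)  # simple random sample
    if method == "STRATIFY":
        strata = defaultdict(list)
        for row in rows:
            strata[row[target]].append(row)
        sample = []
        for members in strata.values():
            # Keep each target level in roughly its original proportion.
            k = max(1, round(max_obs * len(members) / len(rows)))
            sample.extend(rng.sample(members, min(k, len(members))))
        return sample[:max_obs]  # guard against rounding pushing past the cap
    raise ValueError(f"unknown method: {method}")

# Hypothetical data: 100 rows, 25% of which have INS = 1.
rows = [{"id": i, "INS": 1 if i % 4 == 0 else 0} for i in range(100)]
first = sample_rows(rows, 20, "FIRSTN")
rand = sample_rows(rows, 20, "RANDOM")
strat = sample_rows(rows, 20, "STRATIFY", target="INS")
```

FIRSTN is the cheapest but depends on row order, RANDOM removes that dependence, and STRATIFY additionally preserves the target distribution, which matters when one target level is rare.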

38 LEARNING MORE DOCUMENTATION
- SAS Enterprise Miner in-product Help file
- Documentation: Getting Started with SAS Enterprise Miner (documentation PDF, sample data ZIP)
- Recorded webinar

39 LEARNING MORE SAS EDUCATION COURSES
- Decision Tree Modeling
- Data Mining Techniques: Theory and Practice

40 LEARNING MORE SAS PRESS

41 LEARNING MORE SAS PRESS Decision Trees for Analytics Using SAS Enterprise Miner By: Barry de Ville and Padraic Neville ISBN: Copyright Date: July 2013 SAS Bookstore: 1&pc=63319 Table of Contents [PDF] Free Chapter [PDF] Example Code and Data

42 THANK YOU FOR USING SAS
