1 EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER

2 ANALYTICS LIFECYCLE
[Figure: analytics lifecycle cycle diagram with the stages Formulate Problem, Data Preparation, Data Exploration, Transform & Select, Develop Models, Validate Models, Deploy Model, Evaluate & Monitor Model]

3 ANALYTICS LIFECYCLE DECISION TREES CAN HELP IN VARIOUS STAGES
[Figure: analytics lifecycle diagram, repeated from the previous slide]

4 WHY DECISION TREES?

5 DECISION TREES ADVANTAGES
Decision trees are powerful predictive and explanatory modeling tools.
They are flexible in that they can model targets that are:
  Interval (regression trees)
  Ordinal, nominal, and binary (classification trees)
Trees can accommodate nonlinearities and interactions.
Trees are simple to understand and present.

6 DECISION TREE EASY TO VISUALIZE

7 DECISION TREES ENGLISH RULES
Node = 10
if Saving Balance >= AND Credit Card Balance <
then
  Tree Node Identifier = 10
  Number of Observations = 981
  Predicted: INS=1 = 0.68
  Predicted: INS=0 = 0.32
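
An English rule like the one above translates directly into a conditional. The sketch below is illustrative only: the two cutoff values are hypothetical placeholders (the slide does not show them), and the function name is an assumption. Only the node identifier and the predicted probabilities come from the slide.

```python
# Hypothetical cutoffs: the slide does not show the actual split values.
SAVING_BALANCE_MIN = 1000.0
CREDIT_CARD_BALANCE_MAX = 500.0


def node_10_rule(saving_balance, credit_card_balance):
    """Return node 10's predictions if the observation lands in it, else None."""
    in_node = (saving_balance >= SAVING_BALANCE_MIN
               and credit_card_balance < CREDIT_CARD_BALANCE_MAX)
    if in_node:
        # Probabilities as reported on the slide for node 10.
        return {"node": 10, "p_ins_1": 0.68, "p_ins_0": 0.32}
    return None


inside = node_10_rule(saving_balance=2500.0, credit_card_balance=120.0)
outside = node_10_rule(saving_balance=50.0, credit_card_balance=120.0)
```

Every leaf of a tree yields one such rule, which is why a fitted tree can be handed to a business audience as a short list of if/then statements.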

8 DECISION TREE BACKGROUND

9 WHAT ARE DECISION TREES?
Decision trees are statistical models designed for supervised prediction problems.
The tree is fitted to data by recursive partitioning.
Partitioning refers to segmenting the data into subgroups that are as homogeneous as possible with respect to the target.
Many algorithms exist, including CHAID, CART, C4.5, and C5.0.
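
To make "as homogeneous as possible" concrete, here is a minimal sketch of one partitioning step for a binary target and a single interval input. It uses Gini impurity, the criterion associated with CART; other algorithms named above use different criteria (CHAID, for example, uses chi-square tests). The function names and toy data are assumptions for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 2.0 * p1 * (1.0 - p1)


def best_split(values, labels):
    """Find the threshold t (rule: value < t) with the lowest weighted impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical input values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


x = [1, 2, 3, 10, 11, 12]
y = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(x, y)
```

Recursive partitioning repeats this search inside each resulting subgroup, over all candidate inputs, until a stopping rule is met.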

10 2 TYPES OF TREES
Classification tree: target is categorical
Regression tree: target is continuous
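
The two tree types differ mainly in what a leaf predicts. As a rough sketch (function names are illustrative, not part of any SAS API): a classification leaf reports the proportion of each target level among its training cases, and a regression leaf reports their mean.

```python
from collections import Counter


def classification_leaf(targets):
    """Predicted probability of each target level in a leaf."""
    n = len(targets)
    return {level: count / n for level, count in Counter(targets).items()}


def regression_leaf(targets):
    """Predicted value of a regression leaf: the mean of its training targets."""
    return sum(targets) / len(targets)


class_probs = classification_leaf([1, 1, 1, 0])
leaf_mean = regression_leaf([10.0, 20.0, 30.0])
```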

11 DECISION TREES CLASSIFICATION TREE

12 DECISION TREES MULTI-WAY SPLITS

13 DECISION TREES REGRESSION TREE

14 DECISION TREES PARTITIONED INPUT SPACE

15 DECISION TREES MULTIVARIATE STEP FUNCTION

16 DECISION TREES DECISION REGIONS

17 DECISION TREES LEAVES OF A CLASSIFICATION TREE

18 USING DECISION TREES FOR INITIAL AND EXPLORATORY DATA ANALYSIS

19 DECISION TREES INITIAL DATA ANALYSIS AND EXPLORATORY DATA ANALYSIS
Interpretability
No strict assumptions concerning the functional form of the model
Resistant to the curse of dimensionality
Robust to outliers in the input space
No need to create dummy variables for nominal inputs
Missing values do not need to be imputed
Computationally fast (usually)

20 USING DECISION TREES TO MODIFY INPUT SPACE

21 DECISION TREES MODIFYING THE INPUT SPACE
Dimension Reduction
  Input subset selection
  Collapsing levels of nominal inputs
Dimension Enhancement
  Discretizing interval inputs
  Stratified modeling
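
Two of these ideas are easy to sketch: using a tree's split points to discretize an interval input, and using its leaf assignments to collapse the levels of a nominal input. The split points, level names, and group numbers below are hypothetical, standing in for values that would be read off a fitted tree.

```python
from bisect import bisect_right

# Hypothetical split points, as if read off a tree fitted on one interval input.
split_points = [25.0, 40.0, 65.0]

# Hypothetical mapping from nominal levels to the leaf each one fell into.
leaf_of_level = {"A": 1, "B": 1, "C": 2, "D": 2, "E": 3}


def discretize(value, cuts):
    """Return the bin index of value: the number of cut points <= value."""
    return bisect_right(cuts, value)


def collapse(level, mapping):
    """Replace a nominal level with the group of levels that share its leaf."""
    return mapping[level]


bins = [discretize(v, split_points) for v in (18.0, 25.0, 50.0, 70.0)]
groups = [collapse(level, leaf_of_level) for level in ("A", "B", "E")]
```

The binned or collapsed variable can then be passed to another model, such as a regression, in place of the original input.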

22 DECISION TREES INPUT SELECTION

23 DECISION TREES COLLAPSING LEVELS

24 INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER

25 DECISION TREES INTERACTIVE TRAINING
Force and remove inputs
Define split values
Manually prune branches and leaves

26 DECISION TREE INTERACTIVE DECISION TREE
TIP: Prior to invoking interactive mode, modify the Decision Tree properties to reflect the type of tree you wish to build.

27 BUILDING SEGMENTATION TREES

28 DECISION TREES SEGMENTATION TREES WITH MULTIPLE TARGETS
Interactively build trees while considering more than one target.

29 DEMONSTRATION

31 SAS ENTERPRISE MINER BAGGING/BOOSTING TREES
Use the Start Groups and End Groups nodes.

32 SAS ENTERPRISE MINER GRADIENT BOOSTING
Sequential ensemble of many trees
Often produces very accurate predictions
Very effective at variable selection
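
The phrase "sequential ensemble" can be illustrated with a toy version of boosting for an interval target: each one-split tree (a stump) is fitted to the residuals left by the trees before it. This is a generic sketch under squared-error loss, not the Gradient Boosting node's implementation; the function names, learning rate, and data are assumptions.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression tree; return (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(x, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_trees=20, learning_rate=0.5):
    """Build stumps one after another, each fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((t, left_mean, right_mean))
        pred = [pi + learning_rate * (left_mean if xi < t else right_mean)
                for xi, pi in zip(x, pred)]
    return base, stumps, pred


def mse(y, pred):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)


x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
base, stumps, pred = boost(x, y)
mse_before = mse(y, [base] * len(y))
mse_after = mse(y, pred)
```

On this toy data the training error after boosting is well below that of the constant baseline, which is the behavior the sequential fitting is designed to produce.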

33 SAS ENTERPRISE MINER RANDOM FOREST
Predictive model called a forest
Creates several trees
Training data sampled without replacement
Input variables sampled
Available in EM 13.1
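
The two sampling steps listed above can be sketched as follows. This is a simplified illustration, not the Enterprise Miner implementation: the function name, the row fraction, and the number of inputs are hypothetical, and inputs are drawn once per tree here for brevity, whereas forest implementations commonly draw candidate inputs at each split.

```python
import random


def draw_tree_sample(n_rows, input_names, row_fraction=0.6, n_inputs=2, rng=None):
    """Pick the rows and inputs one tree of the forest would be trained on."""
    rng = rng or random.Random()
    n_sample = max(1, int(row_fraction * n_rows))
    # random.sample draws without replacement, so no row appears twice.
    row_ids = rng.sample(range(n_rows), n_sample)
    inputs = rng.sample(input_names, n_inputs)
    return row_ids, inputs


names = ["age", "income", "balance", "tenure"]  # hypothetical input names
rows, inputs = draw_tree_sample(10, names, rng=random.Random(42))
```

Repeating the draw for each tree gives every tree a different view of the data, which is what makes the averaged forest more stable than any single tree.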

34 TIPS AND RESOURCES

35 TIP INTERACTIVE DECISION TREE
The Interactive Decision Tree may not use all of your data.
It uses a sample of at most 20,000 observations to prevent the excessive time and memory consumption that can occur with large data sets.
You can control the size and method for creating the sample with Project Start Code.

36 TIP INTERACTIVE DECISION TREE
%let EM_INTERACTIVE_TREE_MAXOBS=<max-number-of-observations-in-sample>;
%let EM_INTERACTIVE_TREE_SAMPLEMETHOD=<RANDOM | FIRSTN | STRATIFY>;

37 TIP INTERACTIVE DECISION TREE
%let EM_INTERACTIVE_TREE_MAXOBS = ;
%let EM_INTERACTIVE_TREE_SAMPLEMETHOD = RANDOM;
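
For readers unfamiliar with the three method names, the sketch below shows what such sampling methods conventionally mean. It is plain Python, not SAS code, and it does not claim to reproduce how Enterprise Miner builds its sample: the function, its parameters, the seed, and the proportional allocation used for stratification are all assumptions.

```python
import random
from collections import defaultdict


def sample_rows(rows, max_obs, method="RANDOM", target_key=None, seed=12345):
    """Cap rows at max_obs using one of three conventional sampling methods."""
    if len(rows) <= max_obs:
        return list(rows)  # small data sets are used in full
    rng = random.Random(seed)
    if method == "FIRSTN":
        return list(rows[:max_obs])
    if method == "RANDOM":
        return rng.sample(list(rows), max_obs)
    if method == "STRATIFY":
        # Assumed behavior: keep each target level's share of the data.
        strata = defaultdict(list)
        for row in rows:
            strata[row[target_key]].append(row)
        sample = []
        for members in strata.values():
            k = int(max_obs * len(members) / len(rows))  # floor keeps total <= max_obs
            sample.extend(rng.sample(members, k))
        return sample
    raise ValueError("unknown method: " + method)


data = [{"id": i, "ins": 1 if i % 4 == 0 else 0} for i in range(100)]
first = sample_rows(data, 20, method="FIRSTN")
rand = sample_rows(data, 20, method="RANDOM")
strat = sample_rows(data, 20, method="STRATIFY", target_key="ins")
```

With a rare target level, a stratified sample preserves the event rate, whereas a first-N sample depends entirely on how the table happens to be sorted.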

38 LEARNING MORE DOCUMENTATION
SAS Enterprise Miner In-product Help File
Documentation: Getting Started with SAS Enterprise Miner
  Documentation PDF
  Sample Data ZIP
Recorded Webinar:

39 LEARNING MORE SAS EDUCATION COURSES
Decision Tree Modeling
Data Mining Techniques: Theory and Practice

40 LEARNING MORE SAS PRESS

41 LEARNING MORE SAS PRESS
Decision Trees for Analytics Using SAS Enterprise Miner
By: Barry de Ville and Padraic Neville
ISBN:
Copyright Date: July 2013
SAS Bookstore: 1&pc=63319
Table of Contents [PDF]
Free Chapter [PDF]
Example Code and Data

42 THANK YOU FOR USING SAS

