Hochschule Düsseldorf University of Applied Sciences
Fachbereich Wirtschaftswissenschaften
Business Analytics (M.Sc.)
IT in Business Analytics
IT Applications in Business Analytics
SS 2016 / Lecture 07
Use Case 1 (Two Class Classification)
SS 2016 - IT Applications in Business Analytics - 6. Analytical Use Case 1
Let's get started: be a business analytics consultant!
Case 1 Bike Sales
Point of Departure: 2016 Polygon
Whether you're making a go at XC mountain bike racing or simply looking to upgrade your confidence level on the trail, the Polygon hardtail mountain bike proves to be the perfect choice. The Polygon features our race-proven 29er geometry with a low-slung bottom bracket and incredibly short chainstays for a planted sensation, snappy handling, and efficient power transfer. It's the obvious mountain bike for anyone who demands speed and reliability.
Point of Departure: Bike Shop
- We run a bike shop, both brick-and-mortar and online.
- Through an online competition we collected a number of new customer records.
- We want to send an email to the most promising new customers to advertise our new 2016 mountain bike model, the Polygon.
- Who are they?
The best team will win
- 4 teams volunteer to deliver the best proposal for the email campaign.
- Main deliverable: a proposed list of new customers to send an email to.
- Evaluate the best prediction model: use the ROC AUC (area under the curve) value.
- Present your results (next week):
  - What have you done and why? (Use your Knime workflows to explain.)
  - What is your conclusion and proposal?
  - Compile a few slides for a presentation of max. 10 minutes.
CRISP-DM Phases and Tasks
Business Understanding
- Determine Business Objectives: Background. Business Objectives. Business Success Criteria.
- Assess Situation: Inventory of Resources. Requirements, Assumptions and Constraints. Risks and Contingencies. Terminology. Costs and Benefits.
- Determine Data Mining Goals: Data Mining Goals. Data Mining Success Criteria.
- Produce Project Plan: Project Plan. Initial Assessment of Tools and Techniques.
Data Understanding
- Collect Initial Data: Initial Data Collection Report.
- Describe Data: Data Description Report.
- Explore Data: Data Exploration Report.
- Verify Data Quality: Data Quality Report.
Data Preparation
- Select Data: Rationale for Inclusion/Exclusion.
- Clean Data: Data Cleaning Report.
- Construct Data: Derived Attributes. Generated Records.
- Integrate Data: Merged Data.
- Format Data: Reformatted Data.
- Dataset: Dataset Description.
Modelling
- Select Modelling Technique: Modelling Technique. Modelling Assumptions.
- Generate Test Design: Test Design.
- Build Model: Parameter Settings. Models. Model Description.
- Assess Model: Model Assessment. Revised Parameter Settings.
Evaluation
- Evaluate Results: Assessment of Data Mining Results w.r.t. Business Success Criteria. Approved Models.
- Review Process: Review of Process.
- Determine Next Steps: List of Possible Actions. Decision.
Deployment
- Plan Deployment: Deployment Plan.
- Plan Monitoring and Maintenance: Monitoring and Maintenance Plan.
- Produce Final Report: Final Report. Final Presentation.
- Review Project: Experience Documentation.
Available Data
- Sheet ExistingCustomers: use for model training and testing.
- Sheet NewCustomers: select promising email recipients.
- https://wiwi.hs-duesseldorf.de/personen/thomas.zeutschler/seiten/default.aspx
Knime Sample Implementation
- Beat the teacher: Area Under Curve = 0.756
- https://wiwi.hs-duesseldorf.de/personen/thomas.zeutschler/seiten/default.aspx
- The Receiver Operating Characteristic (ROC) is a graphical plot that illustrates the performance of a binary classifier as its discrimination threshold is varied.
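To make the ROC definition concrete, here is a minimal Python sketch of how the curve and its area could be computed from predicted scores. It is not the Knime workflow from the lecture; the function names `roc_points` and `auc` and the toy labels and scores are assumptions made up for illustration.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), one per distinct threshold, strict to lenient."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # Lowering the threshold step by step is the same as walking down the ranking.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Records with equal scores share one threshold, so emit one point per tie group.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: 1 = buyer, 0 = non-buyer, scores = predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(labels, scores)), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why a single number such as 0.756 can be used to compare models.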
Want to beat your teacher? (AUC 0.756)
- Do you have a full understanding of the business problem?
- What about data quality? Do we need further data preparation?
- What is the class of the problem to solve (tip: cheat sheet)?
- How do you select the right / best prediction model?
Cheating
Two Class Classification
Two Class Classification: Introduction
- Also called binary classification.
- Statistical problem: classify the elements of a given set into two groups by applying a certain classification method.
- Applications in economics:
  - Customer selection, e.g. whom to send an email?
  - Portfolio decisions, e.g. which stocks or products to buy?
  - Any kind of yes/no assignment
- Application in medical testing: does a patient have a certain disease or not?
Two Class Classification: Similar Problems
- Super-problem: statistical classification
- One-class (unary) classification: identify specific elements among others. Applications: outlier detection, anomaly detection, novelty detection.
- Multi-class (multinomial) classification: classify the elements of a given set into more than two groups by applying a certain classification method. Applications: clustering, attribute assignment, anything with more than 2 classes.
Two Class Classification: Confusion Matrix
- Purpose: evaluate the performance of a certain classification algorithm.
- Bike buyer? Predicted class (Yes / No) versus actual class (Yes / No).
Two Class Classification: Confusion Matrix
- Purpose: evaluate the performance of a certain classification algorithm.
- Bike buyer? Predicted class (Yes / No) versus actual class (Yes / No).
- Correct: true positives, true negatives.
- Error: false positives, false negatives.
Two Class Classification: Confusion Matrix
- Purpose: evaluate the performance of a certain classification algorithm.
- Bike buyer? Population = 3,017
- Predicted class (Yes / No) versus actual class (Yes / No); cell counts: 96, 204, 77, 2,640
Two Class Classification: Confusion Matrix
- Purpose: evaluate the performance of a certain classification algorithm.
- Cells of the matrix (predicted condition versus real condition, over the total population):
  - Predicted positive, real positive: true positive
  - Predicted positive, real negative: false positive (type I error)
  - Predicted negative, real positive: false negative (type II error)
  - Predicted negative, real negative: true negative
- Prevalence = Σ Condition positive / Σ Total population
- True Positive Rate (TPR) = Σ True positive / Σ Condition positive (also called Sensitivity, Recall)
- False Positive Rate (FPR) = Σ False positive / Σ Condition negative (also called Fall-out)
- False Negative Rate (FNR) = Σ False negative / Σ Condition positive (also called Miss rate)
- True Negative Rate (TNR) = Σ True negative / Σ Condition negative (also called Specificity, SPC)
- Accuracy (ACC) = (Σ True positive + Σ True negative) / Σ Total population
- Positive Predictive Value (PPV) = Σ True positive / Σ Test outcome positive (also called Precision)
- False Discovery Rate (FDR) = Σ False positive / Σ Test outcome positive
- False Omission Rate (FOR) = Σ False negative / Σ Test outcome negative
- Negative Predictive Value (NPV) = Σ True negative / Σ Test outcome negative
- Positive Likelihood Ratio (LR+) = TPR / FPR
- Negative Likelihood Ratio (LR−) = FNR / TNR
- Diagnostic Odds Ratio (DOR) = LR+ / LR−
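As a worked example, the ratios above can be applied to the four counts of the bike-buyer confusion matrix (96, 204, 77 and 2,640). The following Python sketch is only an illustration: the function name `confusion_metrics` is made up, and the assignment of counts to cells (TP = 96, FN = 204, FP = 77, TN = 2,640, i.e. actual class in rows) is an assumption. Accuracy depends only on the two correct cells, so it holds whichever way the two error counts are assigned.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute the standard ratios of a 2x2 confusion matrix.

    Assumes every denominator is non-zero (no empty row or column).
    """
    condition_positive = tp + fn
    condition_negative = fp + tn
    predicted_positive = tp + fp
    predicted_negative = fn + tn
    total = condition_positive + condition_negative

    tpr = tp / condition_positive   # sensitivity, recall
    fpr = fp / condition_negative   # fall-out
    fnr = fn / condition_positive   # miss rate
    tnr = tn / condition_negative   # specificity
    lr_plus = tpr / fpr
    lr_minus = fnr / tnr
    return {
        "prevalence": condition_positive / total,
        "TPR": tpr, "FPR": fpr, "FNR": fnr, "TNR": tnr,
        "ACC": (tp + tn) / total,
        "PPV": tp / predicted_positive,   # precision
        "FDR": fp / predicted_positive,
        "FOR": fn / predicted_negative,
        "NPV": tn / predicted_negative,
        "LR+": lr_plus, "LR-": lr_minus,
        "DOR": lr_plus / lr_minus,
    }


# ASSUMPTION: cell assignment of the slide counts (see lead-in).
metrics = confusion_metrics(tp=96, fp=77, fn=204, tn=2640)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this assumed assignment the accuracy is about 0.907 while the true positive rate is only 0.32, which illustrates why accuracy alone says little when one class is rare.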
Classification Method Comparison
- Linearly separable pattern: binary (2-class) classification
- http://tjo-en.hatenablog.com/entry/2014/01/06/234155
Classification Method Comparison
- Linearly inseparable pattern: binary classification for a simple XOR pattern
- http://tjo-en.hatenablog.com/entry/2014/01/06/234155
Classification Method Comparison
- Linearly separable pattern: 3-class classification
- http://tjo-en.hatenablog.com/entry/2014/01/06/234155
Classification Method Comparison
- Linearly inseparable pattern: binary classification for a complex XOR pattern
- http://tjo-en.hatenablog.com/entry/2014/01/06/234155
Classification Method Comparison
- 4-class classification for a complex pattern
- http://tjo-en.hatenablog.com/entry/2014/01/06/234155
Classification Method Comparison
- Try to understand the pattern of the data...
  - by applying visual data analysis
  - by applying pairwise comparison of attributes
- Is your data linearly separable?
  - Yes: Logistic Regression or Neural Networks; be cautious with Decision Tree or Random Forest.
  - No: Random Forest or SVM.
  - ???: Random Forest. It offers a good balance of generalization and accuracy, and its computational cost is relatively low.
- But: Neural Networks can (but need not) be the best solution, and they are not easy to tune to deliver good results (many parameters).
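Besides visual inspection, one rough way to probe linear separability is to train a simple perceptron and see whether it ever reaches zero training errors. The sketch below is an illustration under assumptions, not a method from the lecture: the function name `perceptron_separates` and the epoch budget are made up. A `True` result shows that a separating line exists; a `False` result only means none was found within the budget.

```python
def perceptron_separates(points, labels, max_epochs=1000):
    """Return True if a perceptron reaches zero training errors within max_epochs."""
    weights = [0.0] * (len(points[0]) + 1)  # last weight acts as the bias
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(points, labels):
            features = list(x) + [1.0]          # append constant bias input
            target = 1 if y == 1 else -1
            activation = sum(w * f for w, f in zip(weights, features))
            if target * activation <= 0:        # misclassified or on the boundary
                weights = [w + target * f for w, f in zip(weights, features)]
                errors += 1
        if errors == 0:
            return True
    return False


corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]   # linearly separable pattern
xor_labels = [0, 1, 1, 0]   # the simple XOR pattern, not linearly separable

print(perceptron_separates(corners, and_labels))
print(perceptron_separates(corners, xor_labels))
```

For the XOR pattern no straight line separates the two classes, so the perceptron never finishes an epoch without errors; this is the situation in which Random Forest or SVM are the suggested choices.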
Decision Tree Learning
Decision Tree Learning
- A supervised learning method.
- Purpose: predict the value of a certain target variable of an item based on observations of other variables from other items.
- If the target variable takes values from a finite set, we call it a classification tree; otherwise a regression tree.
- Leaves represent class labels, whereas branches represent conjunctions of features (variables) that lead to those class labels.
- Figure: Decision tree (partial) for the bike sales sample.
Decision Tree Learning
- A decision tree describes data, not decisions. It can be used as input for decision making, e.g. a prediction.
- Computation: recursive partitioning
  - Recursively split the data set into subsets based on an attribute-value test (greedy algorithm).
  - The recursion is complete when all items in the subset at a node have the same value of the target variable, or when splitting no longer adds value to the predictions.
  - This approach is called top-down induction of decision trees.
- Different algorithms and metrics have been developed to solve the core problem in decision tree generation: which variable best splits the set of items at each step?
- Greedy algorithm: making the locally optimal choice at each stage of the recursive process.
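The recursive partitioning described above can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm Knime implements: it handles nominal attributes only, uses Gini impurity as the split metric, and does no pruning. The names `build_tree` and `predict`, the attribute names and the five toy records are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def build_tree(rows, target, attributes):
    """Greedy top-down induction: returns a label (leaf) or (attribute, branches, majority)."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority                      # pure node or nothing left to test

    def weighted_gini(attribute):
        groups = {}
        for row in rows:
            groups.setdefault(row[attribute], []).append(row[target])
        return sum(len(g) / len(rows) * gini(g) for g in groups.values())

    best = min(attributes, key=weighted_gini)   # locally optimal (greedy) choice
    if weighted_gini(best) >= gini(labels) - 1e-12:
        return majority                      # splitting no longer adds value

    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, target, remaining)
    return (best, branches, majority)


def predict(tree, row):
    """Follow the branches until a leaf; fall back to the node majority for unseen values."""
    while isinstance(tree, tuple):
        attribute, branches, majority = tree
        if row[attribute] not in branches:
            return majority
        tree = branches[row[attribute]]
    return tree


# Hypothetical toy records, not taken from the ExistingCustomers sheet.
rows = [
    {"commute": "short", "owns_car": "no", "buyer": "yes"},
    {"commute": "short", "owns_car": "yes", "buyer": "yes"},
    {"commute": "long", "owns_car": "no", "buyer": "yes"},
    {"commute": "long", "owns_car": "yes", "buyer": "no"},
    {"commute": "long", "owns_car": "yes", "buyer": "no"},
]
tree = build_tree(rows, "buyer", ["commute", "owns_car"])
print(predict(tree, {"commute": "long", "owns_car": "yes"}))
```

Because each split is chosen only by its immediate impurity reduction, the resulting tree is not guaranteed to be the globally smallest or most accurate one.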
Decision Tree Learning in Knime
- Metric (quality measure) for splitting:
  - Gini index (Gini impurity): given m possible class values v_i with i in {1, 2, ..., m}, let f_i be the fraction of items labeled with the value v_i. The Gini impurity is $I_G(f) = \sum_{i=1}^{m} f_i (1 - f_i) = 1 - \sum_{i=1}^{m} f_i^2$.
  - Information gain ratio: based on the entropy* of the information. Information Gain = Entropy(parent) - weighted sum of Entropy(children).
- *Entropy: the expected value of the information.
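The two quality measures can be computed by hand for a candidate split. The Python sketch below is illustrative only: the function names are made up, and the gain ratio follows the common textbook definition (information gain divided by the split information). Whether Knime computes it in exactly this way is not confirmed here.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """1 - sum of squared class fractions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def information_gain(parent, children):
    """Entropy(parent) minus the size-weighted sum of Entropy(children)."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


def gain_ratio(parent, children):
    """Information gain normalised by the entropy of the split sizes (textbook definition)."""
    n = len(parent)
    split_info = -sum((len(child) / n) * math.log2(len(child) / n)
                      for child in children if child)
    if split_info == 0:
        return 0.0
    return information_gain(parent, children) / split_info


# Hypothetical node with two buyers and two non-buyers, split perfectly in two.
parent = ["yes", "yes", "no", "no"]
children = [["yes", "yes"], ["no", "no"]]
print(gini_impurity(parent), entropy(parent))
print(information_gain(parent, children), gain_ratio(parent, children))
```

The normalisation in the gain ratio penalises attributes that split the data into many small subsets, such as an attribute with a unique value per customer.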
Decision Tree Learning in Knime
- Pruning method: pruning reduces tree size and avoids overfitting, which increases the generalization performance and thus the prediction quality. "Minimal Description Length" (MDL) pruning is available, or pruning can be switched off.
- Reduced Error Pruning: only relevant if execution speed matters. Otherwise switch it off.
- Skip nominal columns without domain information: always switch on. This ensures that columns with too many nominal values (e.g. the customer name in the bike sales sample) are automatically skipped.
Bike Sales Solutions
Bike Sales using Decision Tree
Bike Sales using Optimized Random Forest
Result Comparison: Decision Tree vs. Optimized Random Forest
Bike Sales: reevaluation by common sense
- Just 2000 new customers? Let's send everyone an email.
Lecture Summary & Homework
Lessons Learned
- Try to understand the business problem end-to-end.
- Try to think beyond the scope of your current knowledge and work. That's analytical thinking.
- Even simple-looking analytical problems may get tricky.
- You must follow multiple analytical paths to find the best solution.
Homework
- Read the post "Classification performance comparison": http://tjo-en.hatenablog.com/entry/2014/01/06/234155
- Read the article "Predicting Good Probabilities With Supervised Learning": http://machinelearning.wustl.edu/mlpapers/paper_files/icml2005_niculescu-mizilc05.pdf
Any Questions?