Free Trial - BIRT Analytics - IAAs

11. Predict Customer Gender

Once we log in to the BIRT Analytics Free Trial, we see some predefined advanced analyses ready to be used. We call these saved analyses Instant Advanced Analyses (IAAs). If we double-click My folders > Demo Retail Customer Analytics, we see a list of seventeen (17) saved analyses that introduce what BIRT Analytics can do for Customer Analytics in retail commerce (our demo database is based on a home improvement retailer example).

These analyses cover eleven distinct areas of Customer Analytics that are of high value to any retailer, and try to answer questions like:

1. How are my sales performing by product category?
2. An RFM approach: Who are our best RFM customers? What do they look like?
3. Advanced segmentation of your customers in order to focus your marketing efforts.
4. Who are your churned customers? What do they look like? And, more importantly, which of your loyal customers are most likely to churn in the future?
5. How are your products associated in a basket? What is the best next offer when a customer has products A and B in their basket?
6. Discover new cross-sell opportunities.
7. Will the value of my customers grow or decrease in the coming months?
8. What is the voice of the customer saying about us on social media?
9. Is there any relationship between the data in Twitter interactions?
10. Can we predict the total audience a tweet could reach?
11. Can we complete empty gender values in our customers' data to target them with the correct marketing campaign?

Now we are going to answer the eleventh question: Can we complete empty gender values in our customers' data to target them with the correct marketing campaign? When we double-click this analysis, BIRT Analytics goes to Analytics > Advanced and shows the Parameters tab, with the starting data of the model.

This logistic regression (Analytics > Advanced > Logistic regression) is defined using certain parameters:

- A Domain: all the Customers that have a Gender defined (values are not null).
- A dependent variable, the one we want to predict: whether a given customer is Female (a new column with a binary response, 1 (yes) or 0 (no)).
- A selection of independent variables: customer age (continuous), plus three yes/no variables: whether this customer is an internet customer, whether this customer allowed us to send them emails, and whether this customer is a store customer.
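The parameter choices above can be sketched outside BIRT Analytics as a small data-preparation step. This is only an illustration: the records, the field names (`age`, `internet`, `mailable`, `store`, `gender`) and the helper functions are hypothetical and are not BIRT Analytics column names or APIs.

```python
# Hypothetical customer records; field names are illustrative only.
customers = [
    {"age": 34, "internet": "Y", "mailable": "N", "store": "Y", "gender": "F"},
    {"age": 51, "internet": "N", "mailable": "Y", "store": "Y", "gender": "M"},
    {"age": 27, "internet": "Y", "mailable": "Y", "store": "N", "gender": None},
]


def is_yes(flag: str) -> int:
    """Encode a Y/N flag as 1/0."""
    return 1 if flag == "Y" else 0


def to_features(row: dict) -> list:
    """Age stays numeric; the three Y/N flags become 0/1 indicators."""
    return [
        row["age"],
        is_yes(row["internet"]),
        is_yes(row["mailable"]),
        is_yes(row["store"]),
    ]


# Domain: only customers whose gender is defined (not null).
domain = [row for row in customers if row["gender"] is not None]

# Independent variables and binary dependent variable (1 = Female, 0 = not).
X = [to_features(row) for row in domain]
y = [1 if row["gender"] == "F" else 0 for row in domain]
```

The third record has no gender, so it is excluded from the Domain used for training; it is exactly the kind of customer the finished model would later be applied to.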

The Results tab shows the calculated logistic regression. BIRT Analytics provides the logistic function that recreates the predictive model. Given the values of the four independent variables, this equation returns a prediction of the probability that the customer is female. Below the equation there is a 5-star rating of the goodness of fit of the equation compared with the real data of the original Domain. In our case we have 5 stars, which means that this predictive model rates as very accurate. The rating is based on the p-value of the Chi-squared test shown in the Statistics tab.
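As a rough sketch of what such a logistic function computes, the snippet below evaluates the standard logistic form p = 1 / (1 + exp(-z)), where z is the intercept plus the weighted sum of the four predictors. The coefficient values are hypothetical placeholders chosen for illustration. They are not the values BIRT Analytics reports for the demo database, and the function name is an assumption.

```python
import math

# Hypothetical coefficients for illustration only; NOT the values from the demo.
example_coef = {
    "intercept": -0.5,
    "age": 0.01,
    "internet": 0.3,
    "mailable": 0.05,
    "store": 0.6,
}


def predict_female_probability(age, internet, mailable, store, coef):
    """Return the modelled probability that a customer is female.

    age is numeric; internet, mailable and store are 0/1 indicators.
    """
    z = (
        coef["intercept"]
        + coef["age"] * age
        + coef["internet"] * internet
        + coef["mailable"] * mailable
        + coef["store"] * store
    )
    return 1.0 / (1.0 + math.exp(-z))


# A 40-year-old internet and store customer who is not mailable.
p = predict_female_probability(40, 1, 0, 1, example_coef)
```

Whatever the coefficients, the result always lies strictly between 0 and 1, which is why it can be read as a probability.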

This third tab shows all the tests and data used to rate the goodness of fit of the logistic model. The tab is divided into two main parts: the upper part holds the global fit tests (which evaluate the whole equation), and the lower table shows specific tests for each coefficient of the equation (including the intercept).

This kind of regression is globally evaluated by two distinct tests:

- Chi-squared test and its p-value
- Log-likelihood ratio statistic test

The first one needs to be a high value, or equivalently its p-value needs to be as small as possible (under 0.01 we can assume that the model fits the training sample). The log-likelihood is always negative and needs to be close to zero. In our example the first test is good, with a fairly high Chi-squared value and a p-value smaller than 0.01, but on the other hand the log-likelihood is a large negative value. That means that this model doesn't fit as well as it seems at first sight.
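The two global tests can be illustrated with a short sketch. It assumes that the Chi-squared value is the usual likelihood-ratio statistic, 2 * (LL_model - LL_null), compared against a chi-squared distribution with one degree of freedom per predictor (four here). The source does not state how BIRT Analytics computes it, so treat this as an assumption. The function names are hypothetical, and the closed-form tail probability used below is valid only for an even number of degrees of freedom.

```python
import math


def log_likelihood(y, p):
    """Sum of log-probabilities the model assigns to the observed 0/1 outcomes."""
    return sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(ll_model, ll_null, df):
    """Return (chi-squared statistic, p-value) for model vs. intercept-only."""
    stat = 2.0 * (ll_model - ll_null)
    return stat, chi2_sf_even_df(stat, df)


# Illustrative log-likelihood values, not taken from the demo database.
stat, p_value = likelihood_ratio_test(ll_model=-10.0, ll_null=-15.0, df=4)
```

Because the log-likelihood is a sum over every record in the Domain, its magnitude also grows with the number of customers, which is worth keeping in mind when judging how far from zero it is.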

Each coefficient in the logistic equation has its own tests:

- Standard error: a measure of the error we assume in the estimated coefficient when the logistic equation (the predictive model) is fitted to the real data.
- Odds ratio: measures the influence of the independent variable (the one related to that coefficient) on the dependent variable. The larger the ratio, the stronger the relationship between the dependent and the independent variable.
- Upper and lower confidence levels: defined for each coefficient and related to the Odds ratio.
- p-value of the log-likelihood ratio for the coefficient: it evaluates only this coefficient, but has the same interpretation as the global test.
- Significance level: based on the p-value of the log-likelihood ratio; a 0 to 5 scale that rates how relevant a given independent variable is in the equation.

One of the subtleties of logistic regression is that goodness of fit is not judged by a single test. Several tests have to be considered together to decide whether the model is good enough. In our example, analyzing each coefficient we can conclude that:

- Intercept: has a low standard error, but its Odds ratio is quite low, so this coefficient could be zero because of its low relevance in predicting the dependent variable.
- Age: this variable is only slightly relevant judging by its Odds ratio, although its significance level is the highest.
- Internet Customer EQ Y: this coefficient is more relevant than Age (a bigger Odds ratio), and it also has a high significance level.
- Mailable EQ Y: this categorical variable is the least relevant of all the predictors, because of its Odds ratio value.
- Store Customer EQ Y: this is the best, most relevant variable in the model. It has the highest Odds ratio and a high significance level.

This model could be applied in a new column to those customers that don't have a Gender assigned, to predict their probability of being female.
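To make the per-coefficient columns more concrete, the sketch below derives an odds ratio and its confidence bounds from a coefficient and its standard error. It assumes the conventional definitions: the odds ratio is exp(coefficient), and the bounds form a 95% Wald interval, exp(coefficient ± 1.96 × standard error). The source does not say which method BIRT Analytics uses, and the coefficient values and names here are hypothetical.

```python
import math


def odds_ratio_with_ci(coef, std_err, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for one coefficient.

    Assumes a Wald-type interval; z=1.96 corresponds to 95% confidence.
    """
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )


# Hypothetical (coefficient, standard error) pairs, for illustration only.
example_estimates = {
    "Age": (0.01, 0.002),
    "Internet Customer EQ Y": (0.30, 0.05),
    "Mailable EQ Y": (0.05, 0.04),
    "Store Customer EQ Y": (0.60, 0.06),
}

summary = {
    name: odds_ratio_with_ci(coef, se)
    for name, (coef, se) in example_estimates.items()
}
```

Under these definitions a coefficient of zero gives an odds ratio of exactly 1, meaning the variable does not change the odds; an interval that contains 1 is therefore consistent with the variable having no effect.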

If you want to know more about this data mining technique, you can find more documentation on logistic regression at: http://developer.actuate.com/resources/documentation/birt-analytics/4-4/

Copyright 2014 Actuate Corporation. All rights reserved. Actuate, legodo, BIRT ihub, BIRT ihub F-Type, BIRT Analytics, Actuate Customer Communications Suite, The Actuate Document Accessibility Appliance, BIRT ondemand, BIRT Viewer Toolkit, and the Actuate logo are trademarks or registered trademarks of Actuate Corporation and/or its affiliates in the U.S. and certain other countries. The use of the word partner or partnership does not imply a legal partnership relationship between Actuate and any other company. All other brands, names or trademarks mentioned may be trademarks of their respective owners.

Actuate Corporation
951 Mariners Island Boulevard
San Mateo, CA 94404
Tel: (+1) 888-422-8828
BIRTAnalytics@actuate.com
www.actuate.com/birtanalytics