V. Kumar, Andrew Petersen. Instructor's Presentation Slides

Chapter Six: Customer Churn

Introduction

The most effective way to manage customer churn is to understand the causes, or determinants, of churning behavior, predict which customers are most likely to leave, and use promotions or other strategies to encourage them to stay (provided they are likely to be profitable to keep).

The focus is on two questions:
- What are the drivers of customer churn?
- Given that a customer has not yet left the firm, when is the customer likely to end the relationship?

Review of Customer Churn Models

Research interest | Specification | Estimation | Representative studies
Churn or not | Binomial logit | MLE | Kim and Yoon (2004)
 | Hierarchical logistic regression | | Capraro, Broniarczyk and Srivastava (2003)
 | Logistic regression | | Buckinx and Van den Poel (2005); Ahn, Han, and Lee (2006); Brockett, Golden, Guillen, Nielsen, Parner and Perez-Marin (2008)
 | ARD neural network | Bayesian | Buckinx and Van den Poel (2005)
 | Random forests | --- | Buckinx and Van den Poel (2005)
 | Bagging & boosting classification trees | --- | Lemmens and Croux (2006)
 | Cost-sensitive classifier | --- | Glady, Baesens, and Croux (2009)
 | Time-series regression | MLE | Danaher (2002)
 | Hazard | | Dekimpe and Degraeve (1997); Jamal and Bucklin (2006)
 | Proportional hazard | | Van den Poel and Larivière (2004); Brockett, Golden, Guillen, Nielsen, Parner and Perez-Marin (2008)

Data for Empirical Examples

A dataset titled: Customer Churn

Variable | Description
Customer | Customer number (from 1 to 500)
Duration | The time in days that the acquired prospect has been or was a customer, right-censored at 730 days
Censor | 1 if the customer was still a customer at the end of the observation window, 0 otherwise
Avg_Ret_Exp | Average number of dollars spent per month on marketing efforts to try to retain that customer
Avg_Ret_Exp_SQ | Square of the average number of dollars spent per month on marketing efforts to try to retain that customer
Total_Crossbuy | The total number of categories the customer has purchased during the customer's lifetime
Total_Freq | The total number of purchase occasions the customer had with the firm in the customer's lifetime
Total_Freq_SQ | The square of the total number of purchase occasions the customer had with the firm in the customer's lifetime
Industry | 1 if the prospect is in the B2B industry, 0 otherwise
Revenue | Annual sales revenue of the prospect's firm (in millions of dollars)
Employees | Number of employees in the prospect's firm

Customer Churn

\[ U_{jn} = U(z_{jn}, s_n), \qquad j \in \{\text{churn}, \text{no churn}\} \]

\[ U_{jn} = V_{jn} + e_{jn} \]

\[ \Pr(\text{churn} \mid j) = \Pr\left(U_{\text{churn},n} > U_{\text{no churn},n}\right) \]

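Customer Churn

\[ y_{it} = X_{it}\beta + \varepsilon_{it}, \qquad i = 1, 2, \ldots, n \ \text{and} \ t = 1, 2, \ldots, T \]

\[ \varepsilon_{it} = \mu_i + \eta_{it}, \qquad E[\varepsilon_{it}] = 0, \qquad \operatorname{var}(\varepsilon_{it}) = \sigma_\mu^2 + \sigma_\eta^2 = \sigma^2, \qquad \varepsilon_{it} \sim N(0, \sigma^2) \]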

Customer Churn

\[ A_{it} = \alpha y_{it} + X_{it}\theta + W_{it}\gamma + \omega_{it} \ge 0 \]

\[ A_{it} = \alpha(X_{it}\beta + \varepsilon_{it}) + X_{it}\theta + W_{it}\gamma + \omega_{it} = R_{it}\delta + \tau_{it} \]

\[ \Pr(a_{it} = 1) = \Pr(R_{it}\delta + \tau_{it} \ge 0) = \Phi(R_{it}\delta) \quad \text{and} \quad \Pr(a_{it} = 0) = 1 - \Phi(R_{it}\delta) \]
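The probabilities on this slide can be evaluated numerically once a value of the index R_it δ is available. The sketch below is only an illustration: it assumes the error term τ_it is standard normal (which is what the use of Φ suggests), and the index values in the loop are hypothetical, not taken from the slides.

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Phi(z), the standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def outcome_probabilities(index: float) -> tuple:
    """Return (Pr(a = 1), Pr(a = 0)) for a given value of the index R*delta."""
    p_one = standard_normal_cdf(index)
    return p_one, 1.0 - p_one

# Hypothetical index values, chosen only for illustration.
for r_delta in (-1.0, 0.0, 1.0):
    p1, p0 = outcome_probabilities(r_delta)
    print(f"R*delta = {r_delta:+.1f}: Pr(a=1) = {p1:.4f}, Pr(a=0) = {p0:.4f}")
```

An index of zero gives both outcomes probability 0.5, and larger index values push Pr(a = 1) toward 1.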

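Customer Churn

For the exponential distribution with rate λ:

\[ f(t; \lambda) = \lambda e^{-\lambda t}, \qquad F(t; \lambda) = 1 - e^{-\lambda t}, \qquad h(t; \lambda) = \frac{f(t; \lambda)}{1 - F(t; \lambda)} = \lambda \]

With survivor function S(t) = 1 - F(t), the likelihood contribution of customer i is

\[ L_i(t_{i,1}, t_{i,2}) = \left[ \frac{S(t_{i,1} - 1) - S(t_{i,1})}{S(t_{i,2})} \right]^{1 - d_i} \left[ \frac{S(t_{i,1} - 1)}{S(t_{i,2})} \right]^{d_i} \]

and the log-likelihood is

\[ LL = \sum_{i=1}^{N} \left\{ (1 - d_i) \ln\left[ \frac{e^{-\lambda (t_{i,1} - 1)} - e^{-\lambda t_{i,1}}}{e^{-\lambda t_{i,2}}} \right] - d_i \lambda (t_{i,1} - 1) + d_i \lambda t_{i,2} \right\} \]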
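A quick numerical check can make the constant-hazard property of the exponential distribution concrete. The sketch below assumes the standard exponential density f(t; λ) = λ exp(-λt) and distribution function F(t; λ) = 1 - exp(-λt); the rate value is hypothetical and used only for illustration.

```python
import math

def density(t: float, lam: float) -> float:
    """Exponential density f(t; lambda)."""
    return lam * math.exp(-lam * t)

def cdf(t: float, lam: float) -> float:
    """Exponential distribution function F(t; lambda)."""
    return 1.0 - math.exp(-lam * t)

def hazard(t: float, lam: float) -> float:
    """Hazard h(t) = f(t) / (1 - F(t)); constant for the exponential."""
    return density(t, lam) / (1.0 - cdf(t, lam))

lam = 0.002  # hypothetical daily churn rate, for illustration only
for t in (1.0, 100.0, 730.0):
    print(f"t = {t:6.1f} days: hazard = {hazard(t, lam):.6f}")
```

The hazard is the same at every t, which is why an exponential specification implies that the risk of churning does not depend on how long the customer has already stayed.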

Customer Churn

\[ h_j(t \mid x_{it}, i \in j) = h_{0j}(t) \exp(x_{it}\beta_j) \]

\[ h_{0j}(t) = \frac{1}{\sigma_j} \, t^{\left(\frac{1}{\sigma_j} - 1\right)} \exp(\beta_{0j}) \]

\[ L(it \mid i \in j) = S_j(t \mid x_{it}, i \in j)^{1 - d_{it}} \, f_j(t \mid x_{it}, i \in j)^{d_{it}} \]

\[ L(i) = \sum_{j=1}^{J} L(i \mid i \in j) \Pr(j), \qquad j = 1, 2, \ldots, J \]
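The segment-level hazard and the mixture over segments can be sketched in a few lines. This is a minimal illustration, assuming the baseline hazard reads h_0j(t) = (1/σ_j) t^(1/σ_j - 1) exp(β_0j) and that a customer's likelihood is the segment likelihoods weighted by the segment probabilities Pr(j). All function names and numeric inputs are hypothetical.

```python
import math

def baseline_hazard(t: float, sigma_j: float, beta0_j: float) -> float:
    """Baseline hazard for segment j: (1/sigma) * t**(1/sigma - 1) * exp(beta0)."""
    return (1.0 / sigma_j) * t ** (1.0 / sigma_j - 1.0) * math.exp(beta0_j)

def segment_hazard(t, x, beta_j, sigma_j, beta0_j):
    """Proportional hazard for segment j: h0j(t) * exp(x . beta_j)."""
    linear_predictor = sum(xk * bk for xk, bk in zip(x, beta_j))
    return baseline_hazard(t, sigma_j, beta0_j) * math.exp(linear_predictor)

def mixture_likelihood(segment_likelihoods, segment_probs):
    """L(i) = sum over j of L(i | i in j) * Pr(j)."""
    return sum(l * p for l, p in zip(segment_likelihoods, segment_probs))

# Hypothetical two-segment illustration.
print(segment_hazard(t=30.0, x=[1.0, 0.5], beta_j=[0.2, -0.1],
                     sigma_j=0.8, beta0_j=-4.0))
print(mixture_likelihood([0.02, 0.05], [0.6, 0.4]))
```

With σ_j = 1 the baseline reduces to the constant exp(β_0j), so the exponential case is a special case of this specification.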

Empirical Example: Customer Churn

The example has three objectives:
- Determine the drivers of customer churn.
- Predict the expected duration of the customers who have yet to churn.
- Determine the predictive accuracy of the model.

The analysis uses the Customer Churn dataset and the variables defined under Data for Empirical Examples: Customer, Duration, Censor, Avg_Ret_Exp, Avg_Ret_Exp_SQ, Total_Crossbuy, Total_Freq, Total_Freq_SQ, Industry, Revenue, and Employees.

Empirical Example: Customer Churn

\[ \ln(\text{Duration}_i) = X_i\beta + \sigma\varepsilon_i \]

where ln(Duration_i) is the natural log of the duration of customer i, X_i contains the time-invariant independent variables for customer i, β is a vector of parameter estimates, σ is the estimated scale parameter, and ε_i is the random disturbance term.

Variable | Estimate | Standard Error | P > ChiSq
Intercept | 5.770 | 0.052 | < 0.0001
Avg_Ret_Exp | 0.009 | 0.001 | < 0.0001
Avg_Ret_Exp_SQ | -0.0002 | 0.00001 | < 0.0001
Total_Crossbuy | 0.098 | 0.007 | < 0.0001
Total_Freq | 0.028 | 0.007 | < 0.0001
Total_Freq_SQ | -0.001 | 0.0003 | 0.0050
Industry | -0.028 | 0.019 | 0.1372
Revenue | 0.004 | 0.001 | < 0.0001
Employees | 0.0004 | 0.00001 | < 0.0001
Scale (σ) | 0.158 | 0.007 |
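To see how the estimates translate into a predicted duration, the sketch below plugs the rounded coefficients from the table into the linear predictor X_i β and exponentiates it, which is the point prediction the slides use when assessing predictive accuracy. The customer profile is hypothetical, and because the coefficients are rounded the result is only approximate.

```python
import math

# Rounded estimates as reported in the table.
ESTIMATES = {
    "Intercept": 5.770,
    "Avg_Ret_Exp": 0.009,
    "Avg_Ret_Exp_SQ": -0.0002,
    "Total_Crossbuy": 0.098,
    "Total_Freq": 0.028,
    "Total_Freq_SQ": -0.001,
    "Industry": -0.028,
    "Revenue": 0.004,
    "Employees": 0.0004,
}

def log_duration(customer: dict) -> float:
    """Linear predictor X_i * beta; the squared terms are derived here."""
    x = dict(customer)
    x["Avg_Ret_Exp_SQ"] = x["Avg_Ret_Exp"] ** 2
    x["Total_Freq_SQ"] = x["Total_Freq"] ** 2
    return ESTIMATES["Intercept"] + sum(
        coef * x[name] for name, coef in ESTIMATES.items() if name != "Intercept"
    )

def predicted_duration(customer: dict) -> float:
    """Point prediction exp(X_i * beta), in days."""
    return math.exp(log_duration(customer))

# Hypothetical customer profile, for illustration only.
customer = {"Avg_Ret_Exp": 20.0, "Total_Crossbuy": 3, "Total_Freq": 10,
            "Industry": 1, "Revenue": 30.0, "Employees": 200}
print(f"log-duration: {log_duration(customer):.3f}")
print(f"predicted duration: {predicted_duration(customer):.0f} days")
```

Because Duration is right-censored at 730 days in the data, a prediction above 730 is best read as "expected to remain a customer through the observation window" rather than as a precise date.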

Duration ratio

The ratio of survival times between the current case and the baseline is:

\[ \frac{T(X_i + \delta)}{T(X_i)} = \exp\big(((X_i + \delta) - X_i)\beta\big) = \exp(\delta\beta) \]

where T(.) is the survival time predicted by the model, X_i is the value of the focal independent variable for customer i, δ is the change in the value of the independent variable, and exp(.) is the exponential function.

Variable | Duration Ratio
Avg_Ret_Exp | exp(0.0088 - 0.0004*Avg_Ret_Exp)
Total_Crossbuy | 1.103
Total_Freq | exp(0.027 - 0.002*Total_Freq)
Revenue | 1.004
Employees | 1.0004
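The duration ratios can be reproduced from the reported coefficients. The sketch below assumes that, for a variable entering linearly, the ratio for a one-unit increase is exp(β), and that for a variable with a squared term the slides use the marginal (derivative-based) form exp(β1 + 2·β2·x). Function names are hypothetical.

```python
import math

def duration_ratio_linear(beta: float, delta: float = 1.0) -> float:
    """Ratio T(X + delta) / T(X) = exp(delta * beta) for a linear term."""
    return math.exp(delta * beta)

def duration_ratio_quadratic(beta_linear: float, beta_squared: float, x: float) -> float:
    """Marginal ratio at level x when x and x**2 both enter the model."""
    return math.exp(beta_linear + 2.0 * beta_squared * x)

# Linear variables, using the rounded estimates from the results table.
print(round(duration_ratio_linear(0.098), 3))    # Total_Crossbuy
print(round(duration_ratio_linear(0.004), 3))    # Revenue
print(round(duration_ratio_linear(0.0004), 4))   # Employees

# Avg_Ret_Exp at a few hypothetical spending levels.
for spend in (10.0, 22.0, 40.0):
    print(spend, duration_ratio_quadratic(0.0088, -0.0002, spend))
```

With these coefficients the ratio for Avg_Ret_Exp is above 1 only while spending is below 0.0088 / 0.0004 = 22 dollars per month; beyond that level, additional retention spending is associated with shorter predicted durations.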

The predictive accuracy of the model

\[ E(\text{Duration}) = \exp(X_i\beta) \]

Actual Churn | Predicted Churn = 0 | Predicted Churn = 1 | Total
0 | 231 | 38 | 269
1 | 37 | 194 | 231
Total | 268 | 232 | 500

In this case the sum of the diagonal is 425, so the model is accurate 85.0% of the time (425/500).
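The accuracy figure follows directly from the cell counts. The sketch below recomputes it; the dictionary layout and key order (first class, second class) are an assumption made for illustration, and overall accuracy does not depend on which dimension holds the actual versus the predicted class.

```python
# Cell counts from the classification table, keyed by (class_a, class_b).
confusion = {
    (0, 0): 231,
    (0, 1): 38,
    (1, 0): 37,
    (1, 1): 194,
}

total = sum(confusion.values())
correct = sum(count for (a, b), count in confusion.items() if a == b)
accuracy = correct / total

print(f"correct = {correct}, total = {total}, accuracy = {accuracy:.1%}")
```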

Implementation & Summary

- PROC LIFEREG in SAS is used to estimate the Accelerated Failure Time (AFT) model that explains the drivers of customer churn.
- Customer churn can be considered a negative outcome of the customer retention process.
- In its simplest form, churn modeling is probability modeling of whether a customer will churn or not, and it can be estimated with a logit model.
- Neural networks, bagging and boosting classification trees, cost-sensitive classifiers, time-series techniques, and hazard models can also be adopted.