V. Kumar, Andrew Petersen
Instructor's Presentation Slides

Chapter Six: Customer Churn
Introduction

The most effective way to manage customer churn is to understand the causes or determinants of customer churning behavior, predict which customers are most likely to leave, and use promotions or other strategies to encourage them to stay (provided that they are likely to be profitable to keep).

The focus will be on the following two questions:
- What are the drivers of customer churn?
- Given that a customer has not left the firm yet, when is the customer likely to end the relationship?
Review of Customer Churn Models

Research Interest   Specification                             Estimation   Representative Studies
Churn or not        Binomial logit                            MLE          Kim and Yoon (2004)
                    Hierarchical logistic regression                       Capraro, Broniarczyk, and Srivastava (2003)
                    Logistic regression                                    Buckinx and Van den Poel (2005);
                                                                           Ahn, Han, and Lee (2006);
                                                                           Brockett, Golden, Guillen, Nielsen, Parner, and Perez-Marin (2008)
                    ARD neural network                        Bayesian     Buckinx and Van den Poel (2005)
                    Random forests                            ---          Buckinx and Van den Poel (2005)
                    Bagging & boosting classification trees   ---          Lemmens and Croux (2006)
                    Cost-sensitive classifier                 ---          Glady, Baesens, and Croux (2009)
                    Time-series regression                    MLE          Danaher (2002)
                    Hazard                                                 Dekimpe and Degraeve (1997);
                                                                           Jamal and Bucklin (2006)
                    Proportional hazard                                    Van den Poel and Larivière (2004);
                                                                           Brockett, Golden, Guillen, Nielsen, Parner, and Perez-Marin (2008)
Data for Empirical Examples

A dataset titled: Customer Churn

Variable          Description
Customer          Customer number (from 1 to 500)
Duration          The time in days that the acquired prospect has been or was a customer, right-censored at 730 days
Censor            1 if the customer was still a customer at the end of the observation window, 0 otherwise
Avg_Ret_Exp       Average number of dollars spent per month on marketing efforts to try to retain that customer
Avg_Ret_Exp_SQ    Square of the average number of dollars spent per month on marketing efforts to try to retain that customer
Total_Crossbuy    The total number of categories the customer has purchased during the customer's lifetime
Total_Freq        The total number of purchase occasions the customer had with the firm in the customer's lifetime
Total_Freq_SQ     The square of the total number of purchase occasions the customer had with the firm in the customer's lifetime
Industry          1 if the prospect is in the B2B industry, 0 otherwise
Revenue           Annual sales revenue of the prospect's firm (in millions of dollars)
Employees         Number of employees in the prospect's firm
Customer Churn

\[ U_{jn} = U(z_{jn}, s_n), \qquad j \in \{\text{churn}, \text{no churn}\} \]

\[ U_{jn} = V_{jn} + e_{jn} \]

\[ \Pr(\text{churn} \mid n) = \Pr\left(U_{\text{churn},n} > U_{\text{no churn},n}\right) \]
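The utility comparison above becomes a usable churn probability once a distribution is chosen for the random components. The sketch below assumes they are i.i.d. extreme-value distributed, which is not stated on this slide but is the assumption behind the binomial logit listed in the review table. The utility values are hypothetical.

```python
import math

def churn_probability(v_churn: float, v_no_churn: float) -> float:
    """Pr(U_churn > U_no_churn) under i.i.d. extreme-value errors (binary logit).

    Assumption: the error distribution is not given on the slide.
    """
    # Only the difference in deterministic utilities matters.
    return 1.0 / (1.0 + math.exp(-(v_churn - v_no_churn)))

# Hypothetical deterministic utilities V_jn for one customer n.
p_equal = churn_probability(0.0, 0.0)   # both alternatives equally attractive
p_stay = churn_probability(-1.0, 1.0)   # staying is more attractive than churning
```

With equal utilities the customer is as likely to churn as to stay; the probability falls as the utility of staying rises relative to the utility of churning.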
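Customer Churn

\[ y_{it} = X_{it}\beta + \varepsilon_{it}, \qquad i = 1, 2, \ldots, n, \quad t = 1, 2, \ldots, T \]

\[ \varepsilon_{it} = \mu_i + \eta_{it}, \qquad E[\varepsilon_{it}] = 0, \qquad \mathrm{var}(\varepsilon_{it}) = \sigma_\mu^2 + \sigma_\eta^2 = \sigma^2, \qquad \varepsilon_{it} \sim N(0, \sigma^2) \]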
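The variance expression on this slide implies that the error is the sum of a customer-specific effect and an idiosyncratic shock. A small simulation can check that decomposition. This is only a sketch: the two components are assumed independent and normal, and the standard deviations and sample sizes are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

SIGMA_MU, SIGMA_ETA = 1.0, 2.0      # hypothetical standard deviations
N_CUSTOMERS, N_PERIODS = 2000, 10   # hypothetical panel dimensions

errors = []
for _ in range(N_CUSTOMERS):
    mu_i = random.gauss(0.0, SIGMA_MU)          # customer effect, constant over t
    for _ in range(N_PERIODS):
        eta_it = random.gauss(0.0, SIGMA_ETA)   # idiosyncratic shock for period t
        errors.append(mu_i + eta_it)

sample_mean = statistics.fmean(errors)
sample_var = statistics.pvariance(errors)
theoretical_var = SIGMA_MU ** 2 + SIGMA_ETA ** 2   # sigma_mu^2 + sigma_eta^2
```

The sample variance should land close to the theoretical value, and the sample mean close to zero.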
Customer Churn

\[ A_{it} = \alpha y_{it} + X_{it}\theta + W_{it}\gamma + \omega_{it} \ge 0, \]

\[ A_{it} = \alpha (X_{it}\beta + \varepsilon_{it}) + X_{it}\theta + W_{it}\gamma + \omega_{it} = R_{it}\delta + \tau_{it} \]

\[ \Pr(a_{it} = 1) = \Pr(R_{it}\delta + \tau_{it} \ge 0) = \Phi(R_{it}\delta) \quad \text{and} \quad \Pr(a_{it} = 0) = 1 - \Phi(R_{it}\delta), \]
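The probability on this slide is a standard normal CDF evaluated at a linear index, which can be computed with the error function from the standard library. The sketch below assumes the composite error has unit variance, as the use of Phi without a scale term suggests. The regressors and coefficients are hypothetical.

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Phi(z), the standard normal CDF, written in terms of erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_indicator_one(r_it, delta):
    """Pr(a_it = 1) = Phi(R_it . delta) for one customer-period."""
    index = sum(r * d for r, d in zip(r_it, delta))
    return standard_normal_cdf(index)

# Hypothetical regressors (first entry is an intercept) and coefficients.
r_example = [1.0, 0.5, 2.0]
delta_example = [-0.2, 0.4, 0.1]

p_one = prob_indicator_one(r_example, delta_example)
p_zero = 1.0 - p_one   # Pr(a_it = 0)
```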
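Customer Churn

\[ f(t; \lambda) = \lambda e^{-\lambda t}, \qquad F(t; \lambda) = 1 - e^{-\lambda t}; \qquad h(t; \lambda) = \frac{f(t; \lambda)}{1 - F(t; \lambda)} = \lambda. \]

\[ L_i(t_{i,1}, t_{i,2}) = \left[ \frac{S(t_{i,1} - 1) - S(t_{i,1})}{S(t_{i,2})} \right]^{1 - d_i} \left[ \frac{S(t_{i,1} - 1)}{S(t_{i,2})} \right]^{d_i}, \]

\[ LL = \sum_{i=1}^{N} \left\{ (1 - d_i) \ln\left[ \frac{e^{-\lambda (t_{i,1} - 1)} - e^{-\lambda t_{i,1}}}{e^{-\lambda t_{i,2}}} \right] - d_i \lambda (t_{i,1} - 1) + d_i \lambda t_{i,2} \right\}, \]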
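The log-likelihood for the exponential model can be written as a short function. The sketch below reads each customer's contribution as a ratio of survival terms with S(t) = exp(-lambda t), raised to the power 1 - d_i or d_i. The slide does not define d_i further, so the code makes no claim about which case is the censored one. The observations are hypothetical.

```python
import math

def exponential_log_likelihood(lam, observations):
    """Sum of ln L_i under a constant hazard lam.

    observations: iterable of (t1, t2, d) tuples mirroring t_{i,1}, t_{i,2}, d_i.
    """
    total = 0.0
    for t1, t2, d in observations:
        if d == 0:
            # ln[(S(t1 - 1) - S(t1)) / S(t2)] with S(t) = exp(-lam * t)
            numerator = math.exp(-lam * (t1 - 1)) - math.exp(-lam * t1)
            total += math.log(numerator / math.exp(-lam * t2))
        else:
            # ln[S(t1 - 1) / S(t2)] = -lam * (t1 - 1) + lam * t2
            total += -lam * (t1 - 1) + lam * t2
    return total

# Hypothetical observations: (t1, t2, d)
sample = [(5, 0, 0), (8, 0, 1), (3, 1, 0)]
ll_value = exponential_log_likelihood(0.1, sample)
```

Evaluating the function on a grid of lambda values, or passing its negative to a numerical optimizer, would give the maximum likelihood estimate.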
Customer Churn

\[ h_j(t \mid x_{it}, i \in j) = h_{0j}(t) \exp(x_{it}\beta_j) \]

\[ h_{0j}(t) = \frac{1}{\sigma_j} \, t^{\left(\frac{1}{\sigma_j} - 1\right)} \exp(\beta_{0j}) \]

\[ L(i \mid i \in j) = S_j(t \mid x_{it}, i \in j)^{1 - d_{it}} \, f_j(t \mid x_{it}, i \in j)^{d_{it}} \]

\[ L(i) = \sum_{j=1}^{J} L(i \mid i \in j) \Pr(j), \qquad j = 1, 2, \ldots, J \]
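The segment-level hazard and the mixture likelihood on this slide can be combined in a few lines. The sketch below makes a simplifying assumption that is not on the slide: the covariates are constant over time, so the cumulative hazard has the closed form t^(1/sigma_j) exp(beta_0j + x beta_j). The segment parameters and segment probabilities are hypothetical.

```python
import math

def weibull_ph_survival_and_density(t, x, beta, beta0, sigma):
    """Return (S_j(t | x), f_j(t | x)) for one segment, assuming time-invariant x."""
    xb = sum(xi * bi for xi, bi in zip(x, beta))
    shape = 1.0 / sigma
    # Baseline hazard times the covariate effect.
    hazard = shape * t ** (shape - 1.0) * math.exp(beta0) * math.exp(xb)
    # Integral of the hazard from 0 to t.
    cumulative_hazard = t ** shape * math.exp(beta0 + xb)
    survival = math.exp(-cumulative_hazard)
    return survival, hazard * survival

def mixture_likelihood(t, d, x, segments):
    """L(i) = sum over j of L(i | i in j) * Pr(j).

    segments: list of dicts with keys 'beta', 'beta0', 'sigma', 'prob'.
    d = 1 selects the density term, d = 0 the survival term.
    """
    total = 0.0
    for seg in segments:
        s, f = weibull_ph_survival_and_density(
            t, x, seg["beta"], seg["beta0"], seg["sigma"]
        )
        total += (s ** (1 - d)) * (f ** d) * seg["prob"]
    return total

# Two hypothetical segments with different baseline hazards.
example_segments = [
    {"beta": [0.2], "beta0": -1.0, "sigma": 0.8, "prob": 0.6},
    {"beta": [-0.1], "beta0": -2.0, "sigma": 1.2, "prob": 0.4},
]
example_likelihood = mixture_likelihood(3.0, 1, [1.0], example_segments)
```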
Empirical Example: Customer Churn

Objectives:
- Determine the drivers of customer churn.
- Predict the expected duration of the customers who have yet to churn.
- Determine the predictive accuracy of the model.

The example uses the Customer Churn dataset (500 customers, with Duration right-censored at 730 days) and the variables listed under Data for Empirical Examples.
Empirical Example: Customer Churn

\[ \ln(\text{Duration}_i) = X_i\beta + \sigma\varepsilon_i \]

where $\ln(\text{Duration}_i)$ is the natural log of the duration of customer $i$, $X_i$ is the vector of time-invariant independent variables for customer $i$, $\beta$ is the vector of parameters to be estimated, $\sigma$ is the scale parameter, and $\varepsilon_i$ is the random disturbance term.

Variable          Estimate   Standard Error   P > ChiSq
Intercept          5.770     0.052            < 0.0001
Avg_Ret_Exp        0.009     0.001            < 0.0001
Avg_Ret_Exp_SQ    -0.0002    0.00001          < 0.0001
Total_Crossbuy     0.098     0.007            < 0.0001
Total_Freq         0.028     0.007            < 0.0001
Total_Freq_SQ     -0.001     0.0003           0.0050
Industry          -0.028     0.019            0.1372
Revenue            0.004     0.001            < 0.0001
Employees          0.0004    0.00001          < 0.0001
Scale (σ)          0.158     0.007
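To see how the estimates translate into a predicted duration, the linear index can be evaluated for a single customer and exponentiated. The sketch below uses the rounded coefficients from the table, takes exp(X_i beta) as the prediction, and ignores the scale term. The customer profile is hypothetical and not drawn from the dataset.

```python
import math

# Rounded estimates as reported in the table.
COEFFICIENTS = {
    "Intercept": 5.770,
    "Avg_Ret_Exp": 0.009,
    "Avg_Ret_Exp_SQ": -0.0002,
    "Total_Crossbuy": 0.098,
    "Total_Freq": 0.028,
    "Total_Freq_SQ": -0.001,
    "Industry": -0.028,
    "Revenue": 0.004,
    "Employees": 0.0004,
}

def predicted_log_duration(customer):
    """X_i beta for one customer; squared terms are derived from the raw values."""
    features = dict(customer)
    features["Intercept"] = 1.0
    features["Avg_Ret_Exp_SQ"] = customer["Avg_Ret_Exp"] ** 2
    features["Total_Freq_SQ"] = customer["Total_Freq"] ** 2
    return sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)

# Hypothetical customer profile.
customer = {
    "Avg_Ret_Exp": 20.0,
    "Total_Crossbuy": 3,
    "Total_Freq": 10,
    "Industry": 1,
    "Revenue": 30.0,
    "Employees": 500,
}

log_duration = predicted_log_duration(customer)
duration_days = math.exp(log_duration)   # predicted duration in days
```

Because the coefficients are rounded, predictions from this sketch will differ slightly from those produced by the fitted model.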
Duration ratio

The ratio of survival times between the current case and the baseline is the following:

\[ \frac{T(X_i + \delta)}{T(X_i)} = \exp\big(((X_i + \delta) - X_i)\beta\big) = \exp(\beta\delta) \]

where $T(\cdot)$ is the survival time given by the hazard model, $X_i$ is the value of the focal independent variable for customer $i$, $\delta$ is the change in the value of the independent variable, and $\exp(\cdot)$ is the exponential function. For variables that also enter as a squared term, the ratio depends on the current level of the variable.

Variable          Duration Ratio
Avg_Ret_Exp       exp(0.0088 - 0.0004*Avg_Ret_Exp)
Total_Crossbuy    1.103
Total_Freq        exp(0.027 - 0.002*Total_Freq)
Revenue           1.004
Employees         1.0004
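The duration ratios follow directly from the AFT coefficients. For a variable with a squared term, a one-unit increase changes the log duration by b1 + b2((x + 1)^2 - x^2) = (b1 + b2) + 2 b2 x, which is where the level-dependent expressions come from. The sketch below uses the rounded estimates reported for the AFT model; the function names are illustrative.

```python
import math

def linear_ratio(beta, delta=1.0):
    """T(x + delta) / T(x) for a variable that enters linearly."""
    return math.exp(beta * delta)

def quadratic_ratio(b1, b2, x, delta=1.0):
    """T(x + delta) / T(x) when both x and x**2 enter the model."""
    return math.exp(b1 * delta + b2 * ((x + delta) ** 2 - x ** 2))

# Rounded estimates reported for the AFT model.
crossbuy_ratio = linear_ratio(0.098)
ret_exp_ratio_at_10 = quadratic_ratio(0.009, -0.0002, 10.0)
ret_exp_ratio_at_22 = quadratic_ratio(0.009, -0.0002, 22.0)
```

With these rounded coefficients, one more dollar of monthly retention spending lengthens the expected duration only while Avg_Ret_Exp is below about 22, since the ratio equals 1 at that level and falls below 1 beyond it.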
The predictive accuracy of the model

\[ E(\text{Duration}_i) = \exp(X_i\beta) \]

                  Predicted Churn
Actual Churn      0       1       Total
0                 231     38      269
1                 37      194     231
Total             268     232     500

In this case the sum of the diagonal is 425, so the model is accurate 85.0% of the time (425/500).
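The accuracy figure can be reproduced from the classification table. The sketch below assumes rows are actual churn and columns are predicted churn, which matches the row and column totals. It also computes two quantities that are not on the slide: the share of actual churners the model identifies, and the accuracy of always predicting the larger actual class.

```python
# Rows: actual churn (0, 1). Columns: predicted churn (0, 1).
confusion = [
    [231, 38],
    [37, 194],
]

total = sum(sum(row) for row in confusion)
correct = confusion[0][0] + confusion[1][1]   # sum of the diagonal
accuracy = correct / total

# Share of actual churners that the model predicts as churners.
churn_hit_rate = confusion[1][1] / sum(confusion[1])

# Accuracy of always predicting the larger actual class, for comparison.
majority_baseline = max(sum(row) for row in confusion) / total
```

Comparing the accuracy with the majority-class baseline gives a sense of how much the model adds over a naive rule.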
Implementation & Summary

- PROC LIFEREG in SAS is used to estimate the Accelerated Failure Time (AFT) model that explains the drivers of customer churn.
- Customer churn can be considered a negative outcome of the customer retention process.
- In its simplest form, churn modeling is a probability model of whether a customer will churn or not, and it can be estimated with a logit model.
- Neural networks, bagging and boosting of classification trees, cost-sensitive classifiers, time-series techniques, and hazard models can also be adopted.