The data warehouse concept, model development with Enterprise Miner and implementation
Dr. Miltiadis Sarakinos
Swisscom Fixnet AG, Switzerland
Analytical CRM: analysing customers and understanding their behaviour
Data Analysis
Data Mining
- Predictive Modelling
- Clustering/Segmentation
- CLTV
Campaign Evaluation
- Response analysis
- Channel analysis
- Skill analysis
- Effectiveness of data mining models and campaigns
Market Research
Campaign Design
- Target group definition
- Offer design
- Customer contact programs
- Channel, Skill Management
- Control mechanisms
Campaign Execution
- Dialog programs
- Retention management
- Prevention management
- Winback
The Data Warehouse Concept
[Diagram. Recoverable labels: Operational Systems, Data Mining, Business Analytical Engine, Campaign Design, Model Analysis, Billing, Samples, Campaign Management, External Data, DWH, Data Mart, CCDB, Direct Mail, Outbound, Customer, Reporting, Campaign Performance, Sales Force]
Data Mining for CRM
- Data Mining
- Inbound Campaigns: Call centers
- Outbound Campaigns: Marketing
- Customer Contact: special offer, inform, etc.
Typical Predictive Modelling Scheme
[Timeline labels: Model Calibration, Nov to Apr; Model, May]
- Apply model to predict future
- Deliver: customer score for churn/winback and product affinity
Business Rule to select offering
Example: Customer id 12345, inbound campaign
- Inputs: Churn Risk, Prod A Affinity, Prod B Affinity, Age, Value
- Business Ruleset
- Result: Propose Prod A
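The slide does not show the ruleset itself. The following is a minimal Python sketch of how model scores could be mapped to a proposal. The field names, thresholds, score values and the "Retention offer" outcome are all hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class CustomerScores:
    customer_id: int
    churn_risk: float   # model score in [0, 1] (assumed scale)
    affinity_a: float   # affinity score for product A
    affinity_b: float   # affinity score for product B
    age: int            # carried along, unused in this toy rule
    value: float        # carried along, unused in this toy rule


def propose_offer(c, min_affinity=0.5, high_churn=0.7):
    """Hypothetical ruleset: the thresholds are illustrative only."""
    # Customers at high churn risk get a retention action first.
    if c.churn_risk >= high_churn:
        return "Retention offer"
    # Otherwise propose the product with the highest affinity score.
    best, score = max(
        (("Prod A", c.affinity_a), ("Prod B", c.affinity_b)),
        key=lambda pair: pair[1],
    )
    if score >= min_affinity:
        return f"Propose {best}"
    return "No offer"


customer = CustomerScores(12345, churn_risk=0.2, affinity_a=0.8,
                          affinity_b=0.3, age=40, value=60.0)
print(propose_offer(customer))
```

Keeping the rule as a small pure function makes it easy to call from an inbound call-center application at contact time.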
Closed Loop!
- Record customer response, including refusals
- Do not duplicate offers:
  - Do not disturb the customer
  - Save costs (mailing, call agent time, etc.)
- Response analysis (mine reaction data, campaign results): meaningful only if a sufficiently large sample is available
Data Mining
The Name of the Game (1): Data Quality
- Internal data are delivered with very high quality: no missing or wrong variables
- Demographic data: at least 70-80% coverage
The Name of the Game (2): Data Preparation
- Join data from different sources: Call center, Internet, Revenues, Creditworthiness, Portfolio, Sociodem
[Diagram: the sources feed into SAS]
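The presentation does the join in SAS. As an illustration only, here is a minimal Python sketch of the same idea: every source is left-joined onto the customer base by a customer key, and customers missing from a source receive empty values. The source names, column names and the `customer_id` key are assumptions.

```python
def join_sources(base_ids, sources):
    """Left-join every source onto the customer base.

    sources maps a source name to {customer_id: {column: value}}.
    Customers missing from a source receive None for its columns.
    """
    table = {cid: {} for cid in base_ids}
    for name, rows in sources.items():
        # Collect every column that appears in this source.
        columns = sorted({col for row in rows.values() for col in row})
        for cid in table:
            row = rows.get(cid, {})
            for col in columns:
                # Prefix with the source name to avoid column clashes.
                table[cid][f"{name}_{col}"] = row.get(col)
    return table


sources = {
    "callcenter": {1: {"contacts": 3}},
    "revenue": {1: {"monthly": 55.0}, 2: {"monthly": 20.0}},
}
mart = join_sources([1, 2], sources)
print(mart[2])
```

A left join keeps every customer in the base even when a source has no record for them, which is what makes coverage gaps visible as missing values.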
The Name of the Game (3): Data Preparation
- Aggregate variables according to business sense, past experience, instinct: e.g. aggregate over call times, destinations, tariffs, etc.
- Create several derived variables: Subscription / Revenue, Inland Traffic / Total Traffic
- Avoid categorical attributes with a large number of values. Group values within one attribute in a business-relevant way: e.g. instead of nationality, create a flag "foreigner"
- Result: the initial number of variables used for model calibration explodes. Currently around 450, and growing
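The two ratio variables and the foreigner flag named on this slide can be sketched in Python as follows. The input field names, the home country code "CH" and the choice to return 0.0 for a zero denominator are assumptions made for illustration.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def derive(row, home_country="CH"):
    """Build derived variables from one customer record (field names assumed)."""
    return {
        # Share of revenue that comes from the subscription fee.
        "subscription_share": safe_ratio(row["subscription"], row["revenue"]),
        # Share of traffic that is inland.
        "inland_share": safe_ratio(row["inland_traffic"], row["total_traffic"]),
        # Collapse a many-valued attribute into one business-relevant flag.
        "foreigner": int(row["nationality"] != home_country),
    }


record = {"subscription": 25.0, "revenue": 50.0,
          "inland_traffic": 30.0, "total_traffic": 40.0,
          "nationality": "IT"}
print(derive(record))
```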
The Name of the Game (4): What about model time stability?
- The challenge: the model must generalise over time. Make a model that predicts the future instead of just describing the past
- What about customers going on vacation? Seasonal effects? February is 10% shorter than March
- Combine data over several months. Filter out short-term fluctuations and noise
- Observation: our model profiles do not change dramatically over time, hence we believe our models to be stable
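One possible way to handle the month-length effect and short-term noise is to convert monthly totals to per-day rates and average them over several months. The slide only says to combine data over several months, so the per-day normalisation is an assumption, as are the function names and the example year.

```python
import calendar


def daily_rate(total, year, month):
    """Convert a monthly total into a per-day rate."""
    days_in_month = calendar.monthrange(year, month)[1]
    return total / days_in_month


def smoothed_usage(monthly_totals):
    """Average per-day usage over several months.

    monthly_totals is a list of (year, month, total) tuples.
    """
    rates = [daily_rate(total, year, month)
             for year, month, total in monthly_totals]
    return sum(rates) / len(rates)


# February (28 days) and March (31 days) with the same daily behaviour.
history = [(2003, 2, 280.0), (2003, 3, 310.0)]
print(smoothed_usage(history))
```

With raw totals, the February figure would look about 10% lower than March even though the customer's daily behaviour is unchanged.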
The Name of the Game (5): Variable Reduction
- Typical variable selection methods: decision tree, logistic regression, variable selection node (chi-square), but also variables which from experience play a role
- Remove highly correlated variables
- Prune the variable set for the final model: try to achieve maximal precision with the minimal variable set. The model is more likely to be predictive this way
- Several iterations to clean up and believe the result
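Removing highly correlated variables can be sketched as a greedy filter on the Pearson correlation. This is an illustration in plain Python, not the Enterprise Miner procedure. The 0.9 threshold, the variable names and the sample values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant variable has no defined correlation; treat it as 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def drop_correlated(columns, threshold=0.9):
    """Keep a variable only if |r| with every already kept variable is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


columns = {
    "calls": [10, 20, 30, 40],
    "minutes": [21, 39, 62, 80],   # nearly proportional to calls
    "age": [50, 23, 41, 35],
}
print(drop_correlated(columns))
```

The result depends on the order of the variables, so in practice the variables believed to matter most from experience would be listed first.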
The Models
Churn/Winback
[Chart labels: Market Share Swisscom (100%), Other Carriers; "Monopoly falls": High Churn Rates; "Today": Low Churn Rates, Winbacks]
Churn/Winback Modeling
- The consumer has several choices: carrier preselection, call-by-call
- Multiline (ISDN) users can have a different preferred carrier for each number
- Customer motives are affected by various criteria and offers: ADSL, frequent flyer miles, etc.
- Churn is difficult and irrelevant to define in terms of a switch to a different preferred carrier
- Therefore: define it in terms of revenue loss/gain
- Churner profiles are not strongly different from loyal customer profiles. Monthly rates are O(1%). Strongly differentiated segments would decay quickly and hence self-destruct
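A revenue-based definition could look like the following Python sketch. The slide gives no thresholds or observation windows, so the 50% cut-offs, the two-window comparison and the label names are purely hypothetical.

```python
def revenue_label(before, after, loss=0.5, gain=0.5):
    """Label a customer from mean monthly revenue in two windows.

    before and after are mean monthly revenues in an earlier and a later
    window. The relative thresholds are illustrative, not from the source.
    """
    if before == 0:
        # No earlier revenue: any later revenue counts as a gain.
        return "winback" if after > 0 else "stable"
    change = (after - before) / before
    if change <= -loss:
        return "churn"
    if change >= gain:
        return "winback"
    return "stable"


print(revenue_label(100.0, 40.0))
```

Defining the target on revenue sidesteps the question of which carrier a multiline customer prefers for each number.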
Typical Flow
[Figure: Neural Network Training Progress]
Churn/Winback Model: Residential Customers
[Chart: cumulative lift (0 to 6) versus score percentile (0 to 100) for the churn and winback models, showing the improvement with respect to random selection]
Example: the 10% of customers with the highest score contain 30% of potential churners, the 30% with the highest score capture about 60% of potential churners, etc.
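Cumulative gain is the share of all targets found among the highest-scored customers, and cumulative lift is that gain divided by the share of customers selected, so capturing 30% of churners in the top 10% corresponds to a lift of 3. A minimal Python sketch of both quantities follows. The function names and the toy scores are assumptions.

```python
def _top_k(n, fraction):
    """Number of customers in the top fraction, at least one."""
    return max(1, round(n * fraction))


def cumulative_gain(scores, targets, fraction):
    """Share of all positives found in the top fraction of customers by score."""
    ranked = sorted(zip(scores, targets), key=lambda p: p[0], reverse=True)
    k = _top_k(len(ranked), fraction)
    hits = sum(target for _, target in ranked[:k])
    total = sum(targets)
    return hits / total if total else 0.0


def cumulative_lift(scores, targets, fraction):
    """Gain divided by the share of customers selected; 1.0 equals random selection."""
    k = _top_k(len(scores), fraction)
    return cumulative_gain(scores, targets, fraction) / (k / len(scores))


# Ten customers, already listed from highest to lowest score; 1 marks a churner.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
targets = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(cumulative_gain(scores, targets, 0.1), cumulative_lift(scores, targets, 0.1))
```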
Product Upselling
- Analog customers to ISDN
  - Due to ADSL, ISDN is no longer for faster surfing
  - Different value proposition: incoming call number recognition, several simultaneous callers (families), SMS
- Ebill: personalized secure login
  - Phone call list, daily update, manage account
  - Customer can suppress the paper bill
  - Free product, but: cost saving, increases retention
Product Upselling
[Chart: cumulative gain (0 to 100) versus score percentile (0 to 100) for the ebill and ISDN models, showing the improvement with respect to random selection (no model)]
Example: the 10% of customers with the highest score contain 34% of potential ISDN users, the 30% with the highest score capture about 65% of potential ISDN users, etc.
Model Evaluation
- Models have been implemented in the last few months
- Response rate analysis requires larger samples
Thank you