Churn Prediction in MMORPGs: A Social Influence Based Approach



Similar documents
Analyzing Customer Churn in the Software as a Service (SaaS) Industry

Chapter 6. The stacking ensemble approach

How To Identify A Churner

CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES

SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS

Analyzing Human Behavior from Multiplayer Online Game Logs - A Knowledge Discovery Approach -

DATA MINING TECHNIQUES AND APPLICATIONS

Mining Telecommunication Networks to Enhance Customer Lifetime Predictions

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

DORMANCY PREDICTION MODEL IN A PREPAID PREDOMINANT MOBILE MARKET : A CUSTOMER VALUE MANAGEMENT APPROACH

Network Interactions in Mobile Networks

Expert Systems with Applications

Prediction of Stock Performance Using Analytical Techniques

Making Sense of the Mayhem: Machine Learning and March Madness

Easily Identify Your Best Customers

Churn Prediction. Vladislav Lazarov. Marius Capota.

Data quality in Accounting Information Systems

Equity forecast: Predicting long term stock price movement using machine learning

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

IBM SPSS Direct Marketing 23

not possible or was possible at a high cost for collecting the data.

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

IBM SPSS Direct Marketing 22

Predicting Student Performance by Using Data Mining Methods for Classification

A Neural Network based Approach for Predicting Customer Churn in Cellular Network Services

User Modeling for Telecommunication Applications: Experiences and Practical Implications

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

Data Mining Yelp Data - Predicting rating stars from review text

Modeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry

Schneps, Leila; Colmez, Coralie. Math on Trial : How Numbers Get Used and Abused in the Courtroom. New York, NY, USA: Basic Books, p i.

Data Mining in Telecommunication

Social Media Mining. Data Mining Essentials

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Predicting Customer Churn in the Telecommunications Industry An Application of Survival Analysis Modeling Using SAS

Using Data Mining for Mobile Communication Clustering and Characterization

Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification

MHI3000 Big Data Analytics for Health Care Final Project Report

Predict Influencers in the Social Network

Churn Modeling for Mobile Telecommunications:

COMPARING NEURAL NETWORK ALGORITHM PERFORMANCE USING SPSS AND NEUROSOLUTIONS

Working with telecommunications

Azure Machine Learning, SQL Data Mining and R

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

Data Mining. Nonlinear Classification

Customer Churn Prediction, Segmentation and Fraud Detection in Telecommunication Industry

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

Knowledge Discovery and Data Mining

Predicting Flight Delays

International Journal of Advanced Computer Technology (IJACT) ISSN: PRIVACY PRESERVING DATA MINING IN HEALTH CARE APPLICATIONS

Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Recognizing Informed Option Trading

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH

Distributed forests for MapReduce-based machine learning

Business Analytics and Data Mining for CRM Business Analytics and Data Mining for CRM: Jumpstart workshop

Providing a Customer Churn Prediction Model Using Random Forest and Boosted TreesTechniques (Case Study: Solico Food Industries Group)

IBM SPSS Modeler Social Network Analysis 15 User Guide

Getting Even More Out of Ensemble Selection

E-commerce Transaction Anomaly Classification

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

Introduction to Support Vector Machines. Colin Campbell, Bristol University

Predict the Popularity of YouTube Videos Using Early View Data

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

A Neuro-Fuzzy Classifier for Customer Churn Prediction

Benchmarking of different classes of models used for credit scoring

Data Mining with R. Decision Trees and Random Forests. Hugh Murrell

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

Mimicking human fake review detection on Trustpilot

FEATURE SELECTIONS AND CLASSIFICATION MODEL FOR CUSTOMER CHURN

Data Mining Techniques in Customer Churn Prediction

A DUAL-STEP MULTI-ALGORITHM APPROACH FOR CHURN PREDICTION IN PRE-PAID TELECOMMUNICATIONS SERVICE PROVIDERS

How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK

Dynamic Predictive Modeling in Claims Management - Is it a Game Changer?

DIGITS CENTER FOR DIGITAL INNOVATION, TECHNOLOGY, AND STRATEGY THOUGHT LEADERSHIP FOR THE DIGITAL AGE

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

An Overview of Knowledge Discovery Database and Data mining Techniques

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100

Issues in Information Systems Volume 16, Issue IV, pp , 2015

Scalar Valued Functions of Several Variables; the Gradient Vector

Server Load Prediction

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

Data Mining Algorithms Part 1. Dejan Sarka

Mining Life Insurance Data for Customer Attrition Analysis

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

How To Predict Churn On A Telecom Social Network

Chapter 12 Discovering New Knowledge Data Mining

Leveraging Ensemble Models in SAS Enterprise Miner

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

DeuceScan: Deuce-Based Fast Handoff Scheme in IEEE Wireless Networks

Data mining and statistical models in marketing campaigns of BT Retail

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: X DATA MINING TECHNIQUES AND STOCK MARKET

Data Mining Practical Machine Learning Tools and Techniques

Customer Classification And Prediction Based On Data Mining Technique

Transcription:

Churn Prediction in MMORPGs: A Social Influence Based Approach Jaya Kawale Dept of Computer Science & Engg University Of Minnesota Minneapolis, MN 55455 Email: kawale@cs.umn.edu Aditya Pal Dept of Computer Science & Engg University Of Minnesota Minneapolis, MN 55455 Email: apal@cs.umn.edu Jaideep Srivastava Dept of Computer Science & Engg University Of Minnesota Minneapolis, MN 55455 Email: srivasta@cs.umn.edu Abstract Massively Multiplayer Online Role Playing Games (MMORPGs) are computer based games in which players interact with one another in the virtual world. Worldwide revenues for MMORPGs have seen amazing growth in last few years and it is more than a 2 billion dollars industry as per current estimates. Huge amount of revenue potential has attracted several gaming companies to launch online role playing games. One of the major problems these companies suffer apart from fierce competition is erosion of their customer base. Churn is a big problem for the gaming companies as churners impact negatively in the wordof-mouth reports for potential and existing customers leading to further erosion of user base. We study the problem of player churn in the popular MMORPG EverQuest II. The problem of churn prediction has been studied extensively in the past in various domains and social network analysis has recently been applied to the problem to understand the effects of the strength of social ties and the structure and dynamics of a social network in churn. In this paper, we propose a churn prediction model based on examining social influence among players and their personal engagement in the game. We hypothesize that social influence is a vector quantity, with components negative influence and positive influence. We propose a modified diffusion model to propagate the influence vector in the players network which represents the social influence on the player from his network. We measure a players personal engagement based on his activity patterns and use it in the modified diffusion model and churn prediction. Our method for churn prediction which combines social influence and player engagement factors has shown to improve prediction accuracy significantly for our dataset as compared to prediction using the conventional diffusion model or the player engagement factor, thus validating our hypothesis that combination of both these factors could lead to a more accurate churn prediction. I. INTRODUCTION Massively Multi-player Online Role Playing Games (MMORPGs) games is a genre of computer games in which players can assume a role or a fantasy character and interact with one another in a virtual game world. One of the distinguishing factors of role playing games is a persistent world and a never ending quest for exploration. MMORPGs have achieved significant growth in the past few years and according to Mmorg Chart 1 the total number of active subscriptions in 2008 was over 16 million. There are a number of popular game providers battling for the market share and some of the 1 http://www.mmogchart.com/ more prominent ones are World of Warcraft 2, Lineage 3, Final Fantasy XI 4, Eve Online 5 and EverQuest II 6. Churn analysis assumes importance for game publishers as it helps them understand the several factors leading to users leaving the game. It holds the key in understanding the behavior of players and the various factors that influence players to leave the game ranging from personal commitment, competing products, shifting interest to social influence. The target audience of MMORPGs is very large. By understanding such factors, the game provider can offer incentives to likely churners in order to keep them interested in the game. Additionally, acquiring new customers can be much more expensive than retaining the old ones. Churn Analysis has been widely studied several domains. A variety of churn analysis techniques have been developed as a solution to identifying the subset of customers who are likely to churn. In this paper we examine churn among players in the virtual world of EverQuest II. The virtual worlds of the online games provide a good indicator of understanding human behavior in the real world and this has led to increased interest from fields like social sciences, psychology and communications. A. Our Contribution We propose a churn prediction model that takes into account social influence of players and their engagement in the game. In our model we consider that player influence is a vector quantity (instead of scalar, as used in typical diffusion models) with two components: 1) Negative influence, 2) Positive influence. The two components of influence vector signify how strongly a player feels for and against the game. We argue that an influence vector models the real world more accurately. Based on players engagement, we propose a modified diffusion model (MDM), which takes into account the players game engagement and social influence from neighbors in influence propagation. 2 http://www.worldofwarcraft.com/index.xml 3 http://www.lineage.com/ 4 http://www.playonline.com/ff11us/index.shtml 5 http://www.eveonline.com/ 6 http://everquest2.station.sony.com/

II. LITERATURE REVIEW Churn Prediction is an important problem studied across several areas like banking, insurance, retailing, telecommunications, etc. A wide variety of techniques have been applied to predict churn in the diverse applications. A decision tree based approach has been most widely used in the churn prediction. Classification and decision trees have been used to determine either the class label or the churn risk. CHAMP [1] (Churn Analysis, Modeling, and Prediction) predicts churn factors for cellular phone customers using a decision tree model. Decision tree approach to predict churn using complaints data has been found to perform better in comparison with neural networks and regression [2]. Decision trees have been used to determine classification rules from which the most significant variables could be identified [3]. Another popular technique used for churn prediction is logistic regression [4], [5]. Latent semantic analysis has been used in predicting policy churn in insurance industries [6]. Survival analysis is another class of statistical techniques that is used to model time to event data. It is useful to provide answers to questions such as what fraction of a population will survive after a time interval. Survival analysis techniques have been used to predict churn in the telecommunications industry [7] and in predicting switching behavior in banking services [8]. [9] has used ordinal regression for churn prediction. Customer tenure is being modeled as an ordinal response variable to predict the time to churn. Some of the recent work in this area has used Support Vector Machines [10], [11], [12] and Random forests [11] as classifiers. [11] has compared the use of SVM, Random forests and LR in predicting churn in newspaper services. It has been shown that Random Forests perform better than SVM. Social Network Analysis has been used to predict customer churn in mobile networks [13]. Network analysis has been used in the past to identify influential in a network for targeting individuals in marketing campaigns [14]. [13] uses an energy propagation model to spread influence in the telecom social network. The use the frequency of calls between two people to define the strength of ties between two people. Using the underlying graph, they initiate a diffusion process where the churners are taken as a seed and they spread influence in the network. The amount of influence every neighbor of a churner gets depends upon his relative strength of the tie in the neighborhood. Once a non churner node accumulates sufficient amount of energy it is labeled as a churner and it starts propagating energy on its own. III. DATASET DESCRIPTION AND ANALYSIS We used game data from Sony Online Corporation s popular game EverQuest II, which provides a platform for players to group together with other players and participate in activities like complete quests, explore the world, kill monsters and gain treasures and experience. The dataset consists of complete session data for the month of August 2006. Session data contains the time and length of session, quests played by users, points received by the users. The data also contains list of churners for the month of August, September, October 2006. We use the dataset to build the network graph and model player engagement in the game. TABLE I GRAPH CHARACTERISTICS Characteristics Value Number of Nodes 6213 Number of Edges 153983 Average degree of Nodes 24.78 Average experience points shared 210897 We build the player graph based on grouping of players, where an edge exists between two players if they participate in the quest together. The weight of the edge indicates the strength of the tie between the two players, which in this case is the total number of points shared between them. Table I describes the graph characteristics of the dataset. The degree distribution of players in the graph follows a power law and is as shown in Fig. 1. Fig. 2 shows the total number of active players in a given month over the period of time. Fig. 1. Fig. 2. Degree distribution of all players Total number of active players The data provided by Sony also contains the list of EQ2 players who unsubscribed from the game in the month of august, september, october. Table II shows the number of churners for the given months. Fig. 2 shows the total number of active players in a given month.

TABLE II NUMBER OF CHURNERS Month Churners August 334 September 308 October 380 Network based analysis for churn prediction [13] have the basis that ties of an individual with other individuals in the network are good predictors of their interest in the games. They have successfully shown improvement in churn prediction using the energy propagation model. To understand the social effects of churn behavior in players, we examine the probability of nodes to churn in a network given that k of their immediate neighbors have churned. Fig.3 shows the probability of players from the August graph to churn in the subsequent months given that k of his neighbors have churned. As seen in the figure the probability of churn increases with increase in the number of churner neighbors. This explains that churn behavior has a social component. session lengths were longer in the initial periods than in the later periods of time. TABLE III BETA FUNCTION TO FIT THE PLOT OF CHURNERS Alpha vs Beta Percentage General Shape α = β 0.02% Symmetrical α < β 75% Positively skewed α > β 25% Negatively skewed Fig. 4. Average session length of non churners and churners Fig. 3. Probability to Churn in subsequent months A. Player Engagement Player engagement can be defined as the time spent by the player in the game. We measure player engagement using the session time and length. We hypothesize that the player engagement over a long span of time can be used to predict churn behavior. Fig. 4 shows the distribution of average session lengths for non churners and churners. It can be seen that the churners show a decreasing average session length as compared to non churners. This could partly explains their waning interest in the game. We further studied the individual plots of the august churners over a long period of time and tried to fit Beta function to it. Fig. 5 shows a sample distribution fit for a churner. A beta function is defined by two shape parameters α and β which determine the shape of the distribution. Table III-A shows the distribution of alpha and beta values of the Beta distribution. From the table it can be seen that for nearly 2/3 rd of the Churners the distribution was positively skewed. In other words it means that for nearly 2/3rd of the Churners, the Fig. 5. Positively skewed player engagement of a churner IV. METHODOLOGY In the previous section we identified two main aspects that are predominant in churners in our game dataset 1) decrease in player engagement of churners over time until they finally churn, 2) increase in churn propensity of players with the increase in the number of churner neighbors. Based on these two observations we hypothesize that the social influence and player engagement combined together can be good measures for churner prediction. We define churn prediction function, P that takes two factors to predict whether a player would churn in subsequent months or not. Churn Prediction = P (Player engagement, Social influence) (1)

A. Preliminary Definitions We represent player network in the form of a weighted undirected graph G(V, E) where V (G) is the set of vertices corresponding to the players and E(G) is the set of edges between the vertices, such that two vertices in V (G) share an edge if the corresponding players have played a game together and the weight of the edge equals the total number of points shared between the two players. Let N(x) = {y V (G) : (x, y) E(G)} be the set of neighbors of x. Let e xy represent the weight of the edge between x and y, e x be the sum of the weight of all edges from x. Let Φ be the function that models player s engagement, such that Φ x (t) represents x th player s engagement in the game at time t. We can define slope S x (t) of player x at time t as follows: S x (t) = dφ x(t) dt Φ x(t + δt) Φ x (t δt) δt We are more interested in the sign of the slope rather than the magnitude of it, hence the approximation. The main reason for not considering the magnitude of slope is that it would require us to calculate weighted average of slope and very large values of slope could skew the weighted values and the model would become biased towards some localized points. Also, to keep the model simple and computationally inexpensive, we consider just the sign of slope as an indicator of player s interest in the game. B. Modified Diffusion Model In our diffusion model, we consider every node to have an influence vector containing two components, namely, positive influence and negative influence. Negative influence represents how much user is influenced against the game whereas positive influence represents how much user is influence in favor of the game. Two valued influence vectors helps in modeling the real world more accurately. The intuitive argument in favor of influence vector is that when player s communicate they share their good/bad experiences with each other. So a player acquires both good and bad opinion about the game from others. A player strongly interested in the game would subdue the bad things he hear and remember the good things and vice-versa. Additionally, when he tries to influence other players, he would maintain the positive influence on him and try to spread positive influence on others. In standard diffusion model, a node has to give some of his influence to others, thereby leaving less influence with a node believing strongly. In our diffusion model, node preserves his positive influence as it is (he might gain more from his neighbors) and converts his negative influence to positive and spreads on his neighbors. So in a way user s positive influence is conserved and he is able to positively influence other nodes. We observe that the standard diffusion model do not show very encouraging results on our dataset, whereas the modified diffusion model shows much better results. The spread factor, γ is the portion of influence user transfers to his network. A user with increasing game engagement (2) would convert some of his negative influence to positive influence and spread γ proportion of converted influence amongst his neighbors in proportion with strength of tie with the neighbors. The total influence of the graph remains constant, whereas the positive influence or negative influence values change. We initialize churners with negative influence ni = 1 and positive influence pi = 0 and non-churners with ni = p and pi = 1 p initially. The following algorithm illustrates the propagation step for a given node x at time t: if S x (t) < 0 then {convert positive energy to negative and propagate} if ni(x) < pi(x) then i = ni(x) else i = pi(x) end if pi(x) = pi(x) i (1 γ) {spread i γ to neighbors} for y N(x) do ni(y) = ni(y) + i γ e xy e x end for end if {if S x (t) > 0, interchange ni(.) with pi(.) in above step} {if S x (t) = 0, do not consider this node} Modified diffusion model is run on player network of August 2006. The algorithm starts with initial time of 1st August 2006 taking discrete steps of 1 hour and finishes at 31st August 2006. At any given time, the algorithm iteratively runs on all graph nodes and applies the propagation step as described above. After the algorithm stops execution we store the influence vector, which represents the influence on this user from the player network. The influence vector is used for predicting the churners and non-churners. C. Churn Prediction We use the shape parameters of player engagement curve. The beta function fits the player engagement best as per our analysis and the α and β values of beta function constitute the shape parameters of the player engagement curve. We took shape parameters and the influence vector computed using the modified diffusion model and ran several classification algorithms in order to classify the nodes as churners and non churners. V. EXPERIMENT AND RESULTS The main premise of our approach is that churn prediction can be captured effectively using engagement of a player in the game and the social influence on a player by his gaming buddies. To validate our approach we ran experiments with three different models: 1) Simple Diffusion Model. 2) Classification based on Network and Player engagement.

Fig. 6. Alpha vs Beta plot of all players C. Modified Diffusion Model Finally, we run the Modified Diffusion Model using the player graph for the month of Aug 06. We propagate the influence vector for players based on their engagement curve slope over the time period of game play in the month of August. Table IX shows the parameters value that we used in our modified diffusion model. TABLE IX PARAMETER VALUES USED IN MDM Parameter Values Initial energy for Churners pe = 0, ne = 1 Initial energy for non Churners pe = 0.8, ne = 0.2 Spreading Factor 0.7 3) Modified Diffusion Model. A. Simple Diffusion Model The simple diffusion model is the same as defined in [13]. The parameters of the simple diffusion model are as shown in the Table IV. The prediction accuracy for the simple diffusion model is shown in Table VI. TABLE IV PARAMETER VALUES IN SIMPLE DIFFUSION MODEL Parameter Values Initial energy for Churners e = 1 Initial energy for non Churners e = 0 Spreading Factor 0.7 B. Classification based on Player Engagement and Network Statistics To study the effect of player engagement and his raw network characteristics on churn prediction, we built another model which consisted of the feature set as defined in Table V. We used WEKA [15] for classification of churners and nonchurners based on our training and test set as described in Table X. The shape parameters are the α and β parameters of the beta function that captures the player engagement the best. Fig. 6 show the plot of α and β for all the players. It indicates that churners are concentrated in a small region whereas nonchurners are spread out. The summary of the results using various classifiers is shown in Table VII. The improvement in prediction accuracy over the simple diffusion model shows that engagement features and raw network characteristics are better predictor of churners over our game dataset. TABLE V ENGAGEMENT AND RAW NETWORK CHARACTERISTICS The shape parameters to fit the session length curve were obtained by fitting beta function on the session plots from Jan 06 to Aug 06. We ran several classifiers on the final influence vector and the shape parameters. Table X shows the description of test and training data as used by all the classifiers. Table VIII shows the precision and recall values given by the various classification techniques. VI. CONCLUSION The results show that MDM outperformed both a Simple Diffusion Model and a Network and Engagement Feature based classification. A simple diffusion model is able to capture social influence among game players and a network and engagement feature based classification is able to capture player engagement features. An MDM scheme is able to effectively combine social influence and player engagement and provide a significant improvement in prediction accuracy. We however have not compared our approach with features that might not capture player engagement and network features. Such an approach could possibly give more prediction accuracy, however it was not the main focus of the paper. Our results show a significant improvement in prediction accuracy by combining social influence along with player engagement in churn prediction. A simple diffusion model is able to capture influence amongst game players. However churn in a game is also a function of the engagement of the player in the game. We effectively combine the two features to achieve significant amount of prediction accuracy improvement over a simple diffusion method. VII. FUTURE WORK For our future work we intend to do a deeper analysis of variables useful in capturing the engagement of a player in the game. For this paper we used the average length of sessions as an indicator of engagement in game play. However, we Feature Variable Alpha Beta Number of neighbors Number of churner neighbors Characteristic Engagement Engagement Network Network TABLE X TRAINING AND TESTING DATASET Dataset Size Number of Churners Training 4026 334 (Aug) Testing 2187 688 (Sept + Oct)

TABLE VI SIMPLE DIFFUSION MODEL Method Precision Recall Correct Predicted Total Predicted Simple Diffusion Model 17.9 11.2 77 430 TABLE VII NETWORK FEATURE SET WITH DIFFERENT CLASSIFIERS Method Precision Recall Correct Predicted Total Predicted AdaBoostM1 42.8 14.7 101 236 ADTree 42.9 12.2 84 196 JRip 43.2 18.8 129 299 J48 47.3 11.3 78 165 NaiveBayes 46.8 12.6 87 186 TABLE VIII MODIFIED DIFFUSION MODEL WITH DIFFERENT CLASSIFIERS Method Precision Recall Correct Predicted Total Predicted AdaBoostM1 50.1 29.8 205 409 ADTree 46.5 41.3 284 611 JRip 43.1 18.8 129 299 J48 38.5 21.5 148 384 NaiveBayes 49.7 23.3 160 322 could further use other variables in the EverQuest II data like the quests completed or the health of a character or the achievement points gained which would provide a deeper insight into engagement of a player. Another aspect we want to study is the impact of the current engagement of a player on the group and the impact of the engagement of the group on a player. If all the players in the group tend to move towards the mean engagement in the game, this would imply that a players engagement in the game is also a function of the groups engagement in the game. The influence propagation can possibly accommodate this. Classical models of Churn prediction provide a RFM (Recency Frequency and Money) analysis of Churn. In future, we will work analyzing player engagement from the RFM perspective. The recency and frequency values in our case could possibly be defined by the recency and frequency of sessions and the money value can be defined by the amount of time a player puts in the game play. ACKNOWLEDGEMENTS The research reported herein was supported by the National Science Foundation via award number IIS-0729421, and the Army Research Institute via award number W91WAW-08- C-0106. The data used for this research was provided by the SONY corporation. We gratefully acknowledge all our sponsors. The findings presented do not in any way represent, either directly or through implication, the policies of these organizations. We also thank Nishith Pathak, Muhammad Ahmad, Rakesh Ramakrishnan, Kyong Jin Shim, Colin Delong for their helpful suggestions and feedback. REFERENCES [1] P. Datta, B. Masand, D. R. Mani, and B. Li, Automated cellular modeling and prediction on a large scale, Artif. Intell. Rev., vol. 14, no. 6, pp. 485 502, 2000. [2] J. Hadden, A. Tiwari, R. Roy, and D. Ruta, Churn prediction using complaints data, in Proceedings Of World Academy Of Science, Engineering and Technology. [3] K. Ng and H. Liu, Customer retention via data mining, Artif. Intell. Rev., vol. 14, no. 6, pp. 569 590, 2000. [4] W. Buckinx and D. V. D. Poel, Customer base analysis: Partial defection of behaviorally-loyal clients in a non-contractual fmcg retail setting, Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 03/178, Ghent University, Faculty of Economics and Business Administration, May 2003. [5] D. L.. B. S. E. Jones, Michael A. ; Mothersbaugh, Switching barriers and repurchase intentions in services, Journal of Retailing, 2000. [6] K. Morik and H. Köpcke, Analysing customer churn in insurance data: a case study, in PKDD 04: Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, (New York, NY, USA), pp. 325 336, Springer-Verlag New York, Inc., 2004. [7] J. Lu, Predicting customer churn in the telecommunications industry an application of survival analysis modeling using sas, [8] G. I. Maria Mavri, Customer switching behaviour in greek banking services using survival analysis, Managerial Finance, vol. 34, pp. 186 197, 2008. [9] R. K. Gopal and S. K. Meher, Customer churn time prediction in mobile telecommunication industry using ordinal regression, in PAKDD, pp. 884 889, 2008. [10] Customer churn prediction using improved one-class support vector machine, vol. 3584/2005, pp. 300 306. [11] V. Coussement K, Churn prediction in subscription services: An application of support vector machines while comparing two parameterselection techniques, vol. 34(1), pp. 313 27, 2008. [12] T.-S. Lee, C.-C. Chiu, Y.-C. Chou, and C.-J. Lu, Mining the customer credit using classification and regression tree and multivariate adaptive regression splines, Computational Statistics and Data Analysis, vol. 50, pp. 1113 1130, February 2006. [13] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. A. Nanavati, and A. Joshi, Social ties and their relevance to churn in mobile telecom networks, in EDBT 08: Proceedings of the 11th international conference on Extending database technology, (New York, NY, USA), pp. 668 677, ACM, 2008. [14] S. Hill, F. Provost, and C. Volinsky, Network-based marketing: Identifying likely adopters via consumer networks, Statistical Science, vol. 22, no. 2, pp. 256 275, 2006. [15] S. R. Garner, Weka: The waikato environment for knowledge analysis, in In Proc. of the New Zealand Computer Science Research Students Conference, pp. 57 64, 1995.