Quality Technology & Quantitative Management (QTQM), Vol. 12, No. 1, pp. 93-104, 2015. © ICAQM 2015

All Models are Wrong but Some are Useful: the Use of Predictive Analytics in Direct Marketing

Barry Leventhal*
BarryAnalytics Limited, London, UK
(Received September 2013, accepted February 2014)

Abstract: This paper discusses the use of predictive analytics techniques in direct marketing (DM) and presents the CRISP-DM process for data mining. It explains that DM models generally perform poorly in terms of model-fitting criteria, but illustrates how they are assessed, using gains analyses and lift charts, and how they are shown to have significant business value.

Keywords: Analytical model, CRISP-DM, data mining, direct marketing, gains analysis, lift chart, predictive analytics, ROC curve.

1. Introduction

This paper was inspired by Box's famous quote: "Essentially, all models are wrong, but some are useful" (Box and Draper [1]), because there can be no industry in which this maxim is truer or more appropriate than in DM. Yet, for the last forty years or more, marketing statisticians have been applying predictive analytics techniques and businesses have been reaping the benefits. Those business benefits can be substantial, even though the model might be considered very wrong. For this reason, marketing statisticians generally employ alternative model evaluation methods rather than traditional statistical diagnostics of accuracy. The main purpose of this paper is to explain how Box's quote applies in DM and to show how statisticians set out to identify models that will be wrong but useful.

Background to the Use of Models in DM

McCorkell and Holder [11] define DM as a process for communicating a message to individuals, with a view to obtaining a response. A variety of different media may be used to deliver these messages, including direct mail, telephone and email.
Therefore, by definition, DM is an ideal discipline for setting up tests or experiments to measure how alternative messages ("treatments") perform, which media are most effective and which audiences or segments are most responsive. As Webber [15] explains in a review of the evolution of DM, the practice originated in the late 19th century and has since continuously evolved to take advantage of technological advances. In recent years, new business processes have been introduced, including the centralisation of information into a single customer view and the transition to interacting with customers across multiple channels, principally online via the Internet

* Corresponding author. E-mail: barry@barryanalytics.com
and offline via branches, post and telephone. According to Webber, the release of statistical software packages such as SAS and SPSS, from the late 1960s onwards, meant that statisticians were able to use multivariate models to rank customers on a continuum from high to low, in terms of predicted behaviour. This is how the use of predictive analytics in DM began, and it has continued to the present day, albeit now using a wider range of technologies, analytical tools and techniques.

Why and How Are Predictive Models Used?

DM works by identifying groups of people who will be the best recipients for marketing messages. For example, some of the groups and messages could be:

- Non-customers who are most likely to be interested in the company's products and services, for offering a trial.
- Existing customers who are most likely to purchase additional products, for communicating cross-sell or up-sell promotions.
- Existing customers who may be thinking of switching to a competitor's product, for making a retention offer.

Each of these activities could form the basis for a separate marketing campaign, designed to recruit new customers, generate increased sales or reduce attrition (respectively). In each case, direct marketers are not really interested in model predictions for individual customers. Instead, it's the ability of a model to identify a group of, say, 20% of customers who are most likely to exhibit the target behaviour that's particularly important to the marketer. Table 1 contains further examples of DM applications of predictive analytics and the typical modelling techniques that are likely to be employed.

Table 1. Examples of DM applications of predictive analytics.

| Direct Marketing Application | Business Questions | Typical Modelling Techniques |
|---|---|---|
| Customer recruitment from a prospect database | Which prospects are most likely to purchase a product/service? | Decision Tree / Logistic Regression |
| Cross-sell / up-sell campaign | Which customers of Product X are most likely to purchase Product Y? Which customers of Product Z are most likely to purchase more of Product Z? | Decision Tree / Logistic Regression / Neural Net |
| Next best offer | Which product/service is each customer likely to purchase next? | Decision Tree / Logistic Regression / Neural Net |
| Customer retention | Which customers are most likely to lapse or attrite (or cease using a product)? | Decision Tree / Logistic Regression / Neural Net |
| Customer lifecycle management | How long before each customer becomes likely to lapse or attrite? | Survival analysis |
| Win-back campaign | Which past customers are most likely to respond to a win-back offer? | Decision Tree / Logistic Regression / Neural Net |
| Customer future value or lifetime value | What is the predicted future value of purchasing, or contributions to profit, for each customer? | Multiple regression |
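The scoring step behind these applications can be sketched in miniature. The model form shown is a logistic regression, as named in Table 1, but the coefficients, field names and customer records below are purely illustrative assumptions, not values from the paper:

```python
import math

# Hypothetical coefficients from a fitted logistic regression cross-sell
# model (illustrative values only): an intercept plus weights for tenure
# and past purchases of Product X.
INTERCEPT = -3.0
COEFFS = {"tenure_months": 0.02, "product_x_purchases": 0.45}

def propensity(customer):
    """Score one customer: logistic transform of the linear predictor."""
    z = INTERCEPT + sum(COEFFS[k] * customer[k] for k in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))

customers = [
    {"id": 1, "tenure_months": 60, "product_x_purchases": 5},
    {"id": 2, "tenure_months": 6,  "product_x_purchases": 0},
    {"id": 3, "tenure_months": 24, "product_x_purchases": 3},
    {"id": 4, "tenure_months": 48, "product_x_purchases": 1},
    {"id": 5, "tenure_months": 12, "product_x_purchases": 4},
]

# Rank customers by score and keep, say, the top 20% for the campaign.
ranked = sorted(customers, key=propensity, reverse=True)
top_20_pct = ranked[: max(1, len(ranked) // 5)]
print([c["id"] for c in top_20_pct])  # [1]
```

In practice the equation would come from fitting on historical outcome data, and the scores would be applied to the full customer base during deployment.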
The Use of Descriptive Models

It would be misleading to suggest that DM only employs predictive models: descriptive techniques are also used, such as multivariate analysis methods for customer segmentation. Segmentation enables a company to offer more tailored products and services to its customers, according to the needs and value of each segment. However, segmentation is generally seen as a more strategic tool, while predictive models are used in ways that are more tactical; therefore most day-to-day model development and deployment activities tend to involve predictive methods. For this reason, this paper mainly focuses on predictive models, which is where Box's maxim is most apposite.

2. How DM Models Are Developed and Deployed

As we have explained, the value of a model comes from its deployment to select a subset of customers to be targeted, yielding an improved return over other methods of selection. This process of developing and deploying DM models is one example of Data Mining, which may be defined more generally as: "A process of discovering and interpreting previously unknown patterns in data to solve business problems" (Kashner and Zaima [9]). In 1996, a consortium of companies jointly agreed on a standard set of phases and detailed tasks for data mining. This process, known as the Cross Industry Standard Process for Data Mining, or CRISP-DM (Chapman et al. [2]), describes the lifecycle of a data mining project as a series of six phases. The high-level process is illustrated in Figure 1, while the phases in the process are summarised in Table 2.

Figure 1. The Data Mining Process (CRISP-DM).
While all six phases are essential to the delivery of a data mining project, Data Preparation usually takes most time: it will typically account for 60% to 70% of the total work effort and is critical to the project's success. The following tasks are included in the Data Preparation phase:

(1) Data are extracted from one or more sources (e.g. from tables in a relational database), manipulated into a consistent format (e.g. summarised or aggregated), and joined together into a single file or table.
(2) Variables are profiled, examined for usefulness and transformed or recoded as required.
(3) New variables are derived in order to improve model performance.

The resulting analytic dataset is a file or table containing all the data to be used for a specific analysis and modelling project.

Table 2. Phases in the data mining process.

Business Understanding: This initial phase focuses on understanding the project objectives and requirements from a business perspective, then converting this knowledge into a data mining problem definition and a preliminary plan designed to achieve the objectives.

Data Understanding: The data understanding phase starts with an initial data collection and proceeds with activities to get familiar with the data, identify data quality problems, discover first insights into the data, or detect interesting subsets to form hypotheses about hidden information.

Data Preparation: The data preparation phase covers all activities to construct the final dataset (the data that will be fed into the modelling tool(s)) from the initial raw data. Data preparation tasks are likely to be performed multiple times, and not in any prescribed order. Tasks include table, record, and attribute selection as well as transformation and cleaning of data for modelling tools.

Modelling: In this phase, various modelling techniques are selected and applied, and their parameters are calibrated to optimal values. Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data, so stepping back to the data preparation phase is often needed.

Evaluation: At this stage in the project you have built a model (or models) that appears to have high quality, from a data analysis perspective. Before proceeding to final deployment of the model, it is important to evaluate it more thoroughly, and to review the steps executed to construct it, to be certain it properly achieves the business objectives. A key objective is to determine whether some important business issue has not been sufficiently considered. At the end of this phase, a decision on the use of the data mining results should be reached.

Deployment: Creation of the model is generally not the end of the project. Even if the purpose of the model is to increase knowledge of the data, the knowledge gained will need to be organised and presented in a way that the end-client can use. Depending on the requirements, the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process. In many cases it will be the end-client, not the data analyst, who carries out the deployment steps. However, even if the analyst will carry out the deployment, it is important for the end-client to understand up front what actions will need to be carried out in order to actually make use of the created models.

Source: Modified from http://www.crisp-dm.org/process/index.htm
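The Data Preparation tasks (1)-(3) described above can be sketched as a miniature example: two hypothetical source tables are aggregated, joined and enriched with a derived variable (all table and field names are illustrative):

```python
from collections import defaultdict

# Two illustrative source tables, as might be extracted from a
# relational database.
customers = [
    {"cust_id": 1, "region": "North"},
    {"cust_id": 2, "region": "South"},
]
transactions = [
    {"cust_id": 1, "amount": 120.0},
    {"cust_id": 1, "amount": 80.0},
    {"cust_id": 2, "amount": 40.0},
]

# Task (1): aggregate transactions to one row per customer.
totals = defaultdict(float)
counts = defaultdict(int)
for t in transactions:
    totals[t["cust_id"]] += t["amount"]
    counts[t["cust_id"]] += 1

# Tasks (1)+(3): join onto the customer table and derive a new variable.
analytic_dataset = []
for c in customers:
    cid = c["cust_id"]
    row = dict(c)
    row["total_spend"] = totals[cid]
    row["n_transactions"] = counts[cid]
    row["avg_spend"] = totals[cid] / counts[cid] if counts[cid] else 0.0
    analytic_dataset.append(row)

print(analytic_dataset[0]["avg_spend"])  # 100.0
```

Task (2), profiling and recoding variables, would then be applied to the resulting single table before modelling.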
Another key feature of CRISP-DM is the separation of the Modelling and Deployment phases; this has several implications:

(1) The modelling phase is likely to involve analysis of a representative sample from the database (typically up to 100,000 records). There are several reasons for preferring to use a sample for model development, rather than the entire customer base (which may contain millions of records). First, some analytical techniques are computationally intensive and so, while model precision increases with sample size, there is little to be gained once the size exceeds 100,000 cases. More importantly, modelling on an excessive sample size is likely to produce an over-fitted model which will fail to be useful when put into practice.
(2) Prior to model-building, it is usual for the dataset to be split into two or three development and testing/validation sub-samples, so that the results can be assessed on data that were not used for fitting the model.
(3) In most business applications of data mining, the modelling phase takes place only once (until the model next needs to be updated); however, the deployment phase is often carried out on a regular basis, e.g. to update customers' propensity scores each month.

Given these differences between Modelling and Deployment, the user may decide to deploy separate software solutions for these parts of the process. For example, a logistic regression model might be developed using statistical software, in order to obtain a regression equation. The calculation of this equation could then be programmed for application to the company's database, in order to generate model scores for all customers.
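Point (2) above, the development/validation split, might be sketched as follows; the 70/30 split proportion and fixed seed are illustrative choices:

```python
import random

# Reproducible random split of a modelling sample into development and
# validation sub-samples.
random.seed(42)

sample = list(range(100_000))   # stand-in for 100,000 sampled customer records
random.shuffle(sample)

split = int(len(sample) * 0.7)  # e.g. 70% development / 30% validation
development, validation = sample[:split], sample[split:]

print(len(development), len(validation))  # 70000 30000
```

The model is then fitted on the development records only, and its gains analysis is checked on the held-out validation records.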
Alternatively, and preferably (in order to save time and avoid programming errors), an automated method of communicating models would be used. For example, a machine-readable method of model definition, known as the Predictive Model Markup Language (PMML), has been developed for this purpose (see: http://www.dmg.org/). Evaluation includes assessing the model against criteria that were agreed at the outset, during the Business Understanding phase. This involves producing a gains analysis and lift charts to predict the improvement in targeting that will result from deploying the model; the importance of these measures is discussed later in this paper. The final phase of the process, Deployment, puts the model into action, typically by contacting a group of individuals selected according to their model scores, as part of a marketing campaign. The deployment stage is used to test whether a model is useful and to measure its benefit, by including a control group that was randomly selected from the same population. Then, by comparing response rates to the campaign for the model group vs. the control group, the actual value of the model can be derived. One of the key strengths of DM is that it involves capturing responses and measuring performance. Therefore, even after a model has been proven to be useful, its deployment should continue to be monitored by using a control group as part of each execution of the campaign. This monitoring is necessary in order to identify when the model ceases to deliver improvements in targeting, which could occur if the market starts to change.

Software Options for Data Mining

Many different products are available for carrying out data mining; these primarily fall into two groups: statistics software and data mining packages. Statistics software packages are useful for tasks such as data manipulation and transformation, analysis and model development.
Some examples of current packages used are IBM SPSS Statistics, MATLAB, Minitab, R, S-PLUS, SAS, Stata, Statistica and SYSTAT.
Similar facilities are provided by data mining packages; however, these products are designed to speed up the process by automating some of the operations. For example, users of DM may hold hundreds of variables summarising the behaviour of their customers, so simple tasks such as profiling and transforming variables can become very time-consuming when applied in a large-scale data warehouse environment. The workload is compounded when there are large numbers of marketing models to be built, e.g. for all of the company's products and all regions and segments which the company serves. In these situations, data mining packages can be useful to speed up the data preparation tasks or to carry out model development as an automated process. Some examples of data mining packages used are FICO Model Builder, IBM Smart Analytics System, IBM SPSS Modeler, KnowledgeSTUDIO, KXEN Analytic Framework, Oracle Data Mining, Portrait Customer Analytics, RapidMiner, SAS Enterprise Miner, SQL Server Analysis Services, Teradata Warehouse Miner and TIBCO Spotfire Miner. As companies start to use more and more models, issues such as version control, scheduling model runs and model performance monitoring become increasingly taxing. This has led software vendors to start offering model management tools to aid the heavy users. Current products in this space include IBM Predictive Enterprise Services, KXEN Modelling Factory, SAS Model Manager and Teradata Model Manager.

How Well Do DM Models Work?

In terms of measures such as R²-type statistics, predictive models are universally poor at predicting DM outcomes. In scientific disciplines, researchers might reasonably expect a model R² to exceed 50%, and so be satisfied that the majority of variation has been explained. However, marketing statisticians often have to accept R² values of less than 10%. To paraphrase Box, their models are very wrong, but could they still be useful?
The main reason why marketing models suffer from very low R² values is that they are trying to predict future decisions, e.g. purchase or attrition during the next business cycle, using past behaviour and decisions, but usually without knowledge of people's actual plans or intentions. Typically, the information that's stored in a customer database covers subject areas such as:

- Customer contact details.
- Details of products and services that have been purchased in the past.
- Details of purchasing/usage behaviour.
- Responses to promotions received.

Therefore, marketing models are usually unable to include factors such as customers' current needs and interests, or existing market conditions. The outcome is often poorer model performance, high error variances or noise, and low R² values (or equivalent statistics). One exception occurs where the database is able to identify customer behaviours or events which indicate that a new purchase, or a switch to a competitor's product, has become likely. For example, a loan provider might discover that a customer is considering a change when they make an enquiry about the outstanding balance or exit terms. If such enquiries are recorded in the database, then they can be used as predictive variables in an event-based attrition model.
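The event-based predictor described above might be derived as follows; the enquiry records, reference date and 90-day window are illustrative assumptions:

```python
from datetime import date

# Derive an event-based predictor for an attrition model: flag customers
# who made a balance or exit-terms enquiry within the last 90 days.
AS_OF = date(2014, 1, 1)  # illustrative scoring date

enquiries = [
    {"cust_id": 1, "type": "balance",    "date": date(2013, 12, 15)},
    {"cust_id": 2, "type": "exit_terms", "date": date(2013, 6, 1)},
]

def recent_enquiry_flag(cust_id, window_days=90):
    """1 if the customer enquired within the window, else 0."""
    return int(any(
        e["cust_id"] == cust_id and (AS_OF - e["date"]).days <= window_days
        for e in enquiries
    ))

print(recent_enquiry_flag(1), recent_enquiry_flag(2), recent_enquiry_flag(3))  # 1 0 0
```

The flag would then join the analytic dataset as one more candidate predictor alongside the behavioural variables listed above.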
Could DM Models Still Be Useful?

A more common way to examine and demonstrate the accuracy of DM models is via gains analyses and lift charts; the latter are very similar to Receiver Operating Characteristic (ROC) curves (see, for example, Fawcett [3]). These more descriptive methods tend to be preferred over R² statistics for several reasons:

(1) Gains analyses and lift charts may be grasped and interpreted by people with no knowledge of statistics, whereas the model R² does require statistical understanding.
(2) The results of these analyses have direct implications for the business, whereas the R² value does not.
(3) When squared, a small positive correlation between actual and predicted outcomes gives a very low R²; however, the gains/lift chart will demonstrate the model's usefulness.

As an example of the last point, suppose that a binary classifier has been developed to predict a negative or positive outcome; the resultant relationship between the actual and predicted values is shown in Table 3.

Table 3. Example for binary classifier.

| Actual   | Predicted positive | Predicted negative | Total       |
|----------|--------------------|--------------------|-------------|
| positive | 300 (60%)          | 200 (40%)          | 500 (100%)  |
| negative | 200 (40%)          | 300 (60%)          | 500 (100%)  |
| total    | 500 (50%)          | 500 (50%)          | 1000 (100%) |

The correlation (R) may be calculated directly from the cell counts in Table 3, as the phi coefficient = (300² − 200²) / √(500 × 500 × 500 × 500) = 0.2. Therefore the R² value for this classifier is 0.04. However, from Table 3 it can be seen that the classifier increases the proportion of positives identified from 50% to 60%. In other words, the model picks up 20% more positives than a random selection of the same size (300 vs. 250). This might be deemed a useful targeting tool by the marketer, but would probably have been regarded as poor on R² grounds alone.
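The phi coefficient and R² for Table 3 can be verified directly from the cell counts, along with the binomial effect size display (BESD) rates 0.5 ± R/2 that correspond to the 60%/40% rows of the table:

```python
import math

# Phi coefficient for the 2x2 classifier in Table 3:
# phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d))
a, b = 300, 200   # actual positives: predicted positive / predicted negative
c, d = 200, 300   # actual negatives: predicted positive / predicted negative

phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
r_squared = phi ** 2   # the "very wrong" goodness-of-fit statistic

# BESD: with equal group sizes, the predicted target group's success rate
# is 0.5 + R/2 and the predicted non-target group's is 0.5 - R/2.
target_rate = 0.5 + phi / 2      # 60%, as in Table 3
non_target_rate = 0.5 - phi / 2  # 40%, as in Table 3

print(phi, r_squared, target_rate, non_target_rate)
```

So an R² of only 0.04 still corresponds to lifting the success rate in the targeted half from 50% to 60%.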
More generally, Randolph and Edmondson [13] discuss the use of the binomial effect size display (BESD) to present the magnitude of effect sizes to an evaluation audience, and discuss the wider issues in making model statistics more meaningful to non-statistical users. For the above situation of two groups with equal sample sizes and homogeneous variances, they show that the predicted target group will have success rate (0.5 + R/2), where R is the correlation (given by the phi coefficient). For the predicted non-target group, the predicted success rate will be (0.5 − R/2). Therefore the targeting success rate depends linearly on R, unlike the traditional goodness-of-fit measure, which depends on R².

3. Evaluation of Model Performance Using Gains Analysis and Lift Charts

A gains analysis presents numbers of customers and positive/negative outcome rates analysed by model score bands, in order to examine how outcomes are distributed across model scores. Depending on the total sample size, the analysis will typically split the predicted model scores into around 10 to 20 bands based on quantiles. An example of a gains analysis is shown in Table 4, for a model to predict lapsing of a (hypothetical) financial product.
Separate gains analyses are produced for the model development and validation subsets; the validation sample results give the best indication of how the model is likely to perform in the future, if market conditions remain unchanged. However, the comparison with the development sample results is also useful, as a method of detecting over-fitting, i.e. a model that discriminates much better on the development file than on the validation file.

Table 4. Example gains analysis (validation sample).

| Model score band | Non-lapsers: Number | % of non-lapsers | Cum. % of non-lapsers | Lapsers: Number | % of lapsers | Cum. % of lapsers | All customers: Number | % of customers | Cum. % of customers | Index |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 (highest) | 2990 | 2.8% | 2.8% | 1030 | 21.2% | 21.2% | 4020 | 3.6% | 3.6% | 583.4 |
| 2 | 3720 | 3.5% | 6.3% | 840 | 17.3% | 38.5% | 4560 | 4.1% | 7.8% | 496.3 |
| 3 | 4050 | 3.8% | 10.2% | 450 | 9.3% | 47.7% | 4500 | 4.1% | 11.8% | 403.9 |
| 4 | 4180 | 4.0% | 14.1% | 380 | 7.8% | 55.6% | 4560 | 4.1% | 15.9% | 348.5 |
| 5 | 4130 | 3.9% | 18.0% | 400 | 8.2% | 63.8% | 4530 | 4.1% | 20.0% | 318.4 |
| 6 | 3970 | 3.8% | 21.8% | 370 | 7.6% | 71.4% | 4340 | 3.9% | 24.0% | 298.0 |
| 7 | 4340 | 4.1% | 25.9% | 110 | 2.3% | 73.7% | 4450 | 4.0% | 28.0% | 263.3 |
| 8 | 4550 | 4.3% | 30.2% | 140 | 2.9% | 76.5% | 4690 | 4.2% | 32.2% | 237.6 |
| 9 | 3890 | 3.7% | 33.9% | 120 | 2.5% | 79.0% | 4010 | 3.6% | 35.8% | 220.5 |
| 10 | 4370 | 4.1% | 38.0% | 90 | 1.9% | 80.9% | 4460 | 4.0% | 39.9% | 202.8 |
| 11 | 4400 | 4.2% | 42.1% | 120 | 2.5% | 83.3% | 4520 | 4.1% | 44.0% | 189.6 |
| 12 | 4780 | 4.5% | 46.7% | 80 | 1.6% | 85.0% | 4860 | 4.4% | 48.3% | 175.8 |
| 13 | 3650 | 3.4% | 50.1% | 80 | 1.6% | 86.6% | 3730 | 3.4% | 51.7% | 167.5 |
| 14 | 4480 | 4.2% | 54.3% | 70 | 1.4% | 88.1% | 4550 | 4.1% | 55.8% | 157.7 |
| 15 | 4860 | 4.6% | 58.9% | 130 | 2.7% | 90.7% | 4990 | 4.5% | 60.3% | 150.4 |
| 16 | 4230 | 4.0% | 62.9% | 120 | 2.5% | 93.2% | 4350 | 3.9% | 64.3% | 145.0 |
| 17 | 3070 | 2.9% | 65.8% | 30 | 0.6% | 93.8% | 3100 | 2.8% | 67.1% | 139.9 |
| 18 | 5270 | 5.0% | 70.8% | 80 | 1.6% | 95.5% | 5350 | 4.8% | 71.9% | 132.8 |
| 19 | 4350 | 4.1% | 74.9% | 60 | 1.2% | 96.7% | 4410 | 4.0% | 75.9% | 127.4 |
| 20 | 4390 | 4.1% | 79.1% | 60 | 1.2% | 97.9% | 4450 | 4.0% | 79.9% | 122.6 |
| 21 | 2110 | 2.0% | 81.1% | 0 | 0.0% | 97.9% | 2110 | 1.9% | 81.8% | 119.7 |
| 22 | 6630 | 6.3% | 87.3% | 50 | 1.0% | 99.0% | 6680 | 6.0% | 87.9% | 112.7 |
| 23 | 3370 | 3.2% | 90.5% | 10 | 0.2% | 99.2% | 3380 | 3.1% | 90.9% | 109.1 |
| 24 | 5270 | 5.0% | 95.5% | 10 | 0.2% | 99.4% | 5280 | 4.8% | 95.7% | 103.9 |
| 25 (lowest) | 4750 | 4.5% | 100.0% | 30 | 0.6% | 100.0% | 4780 | 4.3% | 100.0% | 100.0 |
| Total | 105800 | 100.0% | | 4860 | 100.0% | | 110660 | 100.0% | | |

Overall lapse rate: 4.39%.

In practice, we find that the evaluation is most realistic if the gains analysis is produced for a separate period of time, as well as for a separate sample of customers. The lift charts display some of the gains analysis results graphically, making it more straightforward to compare model performance for development and validation sub-samples. Two different formats are often useful: cumulative index (see Figure 2a) and cumulative percentage of responses captured (see Figure 2b). The latter chart is equivalent to a ROC curve; the relationship between these is explained by Vuk and Curk [14]. If there is a known cost of promoting to each individual, for example the cost of printing and posting a catalogue in a direct mail campaign, then the gains analysis can be used to compare the promotion costs for the top score bands against the corresponding costs that would be incurred to reach the same number of responders without the aid of a model.
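The cumulative percentages and index values in Table 4 can be recomputed directly from the band counts; for each contact depth, the index is the cumulative share of lapsers divided by the cumulative share of customers, times 100:

```python
# Band counts taken from Table 4 (bands 1 to 25).
customers_per_band = [4020, 4560, 4500, 4560, 4530, 4340, 4450, 4690, 4010,
                      4460, 4520, 4860, 3730, 4550, 4990, 4350, 3100, 5350,
                      4410, 4450, 2110, 6680, 3380, 5280, 4780]
lapsers_per_band = [1030, 840, 450, 380, 400, 370, 110, 140, 120, 90, 120,
                    80, 80, 70, 130, 120, 30, 80, 60, 60, 0, 50, 10, 10, 30]

total_customers = sum(customers_per_band)  # 110660, as in Table 4
total_lapsers = sum(lapsers_per_band)      # 4860, as in Table 4

indices = []
cum_cust = cum_laps = 0
for n_cust, n_laps in zip(customers_per_band, lapsers_per_band):
    cum_cust += n_cust
    cum_laps += n_laps
    indices.append((cum_laps / total_lapsers) / (cum_cust / total_customers) * 100)

print(round(indices[0], 1))   # band 1 index, ~583.4 as in Table 4
print(round(indices[-1], 1))  # 100.0 at full depth, by construction
```

An index of 583 for band 1 means the top 4% of scores contain nearly six times their fair share of lapsers, which is exactly what the cumulative-index lift chart plots.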
[Figure 2a: lift chart on cumulative index, by score band, for development (Dev.) and validation (Val.) samples.]
[Figure 2b: lift chart on cumulative percentage of lapsers identified by the model, by score band, for development and validation samples.]

Figure 2. Example lift charts.

For example, based on Table 4, if the company wanted to reach 80% of future lapsers with a retention mailshot, it could achieve this by selecting the 39.9% of customers in score bands 1 to 10. Without the model (or any other method of targeting), it would have to contact 80% of customers. Therefore, the model halves the number of contacts that need to be made, and so saves significant costs. Finally, if the average contribution value of each positive response is known or can be estimated, then a notional total profit or loss can be calculated for each score band in the gains analysis. This value can be used to determine which score bands should be targeted in future campaigns, according to the business objectives. For example, the business might wish to send messages to the largest number of people and break even overall, or it might wish to limit the campaign to the profitable score bands, or maximise the profitability of this campaign. Therefore, the gains analysis can be a very useful tool for helping marketers to plan their future campaigns.
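The notional profit calculation just described can be sketched using the band counts from Table 4 and illustrative economics of 1 Euro cost per contact and 15 Euros of expected value per future lapser reached:

```python
COST_PER_CONTACT = 1.0   # illustrative cost per contact (Euros)
VALUE_PER_LAPSER = 15.0  # illustrative expected value per lapser reached (Euros)

# Band counts taken from Table 4 (bands 1 to 25).
customers_per_band = [4020, 4560, 4500, 4560, 4530, 4340, 4450, 4690, 4010,
                      4460, 4520, 4860, 3730, 4550, 4990, 4350, 3100, 5350,
                      4410, 4450, 2110, 6680, 3380, 5280, 4780]
lapsers_per_band = [1030, 840, 450, 380, 400, 370, 110, 140, 120, 90, 120,
                    80, 80, 70, 130, 120, 30, 80, 60, 60, 0, 50, 10, 10, 30]

# Cumulative profit at each contact depth: value of lapsers reached so far
# minus the cost of all contacts made so far.
cum_profit = []
cum_cust = cum_laps = 0
for n_cust, n_laps in zip(customers_per_band, lapsers_per_band):
    cum_cust += n_cust
    cum_laps += n_laps
    cum_profit.append(VALUE_PER_LAPSER * cum_laps - COST_PER_CONTACT * cum_cust)

# Most profitable contact depth (1-based band number).
best_depth = cum_profit.index(max(cum_profit)) + 1
print(best_depth, cum_profit[best_depth - 1])  # 6 25540.0
```

Under these assumptions, profit peaks at a depth of six bands and the cumulative position passes through roughly zero at fifteen bands, so the score band at which to stop contacting follows directly from the business objective.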
Continuing the above example, suppose that the cost per contact is 1 Euro, and the expected value of reaching each future lapser is 15 Euros, which allows for the fact that not all lapsers will be persuaded to stay with the company. Then, using Table 4, we can estimate the cumulative profitability of a retention campaign, which is shown in Figure 3. We see that, by contacting customers in the top 15 bands, the campaign will break even overall. From Table 4, we see that this would involve contacting 60% of customers and would reach 90% of lapsers. Alternatively, in order to maximise profitability, only the first 6 bands would be targeted, which would result in contacting 24% of customers and reaching 71% of lapsers.

Figure 3. Predicted cumulative profit/loss of a targeted business retention campaign.

Impact of the Internet and Big Data

As Webber [15] points out, the Web has transformed DM in numerous ways, and this has consequently impacted the ways in which predictive analytics gets used, the tools and techniques that are most effective and the information available for data mining. The principal change is that the Internet provides a much lower cost channel for communicating offers, via email at virtually zero cost or via online browsing. This shift implies that companies no longer need to identify which subset of customers they should be targeting, because they can easily contact all customers (for whom they hold email addresses); instead, they need to know which product to offer each individual in order to maximise the overall return. This represents a significant change in question for predictive analytics, resulting in the emergence of new techniques such as collaborative filtering systems. Many companies continue to operate both offline and online marketing, in order to serve those customers who still transact in bricks-and-mortar shops and branches.
Some customers may browse for products online and then purchase offline, either in person or by
telephone. This switching behaviour presents new challenges for predictive analytics, first and foremost the need for data on each customer that spans both online and offline channels. Integrated data are required in order to identify the communications that a customer receives and relate them to the purchases they have made. Therefore, data integration has become an important issue in multi-channel marketing. Multi-channel marketing also leads to new problems, such as the question of how to quantify the relative effects of all the different marketing media employed by a company. For example, if a company is investing in a website for viewing and purchasing products, email marketing for communicating offers, and online and offline advertising, then it would want to assess how much each activity is contributing to total sales. This has resulted in an increased focus on media attribution modelling in recent years. Finally, the growth of the Web has been one of the factors contributing to the Big Data phenomenon in recent years. As yet, there is no consensus on how to define Big Data; however, according to McKinsey [12], "Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse." And, according to Franks [4], "Big" does not just refer to the volume of data: Big Data also has increased velocity (the rate at which data are transmitted and received), complexity and variety, compared with traditional data sources. Big Data developments are not limited to the Web or to the DM sector; they occur in other information types and sectors that are being driven by advances in data capture technologies. Other examples include telematics data for car insurance, smart metering by utilities and genome data for cancer research.
So there may be lessons to be learnt, and benefits to be gained, from researchers in different industries sharing the approaches being used for big data analytics. Franks [4] discusses some of the opportunities for applying analytics to these new sources. In DM, web data on browsing behaviour could indicate a customer's current interests or purchase intentions, and so could help marketers to communicate suitable offers at the right time. Hence there is considerable interest in Big Data currently, while still recognising the need to continue using existing sources of customer data. Developments in data mining may be tracked via websites such as KDnuggets [10]. Industry best practices, case studies and advances may be followed in a variety of sources, such as the journals of Direct, Data and Digital Marketing Practice [6], Financial Services Marketing [7] and Marketing Analytics [8], and the forthcoming Applied Marketing Analytics [5].

4. Conclusions

Predictive analytics in DM operates on the basis that all models are wrong, but some are useful. Models are developed using a data mining process that's designed to assess whether a model will be useful to the business, employing evaluation measures based on gains or value generated, in preference to statistical diagnostics. DM is an ideal medium for conducting tests and experiments; industry best practice is to continue monitoring a model throughout its deployment life, in order to identify when its usefulness declines and the next model development needs to be considered. Changing market conditions and innovations in information sources, such as Big Data, imply that the use of predictive analytics will constantly evolve and advance, for as long as businesses continue to operate segmented marketing to consumers.
References

1. Box, G. E. P. and Draper, N. R. (1987). Empirical Model Building and Response Surfaces, John Wiley & Sons, New York.
2. Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C. and Wirth, R. (2000). CRISP-DM 1.0: Step-by-step data mining guide. Available from: http://www.the-modeling-agency.com/crisp-dm.pdf.
3. Fawcett, T. (2004). ROC Graphs: Notes and Practical Considerations for Researchers. Tech Report HPL-2003-4, HP Laboratories. Available: http://binf.gmu.edu/mmasso/ROC101.pdf.
4. Franks, B. (2012). Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics, John Wiley & Sons, Hoboken, New Jersey.
5. Journal of Applied Marketing Analytics, published by Henry Stewart Publications.
6. Journal of Direct, Data and Digital Marketing Practice, published by Palgrave Macmillan.
7. Journal of Financial Services Marketing, published by Palgrave Macmillan.
8. Journal of Marketing Analytics, published by Palgrave Macmillan.
9. Kashner, J. and Zaima, A. (2007). A data mining primer for the data warehouse professional. The Data Administration Newsletter. Available: http://www.tdan.com/view-articles/5827.
10. KDnuggets, website: http://www.kdnuggets.com/.
11. McCorkell, G. and Holder, D. (2006). What do we mean by direct, data and digital marketing? The IDM Marketing Guide, 1, 1.1.1-1.1.34, Institute of Direct Marketing, Teddington, UK.
12. McKinsey Global Institute (2011). Big Data: The Next Frontier for Innovation, Competition and Productivity. Available: http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation.
13. Randolph, J. J. and Edmondson, R. S. (2005). Using the binomial effect size display (BESD) to present the magnitude of effect sizes to the evaluation audience. Practical Assessment, Research & Evaluation, 10(14), 1-7.
14. Vuk, M. and Curk, T. (2006). ROC curve, lift chart and calibration plot. Metodoloski zvezki, 3(1), 89-108.
15. Webber, R. (2013). The evolution of direct, data and digital marketing. Journal of Direct, Data and Digital Marketing Practice, 14, 291-309.

Author's Biography:

Barry Leventhal is a marketing statistician who runs an independent analytics consultancy based in the UK. Previously, Barry was Director of Advanced Analytics for Teradata (UK). Prior to that, he held statistical roles in a customer management consultancy, a market analysis company and a market research agency. He holds a BSc and a PhD from University College London, and a diploma in Computer Science from Cambridge University. He is a fellow of the Royal Statistical Society, the Market Research Society and the Institute of Direct Marketing. He chairs the Census & Geodemographics Group, which is an MRS advisory board, and serves on the executive board of the IDM journal.