ADVANCED MARKETING ANALYTICS:




ADVANCED MARKETING ANALYTICS: MARKOV CHAIN MODELS IN MARKETING a whitepaper presented by:

CONTENTS

EXECUTIVE SUMMARY
INTRODUCTION
OVERVIEW OF MARKOV CHAIN MODELS
SAMPLE MARKOV CHAIN ANALYSES
MORE STATE SPACE EXAMPLES
ASSUMPTIONS AND LIMITATIONS
ADVANCED MARKOV CHAIN CONCEPTS
SUMMARY

EXECUTIVE SUMMARY

Marketing strategists and analysts constantly need to improve their analytic capabilities. Keeping up with changing markets and evolving customer behavior is a difficult task. The incremental value gained from learning and adopting more advanced analysis and modeling techniques can positively affect the bottom line in ways only now being understood in the marketing community. Techniques that have been used successfully for years in other domains are just now being reinvented by marketers.

In this whitepaper, we discuss Markov chains and their great potential for use in marketing analysis and modeling. The topic is presented at a high level and with little need for mathematical know-how. The power and flexibility of these models should become evident to both strategists and analysts.

LITYX, LLC

INTRODUCTION

Markov chain models are a well-understood and well-developed modeling technique in mathematics and statistics. They have been in use for many years across a wide range of disciplines. In the marketing domain, however, they have yet to come of age, despite their potential for better understanding and predicting the behavior of customers, products, or markets. In this whitepaper, we give an overview of Markov chain models and describe some examples of their potential in marketing and customer management.

OVERVIEW OF MARKOV CHAIN MODELS

A Markov chain is a mathematical model that describes the pattern of changes in behavior over time. It belongs to the class of stochastic models, so called because they account for randomness in behavior. Markov chain models range from very simple and easy to understand to extremely complex models of real-world behavior. How much complexity to build into a Markov chain model depends on many factors, such as the assumptions one is willing to make, the amount of data available for building the model, and the skills of the modeling team. Even the simplest Markov chain models can often perform quite well, and the incremental gains from moving to more complex versions may not always be worth the effort (although in some cases the extra effort can pay off tremendously). We will discuss both simple and complex models, but begin with an overview of the basic Markov chain framework.

First, let's be clear about what we are modeling. The behavior we want to understand can take many forms, but in marketing it is typically customer behavior, product behavior, or market behavior. Our examples focus on customer behavior; the extensions to product or market behavior follow the same pattern.

In a Markov chain model, time is broken into discrete intervals, such as hours, days, weeks, months, quarters, or years. Specifically, this is called a discrete-time Markov chain model (as compared to continuous-time models, which we will briefly discuss later). At any particular point in time, say time t, an individual customer can be in one of a pre-defined set of possible states. For example, we can define the states as the total number of products the customer holds at that time (e.g., financial services products, or wireless products). A customer is in state 0 at time t if they have no products at that time (i.e., they are a prospect or recent churner), in state 1 if they have one product, in state 2 if they have two products, and so on.

The set of all possible states must be well-defined ahead of time; this set is called the state space. In the above example, we might define the state space as {0, 1, 2, 3, 4+}. The Markov chain defined in this way has five states, where state 4+ refers to a customer who has four or more products at a point in time (since it might be rare for any customer to have that many products, we combine them all into one state, "4 or more").

To continue the example, let's say that a time period is a month. One question is then: if a customer is in state 1 this month (i.e., currently holds one product), what state will they be in next month (i.e., how many products will they hold)? A Markov chain models a customer's behavior through time by describing the probability that she will move from one state to another, or stay in the same state, from one time period to the next. The movement of time from one period to the next is called a time step, and a customer's progression from one state to another over that time step is called a transition (even if the state does not change in that time step, we still call it a transition).

In our example, there may be a high probability that the customer remains in state 1 next month (see accompanying diagram). There is some probability, hopefully small, that the customer will transition to state 0.
This would represent the likelihood that this customer will no longer have a relationship with our company one month from now. There is also some probability that the customer will transition to state 2, or possibly even 3 or 4+, by next month. These are transitions we would typically want to encourage, since they mean the customer is purchasing new products and developing a deeper, more loyal relationship.

SOME POSSIBLE CUSTOMER TRANSITIONS NEXT MONTH
(customer is in state 1 this month)

  State next month    Probability
  0                   .02
  1                   .91
  2                   .05
  3                   .02
  4+                  0

[Figure: STATE SPACE DIAGRAM WITH ALL POSSIBLE TRANSITIONS among states 0, 1, 2, 3, 4+]

The probabilities assigned to the possible transitions from one state to another are called transition probabilities. The model describes all possible state-to-state transitions that can occur in a single time step. To do so, we collect all transition probabilities into what is called the transition matrix. This matrix includes a row and a column for each state in the state space. The rows represent the state the customer is in currently, and the columns represent the state for that customer at the next time step.

TRANSITION MATRIX FOR 5-STATE MARKOV CHAIN EXAMPLE
(rows: current state; columns: state at next time step)

          0      1      2      3      4+
  0      0.95   0.04   0.01   0      0
  1      0.02   0.91   0.05   0.02   0
  2      0.01   0.06   0.90   0.02   0.01
  3      0.01   0.03   0.06   0.85   0.05
  4+     0      0.02   0.02   0.06   0.90

A probability in the matrix is interpreted as follows. Take for example the 0.06 value in the third row (current state 2) and the second column (next state 1). This is the probability that a customer who is currently in state 2 will transition to state 1 by next month. Notice that the probabilities in any row add to 1.0, since we have accounted for all possible transitions. A customer must transition to some state in the next time period, even if it means staying in the same state.

Where do the probabilities in the matrix come from? The values represent the real behavior of customers, gleaned from historical customer data. For this particular example, a fairly simple one, we would extract customer data from the previous year (or two, or three). For each customer in this extract, we would observe their actual path from state to state over this time period. These paths, pooled across customers, are converted into a transition matrix: each entry is the observed share of transitions out of a given state that ended in a given next state.

Depending on how the organization has stored historical customer and transactional data, this may or may not be simple. The customer data files will not contain a string of visited states for each customer. Typically the data exists in the form of an ACCOUNT OPEN DATE or ACCOUNT CLOSE DATE, and the state a customer was in at the earliest date in the extract has to be determined (see accompanying figure). In any case, an appropriate extract should be available that allows reconstruction of state transitions from one period to the next for each customer or for a sample of customers.

CONVERTING RAW HISTORICAL CUSTOMER DATA

  Customer ID   Transaction Date   Transaction Type
  003482        04/25/05           OPEN
  003482        08/12/05           OPEN
  003482        11/02/05           CLOSE

Suppose this customer started with 2 products at the beginning of 2005. The states visited by this customer during 2005 would be the following sequence (states are determined at the beginning of each month):

  2  2  2  2  3  3  3  3  4+  4+  4+  3

SAMPLE MARKOV CHAIN ANALYSES

Now that we have discussed how to define a Markov chain model and determine the transition matrix, let's review some simple analyses that can be performed. This is not a complete listing by any means; it is only meant to generate ideas and convey the wide range of analysis possibilities. The uses will also vary depending on the particular business problem and the complexity of the model.

ANALYSIS 1: A SINGLE PERIOD ANALYSIS. We can use the Markov chain model to analyze likely events over a single period. This is perhaps the simplest form of Markov chain analysis. For example, we can calculate an overall expected one-month churn rate.
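The data conversion step described above (raw OPEN and CLOSE transactions, then monthly states, then a transition matrix) can be sketched in Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function names, the tuple layout of the transactions, and the choice to leave unobserved rows as zeros are all our own, and a real extract would pool many customers rather than one.

```python
from collections import Counter
from datetime import date

STATES = ["0", "1", "2", "3", "4+"]

def state_label(n_products):
    """Map a product count to a state label, capping at '4+'."""
    return "4+" if n_products >= 4 else str(n_products)

def monthly_states(start_products, transactions, year):
    """State at the beginning of each month of `year`.

    `transactions` is a list of (date, 'OPEN' | 'CLOSE') tuples (assumed layout).
    A transaction affects a month's state only if it occurred before
    the first day of that month.
    """
    states = []
    for month in range(1, 13):
        cutoff = date(year, month, 1)
        n = start_products
        for when, kind in transactions:
            if when < cutoff:
                n += 1 if kind == "OPEN" else -1
        states.append(state_label(n))
    return states

def estimate_transition_matrix(sequences):
    """Row-normalised transition counts pooled over all customer sequences."""
    counts = Counter()
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[(current, nxt)] += 1
    matrix = {}
    for s in STATES:
        row_total = sum(counts[(s, t)] for t in STATES)
        # A state never observed as a starting point gets an all-zero row here.
        matrix[s] = {t: (counts[(s, t)] / row_total if row_total else 0.0)
                     for t in STATES}
    return matrix

# Customer 003482 from the example: 2 products at the start of 2005.
txns = [
    (date(2005, 4, 25), "OPEN"),
    (date(2005, 8, 12), "OPEN"),
    (date(2005, 11, 2), "CLOSE"),
]
seq = monthly_states(2, txns, 2005)
print(seq)

# With a single customer the estimate is only a toy; real use pools many sequences.
matrix = estimate_transition_matrix([seq])
print(matrix["2"])
```

The printed sequence reproduces the twelve monthly states listed in the example. The estimated rows become meaningful only once sequences from a large sample of customers are passed in together.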
Let's say it is currently September 1, and we are conducting a churn analysis looking forward one month to October. First, at the start of the current period, September, we need an accounting of the percentage of customers currently in each of the states we have defined (ignoring state 0, which represents non-customers). We can then calculate the expected churn rate by taking the weighted average of these percentages and the corresponding transition-to-state-0 probabilities from the transition matrix. The supporting calculations are shown below.

EXPECTED CHURN CALCULATION EXAMPLE

  Markov        Sept 1 Actual       Probability of          State's Contribution
  Chain State   State Percentages   Churning by October 1   to Churn Rate
  1             28.2                0.02                    0.564 %
  2             39.8                0.01                    0.398 %
  3             18.2                0.01                    0.182 %
  4+            13.8                0.00                    0 %
  Totals        100.0                                       1.144 %

  Sept 1 Actual State Percentages: from the current customer file on September 1.
  Probability of Churning: from the state 0 column of the transition matrix (see the 5-state example).
  State's Contribution: product of the values in the previous two columns.
  Total of the last column: the final churn calculation.

ANALYSIS 2: A MULTIPLE PERIOD ANALYSIS. Markov chains provide a very simple way to conduct analyses over multiple periods, even though the model itself describes transitions over a single period. The transition matrix we have been describing is often called a one-step transition matrix because it gives transition probabilities for one time period. What if we want to calculate the probability that a customer will transition from one state to another over two, three, or any number of periods? To answer that, we need the corresponding transition matrix: the two-step or three-step transition matrix, for example.

Calculating such matrices is not as hard as it might seem. In fact, it is quite easy, and relies on the mathematical concept of multiplying matrices. If we refer to the one-step transition matrix we have defined as P, then the two-step transition matrix is P^2, the three-step transition matrix is P^3, and so on. We won't delve into the details of how to square, cube, or further multiply matrices, since that is better left to computer software anyway.

The matrix below is the two-step transition matrix for the five-state example presented earlier (the square of the one-step matrix in the 5-state example). The probabilities are interpreted exactly as before, except that they now represent the probability of transitioning from one state (the row) to another (the column) in two periods. For example, there is a 9.19% probability that a customer who currently has one product will have two products two months from now. To conduct a one-year churn analysis, we can compute the 12-step transition matrix and then make calculations just like those in the previous subsection.

TWO-STEP TRANSITION MATRIX FOR 5-STATE MARKOV CHAIN EXAMPLE
(rows: current state; columns: state two time steps later)

          0        1        2        3        4+
  0      0.9034   0.0750   0.0205   0.0010   0.0001
  1      0.0379   0.8325   0.0919   0.0362   0.0015
  2      0.0199   0.1098   0.8145   0.0368   0.0190
  3      0.0192   0.0578   0.1076   0.7273   0.0881
  4+     0.0012   0.0392   0.0406   0.1058   0.8132

The main drawback of this technique is that the further out in time you wish to analyze, the less accurate the longer-period transition matrix may be. For example, taking the one-step transition matrix to the 24th power in order to conduct a two-year analysis may result in inaccuracies. Longer-term analyses are better conducted by redefining the original period to be that length of time (e.g., two years) and going back to your database to calculate a new transition matrix for this redefined period length.

ANALYSIS 3: SIMULATION. A Markov chain model can be used to perform powerful forward-looking simulation analyses fairly easily. For example, suppose we are planning to launch an acquisition campaign. As part of the planning process, we want to understand the future behavior of our successfully acquired new customers.
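The calculations in Analyses 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the helper names (`matmul`, `n_step`, `expected_churn`) are our own, the matrix is the 5-state example, the churn probabilities are read directly from its state 0 column, and the September 1 customer mix is taken from the churn example.

```python
STATES = ["0", "1", "2", "3", "4+"]

# One-step transition matrix P from the 5-state example (rows: current state).
P = [
    [0.95, 0.04, 0.01, 0.00, 0.00],
    [0.02, 0.91, 0.05, 0.02, 0.00],
    [0.01, 0.06, 0.90, 0.02, 0.01],
    [0.01, 0.03, 0.06, 0.85, 0.05],
    [0.00, 0.02, 0.02, 0.06, 0.90],
]

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(p, n):
    """n-step transition matrix: p raised to the n-th power (n >= 1)."""
    result = p
    for _ in range(n - 1):
        result = matmul(result, p)
    return result

def expected_churn(p, shares):
    """Weighted average of each state's probability of ending in state 0.

    `shares` maps a customer state label to the fraction of current
    customers in that state (state 0, non-customers, is excluded).
    """
    return sum(share * p[STATES.index(s)][0] for s, share in shares.items())

# Analysis 1: September 1 customer mix from the churn example.
sept_shares = {"1": 0.282, "2": 0.398, "3": 0.182, "4+": 0.138}
print(f"One-month churn: {expected_churn(P, sept_shares):.3%}")

# Analysis 2: two-step matrix, then the share of today's customers
# expected to be in state 0 twelve months from now.
P2 = n_step(P, 2)
print(f"P(state 1 -> state 2 in two months) = {P2[1][2]:.4f}")
P12 = n_step(P, 12)
print(f"In state 0 after 12 months: {expected_churn(P12, sept_shares):.3%}")
```

Plain lists are used so the sketch has no dependencies; in practice a numerical library would normally handle the matrix powers.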
Will they be long-term customers? Will they stagnate? To answer these questions, we can use the Markov chain model to simulate a prospective cohort of acquired customers over their first year with our company. The result is an understanding of the longer-term retention rate of these new customers, the opportunity for cross-sales and up-sales once acquired, and an overall view of the longer-term profitability of the campaign beyond just response rate. Simulation also gives us a way to gauge the upside potential of the campaign, as well as the downside risk that the campaign falls short of our success estimates.

We will present the simulation technique for a single prospect. The same methodology can be applied to the entire cohort and the results summarized. Simulating a Markov chain depends on the derived transition matrix and on the availability of a software program to generate random numbers and track the simulation results. Often such a program needs to be developed by hand, but it can make use of available libraries for many of the detailed mathematical calculations. The usefulness of simulation also grows with the complexity of the model (i.e., the more states defined and the more attributes on which the states are based).

USE OF TRANSITION MATRIX FOR SIMULATION (STEP 1)
(the state 0 row of the transition matrix is the row used in this step)

          0      1      2      3      4+
  0      0.95   0.04   0.01   0      0

The procedure can be illustrated as follows. Start with a prospect, and call the time at which the campaign runs Time 0, or t=0. At t=0, all prospects are in state 0 since they are not currently customers. Now, generate a random number between 0.0 and 1.0. This is the basis of the simulation and determines the behavior of the simulated prospect. From the state 0 row of the transition matrix, we proceed as follows: if the random number is less than .95, this prospect did not respond positively to the campaign. Otherwise, if it is less than .95 + .04 = .99, the customer responded and transitions to state 1 (will have one product) by next month.
Otherwise (the random number is .99 or greater), the customer responded and in fact transitions to state 2 (two products) by next month. If the focus is not on response rate itself, we can ignore the 95% of non-responders during this process.

For the next step, Step 2+, we follow this simulated customer through a full year (or longer or shorter if we wish) of behavioral steps. Suppose that in the first step, the customer transitioned to state 1. Then we generate another random number and compare it to the transition matrix values in the state 1 row.

USE OF TRANSITION MATRIX FOR SIMULATION (STEP 2+)
(the state 1 row of the transition matrix is the row used for a customer currently in state 1)

          0      1      2      3      4+
  1      0.02   0.91   0.05   0.02   0

If the random number is less than .02, the customer transitions back to state 0 (and so quickly churned). Otherwise, if it is less than .02 + .91 = .93, the customer remains in state 1. Otherwise, if it is less than .93 + .05 = .98, the customer transitions to state 2 (bought a new product), and so on.

We can continue this method for a single simulated customer for as many months as we want to simulate. Once we do this for all customers and summarize the results, we can compute various marketing metrics on this simulated cohort to help analyze and plan the campaign. Often, a number of simulated cohorts are constructed in order to understand upside potential or downside risk under different possibilities for how the campaign progresses. We can also test what-if scenarios by changing values in the transition matrix or changing attributes of the prospect base we are targeting. This lets us analyze different possible future scenarios based on our campaign assumptions. The potential uses of this technique are wide-ranging.

MORE STATE SPACE EXAMPLES

So far, we have referred to a simple example of a Markov chain with five states, representing the current number of products a customer has. Many other types of Markov chain models could be designed and built, limited only by imagination and ingenuity. Different industries, different business scenarios, and different objectives all lead to potentially different uses of these models. We give a few more examples of ways to define a Markov chain state space.
These should spark further ideas without limiting the wealth of possibilities.
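As a concrete companion to the simulation procedure of Analysis 3, the following Python sketch walks simulated prospects through the 5-state example matrix by comparing one random number per month against the cumulative probabilities of the current row. The function names, the cohort size, the fixed seed, and the summary metrics are our own assumptions for illustration; they are not part of the original procedure.

```python
import random

# One-step transition matrix from the 5-state example; index 4 is state 4+.
P = [
    [0.95, 0.04, 0.01, 0.00, 0.00],
    [0.02, 0.91, 0.05, 0.02, 0.00],
    [0.01, 0.06, 0.90, 0.02, 0.01],
    [0.01, 0.03, 0.06, 0.85, 0.05],
    [0.00, 0.02, 0.02, 0.06, 0.90],
]

def next_state(current, rng):
    """Draw the next state from one random number in [0, 1).

    The number is compared to the running (cumulative) sum of the
    current state's row, exactly as in the step-by-step description.
    """
    u = rng.random()
    cumulative = 0.0
    last_possible = current
    for j, prob in enumerate(P[current]):
        if prob > 0:
            cumulative += prob
            last_possible = j
            if u < cumulative:
                return j
    # Only reached if floating-point round-off leaves the row sum below u.
    return last_possible

def simulate_customer(months, rng, start=0):
    """Path of states for one simulated prospect, starting in state 0 at t=0."""
    path = [start]
    for _ in range(months):
        path.append(next_state(path[-1], rng))
    return path

def simulate_cohort(size, months, seed=0):
    """Simulate `size` prospects; the seed is fixed so runs are repeatable."""
    rng = random.Random(seed)
    return [simulate_customer(months, rng) for _ in range(size)]

cohort = simulate_cohort(size=10_000, months=12)
responders = [path for path in cohort if path[1] != 0]
retained = [path for path in responders if path[-1] != 0]
print(f"Responders: {len(responders)} of {len(cohort)}")
if responders:
    share = len(retained) / len(responders)
    print(f"Responders still customers after 12 months: {share:.1%}")
```

Re-running with different seeds produces the multiple simulated cohorts mentioned in the text, and editing entries of `P` (keeping each row summing to 1.0) gives a simple what-if scenario.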

[Figure: EXAMPLE 1 STATE SPACE DIAGRAM: TOTAL NUMBER OF PURCHASES IN PERIOD, with states 0, 1-5, 6-10, 11-15, 16-20, etc. Useful for retail, cataloguing, direct marketing, and B2B industries such as consumer products or high tech.]

EXAMPLE 1: TOTAL NUMBER OF PURCHASES. In this model, states represent the total number of purchases made in the period by a customer. These could be categorized, such as 0, 1-5, 6-10, 11-15, etc., or represent actual purchase quantities, site visits, or order counts. This model can work well for retail, cataloguing, and direct marketing industries, or online stores, where purchase counts or visits in a period are typically higher and return visits are expected. It can also work well for B2B situations (e.g., consumer products, high tech) where the customers are the retailers or other outlets further down the supply chain.

EXAMPLE 2: VALUE SEGMENTATION. Markov chain models can be very useful for performing customer segmentation in a new light. They allow us to go beyond static segment definitions and move toward stochastic segmentation. For example, we could measure total revenue (or, even better, profits, if available) for each customer, and then place each customer into a decile (e.g., top 10%, second 10%, etc.) as a method of revenue or profitability segmentation. The deciles can become states in a Markov chain, allowing us to observe how a customer or group of customers progresses from decile to decile over time. We can answer questions like: What is the probability the customer will become a top-decile customer next month, or in the next six months? What percentage of top-decile customers become unprofitable six months later? This model can be useful across many industries.

EXAMPLE 3: PURCHASE RECENCY. In many marketing situations, it is important to know how likely the customer is to make a purchase in the next time period. For example, a mail-order retailer may decide not to send a catalog next month if the customer has not purchased recently. But that decision needs to be made with likelihood of purchase and potential profit in mind. In this Markov chain model, a state can be defined as the number of periods since the last purchase. A 1 means the customer purchased last period, while an 8, for example, represents a customer who last purchased eight periods ago.

[Figure: EXAMPLE 3 STATE SPACE DIAGRAM: PURCHASE RECENCY, with states 1, 2, 3, 4, 5, etc.]

In this model, a transition can only happen in two ways. If the customer made a purchase, they immediately transition to state 1. If they did not make a purchase, they transition to the next higher state (see accompanying diagram). For example, a transition from state 4 to state 2 in one period is impossible. This model can be used to predict the likelihood of becoming a purchaser again next period after many non-purchase periods, or the likelihood of repeat purchasing (i.e., remaining in state 1 from one period to the next).

EXAMPLE 3A: REPETITIVE MARKETING. A related example applies to a repetitive marketing campaign. For example, the campaign strategy may be to keep targeting a customer with a promotion if they did not respond to the previous offer. We may in fact increase the value of our offer (e.g., more free airline miles, a bigger discount) as we send more offers, hoping to finally entice the customer. In this scenario, the states represent the number of mailings sent to a customer (up to some maximum number we are willing to send), with an additional state named ACQ to represent a customer who has responded positively and is now acquired. In this example, a customer can only transition to the next higher state (did not respond) or to the ACQ state (did respond). We can use this model to help decide when and how much to increase offers based on the likelihood of response to future mailings and overall profitability goals, and to analyze the pattern of responses over repeated mailings.
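Examples 3 and 3A share a simple structure: from each state only two transitions are possible. A minimal Python sketch of the purchase-recency matrix from Example 3 follows. The repurchase probabilities are hypothetical numbers chosen for illustration, and capping the chain at a final "that many periods or more" state is our own assumption to keep the state space finite.

```python
def recency_matrix(purchase_prob):
    """Transition matrix for a purchase-recency chain.

    purchase_prob[i] is the probability that a customer in state i + 1
    (i + 1 periods since the last purchase) buys in the coming period.
    The final state is treated as 'that many periods or more'.
    """
    n = len(purchase_prob)
    matrix = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(purchase_prob):
        matrix[i][0] += p                       # purchase: back to state 1
        matrix[i][min(i + 1, n - 1)] += 1 - p   # no purchase: one state higher
    return matrix

# Hypothetical repurchase probabilities for states 1, 2, 3, 4, 5+.
probs = [0.30, 0.20, 0.12, 0.08, 0.05]
R = recency_matrix(probs)

labels = ["1", "2", "3", "4", "5+"]
for label, row in zip(labels, R):
    print(label, [round(x, 2) for x in row])

# A jump from state 4 to state 2 is impossible, as noted in the text.
print("P(4 -> 2) =", R[3][1])
```

The repetitive-marketing model of Example 3A has the same shape, with the "responded" transition pointing to the ACQ state instead of state 1.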
EXAMPLE 4: PRODUCT CROSS-SELL. An even more advanced Markov chain model tracks customer product holdings from one period to the next and can be useful in understanding customer needs and for cross-selling. Let's take a financial services firm, for example, and imagine that it offers four products: intro checking accounts (C), auto loans (A), mortgages (M), and home equity loans or lines (H). We now define states as all possible combinations of products that a customer could hold, plus the state 0 representing no products held (a prospect or recent churner). In this case, the number of possible states is 16: C, A, M, H, CA, CM, CH, AM, AH, MH, CAM, CAH, CMH, AMH, CAMH, and 0.

Transitions represent cross-sell events or product churn. For example, if a customer transitions from M to MH in a period, they have just purchased a home equity line on top of their mortgage account. We can analyze, for example, the likelihood of transitioning to a state that includes H but not M. These would represent customers who hold a mortgage elsewhere but bought their home equity line from us. This model does a great job of helping to understand customer needs, since we can analyze patterns of purchases over time and with respect to the products currently held.

In summary, remember that these are just a few examples of a wealth of possible Markov chain models useful in marketing. The main limitation to be aware of when designing more complex Markov chain state spaces is the availability of the data necessary to determine the transition matrix.

ASSUMPTIONS AND LIMITATIONS

FIRST-ORDER MARKOV CHAINS. In our discussion so far, the state a customer moves to at the next time step depends only on the state they are in currently. Any prior information about the customer's history makes no difference to this Markov chain model. This is known as a first-order Markov chain. It is a simplifying assumption about customer behavior over time, and may not necessarily be true in real situations.
For example, if a customer is preparing to fully cancel her three insurance policies, she may need to do so over a span of a few months while waiting for each policy's due date to arrive. Her state sequence over a five-month period (starting with three products) might be 3, 2, 2, 1, 1, 0. In this case, the customer's transition to state 0 could have been predicted not just from the previous state 1, but from the full sequence of states over those five months. Such situations can be taken into account by developing higher-order models, which we will briefly discuss later. These are more complex, but usually account better for the reality of behavior.
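To make the idea of conditioning on more history concrete, here is a small Python sketch that counts transitions conditioned on the previous two states, which is the bookkeeping a second-order model would need. It re-encodes each history as a (previous, current) pair, in line with the later remark that a higher-order model can be restated as a first-order model over a larger state space. The function names and the second, "stable" customer are hypothetical, and two sequences are far too little data to support any real conclusion.

```python
from collections import Counter, defaultdict

def second_order_counts(sequences):
    """Count next states conditioned on the previous two states.

    Each key is a (previous, current) pair; each value counts the
    states that followed that pair across all sequences.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur, nxt in zip(seq, seq[1:], seq[2:]):
            counts[(prev, cur)][nxt] += 1
    return counts

def to_probabilities(counts):
    """Turn follower counts into conditional probabilities per pair."""
    return {
        pair: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for pair, followers in counts.items()
    }

sequences = [
    [3, 2, 2, 1, 1, 0],   # the cancelling customer from the text
    [1, 1, 1, 1, 1, 1],   # a hypothetical customer who stays at one product
]
probs = to_probabilities(second_order_counts(sequences))

# What followed "dropped from 2 to 1" versus "held steady at 1"?
print("after (2, 1):", probs[(2, 1)])
print("after (1, 1):", probs[(1, 1)])
```

With realistic volumes of data, comparing rows such as `(2, 1)` and `(1, 1)` is one way to check whether recent history changes the outlook enough to justify the larger state space.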

HOMOGENEOUS MARKOV CHAINS. The second point to notice is that we have defined only one transition matrix, and it is used no matter which point in time we are speaking about. For example, this matrix is meant to describe state transitions from January 2003 to February 2003, March 2004 to April 2004, November 2005 to December 2005, or any other transition from one month to the next. This is called a homogeneous Markov chain because the transition matrix is the same for all points in time. Again, this is a simplifying assumption that is not necessarily true in reality. For example, in a retail setting, the number of products purchased is more likely to increase during the holiday shopping months toward the end of the year than at other times. To take into account different transition probabilities at different points in time, we would need non-homogeneous Markov chains. As with higher-order Markov chains, they are more complex, but account for reality a little better. We discuss these further below.

Although the first-order and homogeneous assumptions simplify real behavior, they often do quite a nice job of describing behavior accurately, particularly over shorter time periods. The incremental gain realized by considering more advanced models is not always worth the additional complexity they entail.

ADVANCED MARKOV CHAIN CONCEPTS

PREDICTIVE MODELING WITH MARKOV CHAINS. One drawback of the Markov chain framework discussed so far is that it is difficult to incorporate detailed customer-level information into it. For example, a Markov chain could be built that describes transitions based on many attributes at once (say, Age Bracket, Income, Purchase History, Purchase Frequency, Products Held, and No. Months as Customer). Such a Markov chain would be very complex because it would contain a large number of states to account for all the possible combinations of these variables.
Although it is possible to build such a chain, its results would be more difficult to analyze and interpret. A better way to incorporate many attributes into the Markov chain is to describe state transitions using predictive models instead of a transition matrix. For example, in the sample transition matrix on Page 4, the transition probability from state 1 to 2 is .05. This is a static value that holds for all customers, regardless of their particular history or background.

That is usually fine for analyzing overall behavior, but not detailed enough for understanding individual customer behavior. The solution is to build a model that predicts the state a customer will transition into next period based on customer-level data. So, the static .05 probability referred to above now becomes a more complex mathematical formula based on a customer's individual attributes. Essentially, the result is that a transition matrix is built for each customer or segment of customers, making for a more powerful solution. These models can be built using linear or multinomial regression, or other predictive modeling techniques such as decision trees or neural networks. Further discussion of building predictive models to associate with a Markov chain is beyond the scope of this whitepaper. However, the power and importance of this technique cannot be overstated. It allows for much more robust analysis and simulation of customer behavior.

HIGHER ORDER MARKOV CHAINS. As stated earlier, a first-order Markov chain is one in which the next transition depends only on the current state, not on any previous states. We can incorporate more complexity into the model by allowing for second, third, or higher order Markov chains. A second-order chain, for example, allows the next transition to depend on the previous two states, and so on for higher orders. As it turns out, a higher order model can always be restated as a first-order model, but one in which the state space is larger. Therefore, the only extra difficulty with higher order models is that their state spaces can become quite complex.

NON-HOMOGENEOUS CHAINS. A non-homogeneous Markov chain is one in which the transition matrix may be different at different points in time. This makes the model more complex than a homogeneous chain, but it can be handled fairly easily. We can simply build a different transition matrix for each time period, then use each one at the appropriate point.
For example, we may have three transition matrices in a retail setting. The first might describe transitions from November to December (the upswing in holiday buying), the second might describe transitions from December to January (the associated downswing in buying), and the third might describe transitions for the rest of the year (typical buying behavior). Then, when analyzing behavior or conducting simulations, we just refer to the appropriate transition matrix at the appropriate time.
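
The three-matrix retail example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the states are 0, 1, or 2 products held, every probability below is invented for the example, and the helper names matrix_for, step, and project are hypothetical rather than part of any library.

```python
# Illustrative transition matrices for states 0, 1, 2 (products held).
# Row i gives the probabilities of moving from state i to states 0, 1, 2.
# All numbers are invented for this sketch.
TYPICAL = [
    [0.90, 0.08, 0.02],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
HOLIDAY_UPSWING = [  # November -> December
    [0.70, 0.20, 0.10],
    [0.05, 0.65, 0.30],
    [0.02, 0.08, 0.90],
]
POST_HOLIDAY_DOWNSWING = [  # December -> January
    [0.95, 0.04, 0.01],
    [0.25, 0.70, 0.05],
    [0.10, 0.30, 0.60],
]


def matrix_for(from_month):
    """Pick the matrix for the transition that starts in from_month (1-12)."""
    if from_month == 11:
        return HOLIDAY_UPSWING
    if from_month == 12:
        return POST_HOLIDAY_DOWNSWING
    return TYPICAL


def step(distribution, matrix):
    """Advance a distribution over states by one period."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def project(distribution, start_month, n_steps):
    """Apply the month-appropriate matrix for each of n_steps periods."""
    month = start_month
    for _ in range(n_steps):
        distribution = step(distribution, matrix_for(month))
        month = month % 12 + 1  # December wraps around to January
    return distribution


# A customer holding one product in October, projected through January.
start = [0.0, 1.0, 0.0]
after_holidays = project(start, start_month=10, n_steps=3)
print(after_holidays)
```

A homogeneous chain is the special case in which matrix_for returns the same matrix for every month.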

CONTINUOUS-TIME MARKOV CHAINS. Earlier, we referred to the Markov chain models under consideration as discrete-time Markov chains. This meant that we only considered transitions to occur at pre-defined points in time, such as at the end of a month or quarter. This is certainly a simplification of how real events actually occur. For example, a product can typically be purchased at any point in time, not just at a discrete interval.

A continuous-time Markov chain allows a transition to occur at any point in time. For example, consider the Markov chain model where we define a state to be the number of products currently held by a customer. If a customer is in state 3 and then makes a purchase, the continuous-time model immediately transitions him to state 4. The discrete-time model waits until the end of the month to officially transition the customer to the next state.

[Figure: CONTINUOUS TIME MARKOV CHAIN FOR CHURN ANALYSIS. Labels: Current Time, How long?, Time of Churn.]

In many cases, the discrete-time model is sufficient. It may not be helpful or interesting to track transition times to such a fine level of detail. For example, in the retail catalog industry, it is only important to know that a purchase will be made next month (and so a catalog should be sent), not necessarily when in that month it will be made.

But, in some cases, we may need such detail. For example, in a customer call center analysis, we may define the states as the number of service reps currently busy. We want to know when the next call will come, and whether it will come in before a current call ends. In this case, it is important to predict the precise time of the state transition, and not base it on the end of some predefined period of time.

Also, for analysis over longer timeframes, it may be easier to use continuous-time models. For example, an automobile dealer may want to predict how long it will be before a customer makes his next purchase.
Since this is typically a long period of time, it might be easier to predict how long until that purchase is made instead of predicting monthly state transition probabilities and figuring out how many months until that transition is likely to occur. Or, for a long-term churn analysis, we might want to predict how long until the churn event occurs and not worry about the intervening time steps.

As seen in the above examples, continuous-time Markov chains are often used to predict the waiting time until some event occurs. The event we are waiting for might be the next purchase by a customer, a churn event, or a lifestyle transition. More advanced forms of these kinds of predictions use techniques such as generalized linear models, hierarchical and mixed models, or neural networks.

SUMMARY

We have seen that Markov chain models can be a very important tool for analyzing, understanding, and predicting customer behavior. The models can range from simple to very complex, and so are accessible to organizations at all levels of analytic capability. The potential for modeling different scenarios is virtually endless, making Markov chains a powerful tool for marketers.

LITYX, LLC | www.lityx.com | info@lityx.com
© 2006 Lityx, LLC. All rights reserved.