Marketing Science, Vol. 24, No. 2, Spring 2005, pp. 275–284. ISSN 0732-2399, EISSN 1526-548X. DOI 10.1287/mksc.1040.0098. © 2005 INFORMS.

"Counting Your Customers" the Easy Way: An Alternative to the Pareto/NBD Model

Peter S. Fader, The Wharton School, University of Pennsylvania, 749 Huntsman Hall, 3730 Walnut Street, Philadelphia, Pennsylvania 19104-6340, faderp@wharton.upenn.edu
Bruce G. S. Hardie, London Business School, Regent's Park, London NW1 4SA, United Kingdom, bhardie@london.edu
Ka Lok Lee, Catalina Health Resource, Blue Bell, Pennsylvania 19422, kaloklee@alumni.upenn.edu

Today's managers are very interested in predicting the future purchasing patterns of their customers, which can then serve as an input into "lifetime value" calculations. Among the models that provide such capabilities, the Pareto/NBD "counting your customers" framework proposed by Schmittlein et al. (1987) is highly regarded. However, despite the respect it has earned, it has proven to be a difficult model to implement, particularly because of computational challenges associated with parameter estimation. We develop a new model, the beta-geometric/NBD (BG/NBD), which represents a slight variation in the behavioral "story" associated with the Pareto/NBD but is vastly easier to implement. We show, for instance, how its parameters can be obtained quite easily in Microsoft Excel. The two models yield very similar results in a wide variety of purchasing environments, leading us to suggest that the BG/NBD could be viewed as an attractive alternative to the Pareto/NBD in most applications.

Key words: customer base analysis; repeat buying; Pareto/NBD; probability models; forecasting; lifetime value
History: This paper was received August 11, 2003, and was with the authors 7 months for 2 revisions; processed by Gary Lilien.

1. Introduction

Faced with a database containing information on the frequency and timing of transactions for a list of customers, it is natural to try to make forecasts about future purchasing. These projections often range from aggregate sales trajectories (e.g., for the next 52 weeks), to individual-level conditional expectations (i.e., the best guess about a particular customer's future purchasing, given information about his past behavior). Many other related issues may arise from a customer-level database, but these are typical of the questions that a manager should initially try to address. This is particularly true for any firm with serious interest in tracking and managing customer lifetime value (CLV) on a systematic basis.

There is a great deal of interest, among marketing practitioners and academics alike, in developing models to accomplish these tasks. One of the first models to explicitly address these issues is the Pareto/NBD "counting your customers" framework originally proposed by Schmittlein et al. (1987), called hereafter SMC. This model describes repeat-buying behavior in settings where customer dropout is unobserved: It assumes that customers buy at a steady rate (albeit in a stochastic manner) for a period of time, and then become inactive. More specifically, time to dropout is modelled using the Pareto (exponential-gamma mixture) timing model, and repeat-buying behavior while active is modelled using the NBD (Poisson-gamma mixture) counting model.

The Pareto/NBD is a powerful model for customer-base analysis, but its empirical application can be challenging, especially in terms of parameter estimation. Perhaps because of these operational difficulties, relatively few researchers actively followed up on the SMC paper soon after it was published (as judged by citation counts). However, it has received a steadily increasing amount of attention in recent years as many researchers and managers have become concerned about issues such as customer churn, attrition, retention, and CLV. While a number of researchers (e.g., Balasubramanian et al. 1998, Jain and Singh 2002, Mulhern 1999, Niraj et al. 2001) refer to the applicability and usefulness of the Pareto/NBD, only a small handful claim to have actually implemented it. Nevertheless, some of these papers (e.g., Reinartz and Kumar 2000, Schmittlein and Peterson 1994) have, in turn, become quite popular and widely cited.

The objective of this paper is to develop a new model, the beta-geometric/NBD (BG/NBD), which represents a slight variation in the behavioral "story" that lies at the heart of SMC's original work, but is vastly easier to implement. We show, for instance, how its parameters can be obtained quite easily in Microsoft Excel, with no appreciable loss in the model's ability to fit or predict customer purchasing patterns.

We develop the BG/NBD model from first principles and present the expressions required for making individual-level statements about future buying behavior. We compare and contrast its performance to that of the Pareto/NBD via a simulation and an illustrative empirical application. The two models yield very similar results, leading us to suggest that the BG/NBD should be viewed as an attractive alternative to the Pareto/NBD model.

Before developing the BG/NBD model, we briefly review the Pareto/NBD model (§2). In §3 we outline the assumptions of the BG/NBD model, deriving the key expressions at the individual level and for a randomly chosen individual, in §4 and §5, respectively. This is followed by the aforementioned simulation and empirical analysis. We conclude with a discussion of several issues that arise from this work.

2. The Pareto/NBD Model

The Pareto/NBD model is based on five assumptions:
(i) While active, the number of transactions made by a customer in a time period of length $t$ is distributed Poisson with mean $\lambda t$.
(ii) Heterogeneity in the transaction rate $\lambda$ across customers follows a gamma distribution with shape parameter $r$ and scale parameter $\alpha$.
(iii) Each customer has an unobserved "lifetime" of length $\tau$. This point at which the customer becomes inactive is distributed exponential with dropout rate $\mu$.
(iv) Heterogeneity in dropout rates across customers follows a gamma distribution with shape parameter $s$ and scale parameter $\beta$.
(v) The transaction rate $\lambda$ and the dropout rate $\mu$ vary independently across customers.
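The five assumptions above translate fairly directly into a simulation. The following is a minimal sketch, not the authors' code: the function name `simulate_pareto_nbd_customer` and the parameter values are illustrative, and it assumes the gamma "scale" parameters $\alpha$ and $\beta$ act as rate parameters (so the standard-library sampler is given $1/\alpha$ and $1/\beta$).

```python
import random


def simulate_pareto_nbd_customer(r, alpha, s, beta, horizon, rng):
    """Simulate one customer's transaction times on (0, horizon].

    Hypothetical helper. alpha and beta are treated as gamma *rate*
    parameters, hence the 1/alpha and 1/beta passed to gammavariate.
    """
    lam = rng.gammavariate(r, 1.0 / alpha)   # (ii) transaction rate
    mu = rng.gammavariate(s, 1.0 / beta)     # (iv) dropout rate, drawn independently (v)
    lifetime = rng.expovariate(mu)           # (iii) unobserved lifetime
    end = min(lifetime, horizon)

    times = []
    t = rng.expovariate(lam)                 # (i) exponential gaps give Poisson counts
    while t <= end:
        times.append(t)
        t += rng.expovariate(lam)
    return times


rng = random.Random(42)
panel = [simulate_pareto_nbd_customer(0.5, 10.0, 0.5, 10.0, 52.0, rng)
         for _ in range(1000)]
# Share of simulated customers with at least one transaction in 52 weeks
penetration = sum(1 for times in panel if times) / len(panel)
```

Note that a customer's lifetime can end before the first transaction, in which case the returned list is empty.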
The Pareto/NBD (and, as we will see shortly, the BG/NBD) requires only two pieces of information about each customer's past purchasing history: his "recency" (when his last transaction occurred) and "frequency" (how many transactions he made in a specified time period). The notation used to represent this information is $(X = x, t_x, T)$, where $x$ is the number of transactions observed in the time period $(0, T]$ and $t_x$ $(0 < t_x \le T)$ is the time of the last transaction. Using these two key summary statistics, SMC derive expressions for a number of managerially relevant quantities, such as:
• $E(X(t))$, the expected number of transactions in a time period of length $t$ (SMC, Equation (17)), which is central to computing the expected transaction volume for the whole customer base over time.
• $P(X(t) = x)$, the probability of observing $x$ transactions in a time period of length $t$ (SMC, Equations (A40), (A43), and (A45)).
• $E(Y(t) \mid X = x, t_x, T)$, the expected number of transactions in the period $(T, T + t]$ for an individual with observed behavior $(X = x, t_x, T)$ (SMC, Equation (22)).

The likelihood function associated with the Pareto/NBD model is quite complex, involving numerous evaluations of the Gaussian hypergeometric function. Besides being unfamiliar to most researchers working in the areas of database marketing and CRM analysis, multiple evaluations of the Gaussian hypergeometric are very demanding from a computational standpoint. Furthermore, the precision of some numerical procedures used to evaluate this function can vary substantially over the parameter space (Lozier and Olver 1995); this can cause major problems for numerical optimization routines as they search for the maximum of the likelihood function.

To the best of our knowledge, the only published paper reporting a successful implementation of the Pareto/NBD model using standard maximum likelihood estimation (MLE) techniques is Reinartz and Kumar (2003), and the authors comment on the associated computational burden. As an alternative to MLE, SMC proposed a three-step method-of-moments estimation procedure, which was further refined by Schmittlein and Peterson (1994).
While simpler than MLE, the proposed algorithm is still not easy to implement; furthermore, it does not have the desirable statistical properties commonly associated with MLE. In contrast, the BG/NBD model, to be introduced in the next section, can be implemented very quickly and efficiently via MLE, and its parameter estimation does not require any specialized software or the evaluation of any unconventional mathematical functions.

3. BG/NBD Assumptions

Most aspects of the BG/NBD model directly mirror those of the Pareto/NBD. The only difference lies in the story being told about how/when customers become inactive. The Pareto timing model assumes that dropout can occur at any point in time, independent of the occurrence of actual purchases. If we assume instead that dropout occurs immediately after a purchase, we can model this process using the beta-geometric (BG) model.

More formally, the BG/NBD model is based on the following five assumptions (the first two of which are identical to the corresponding Pareto/NBD assumptions):
(i) While active, the number of transactions made by a customer follows a Poisson process with transaction rate $\lambda$. This is equivalent to assuming that the time between transactions is distributed exponential with transaction rate $\lambda$, i.e.,
$$f(t_j \mid t_{j-1}; \lambda) = \lambda e^{-\lambda (t_j - t_{j-1})}, \quad t_j > t_{j-1} \ge 0.$$
(ii) Heterogeneity in $\lambda$ follows a gamma distribution with pdf
$$f(\lambda \mid r, \alpha) = \frac{\alpha^r \lambda^{r-1} e^{-\lambda \alpha}}{\Gamma(r)}, \quad \lambda > 0. \tag{1}$$
(iii) After any transaction, a customer becomes inactive with probability $p$. Therefore the point at which the customer "drops out" is distributed across transactions according to a (shifted) geometric distribution with pmf
$$P(\text{inactive immediately after } j\text{th transaction}) = p(1-p)^{j-1}, \quad j = 1, 2, 3, \ldots$$
(iv) Heterogeneity in $p$ follows a beta distribution with pdf
$$f(p \mid a, b) = \frac{p^{a-1}(1-p)^{b-1}}{B(a, b)}, \quad 0 \le p \le 1, \tag{2}$$
where $B(a, b)$ is the beta function, which can be expressed in terms of gamma functions: $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$.
(v) The transaction rate $\lambda$ and the dropout probability $p$ vary independently across customers.

4. Model Development at the Individual Level

4.1. Derivation of the Likelihood Function

Consider a customer who had $x$ transactions in the period $(0, T]$ with the transactions occurring at $t_1, t_2, \ldots, t_x$:

[Timeline from 0 to $T$ with marks at $t_1, t_2, \ldots, t_x$.]

We derive the individual-level likelihood function in the following manner:
• The likelihood of the first transaction occurring at $t_1$ is a standard exponential likelihood component, which equals $\lambda e^{-\lambda t_1}$.
• The likelihood of the second transaction occurring at $t_2$ is the probability of remaining active at $t_1$ times the standard exponential likelihood component, which equals $(1-p)\lambda e^{-\lambda (t_2 - t_1)}$.
• This continues for each subsequent transaction, until the likelihood of the $x$th transaction occurring at $t_x$ is the probability of remaining active at $t_{x-1}$ times the standard exponential likelihood component, which equals $(1-p)\lambda e^{-\lambda (t_x - t_{x-1})}$.
• Finally, the likelihood of observing zero purchases in $(t_x, T]$ is the probability the customer became inactive at $t_x$, plus the probability he remained active but made no purchases in this interval, which equals $p + (1-p)e^{-\lambda (T - t_x)}$.
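The components listed above can be multiplied together term by term. The sketch below (function names and the example numbers are illustrative, not from the paper) builds that product for a customer with at least one transaction and checks that it collapses to a two-term expression depending only on $x$, $t_x$, and $T$; the zero-purchase case is handled separately via the exponential survival function.

```python
import math


def likelihood_from_components(times, T, lam, p):
    """Multiply the per-transaction components for times t_1 < ... < t_x <= T."""
    if not times:
        return math.exp(-lam * T)            # no purchases: exponential survival
    like = lam * math.exp(-lam * times[0])   # first transaction
    for prev, cur in zip(times, times[1:]):
        # remained active after the previous purchase, then bought again
        like *= (1 - p) * lam * math.exp(-lam * (cur - prev))
    # after the last purchase: dropped out, or stayed active but bought nothing
    like *= p + (1 - p) * math.exp(-lam * (T - times[-1]))
    return like


def likelihood_two_term(x, t_x, T, lam, p):
    """Collapsed form that uses only the summary (x, t_x, T)."""
    active = (1 - p) ** x * lam ** x * math.exp(-lam * T)
    if x == 0:
        return active
    dropped = p * (1 - p) ** (x - 1) * lam ** x * math.exp(-lam * t_x)
    return active + dropped


times = [2.0, 5.5, 9.0]
full = likelihood_from_components(times, T=12.0, lam=0.3, p=0.2)
short = likelihood_two_term(len(times), times[-1], T=12.0, lam=0.3, p=0.2)
```

The two values agree up to floating-point rounding, which is why the intermediate transaction times are not needed.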
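Therefore,
$$L(\lambda, p \mid t_1, t_2, \ldots, t_x, T) = \lambda e^{-\lambda t_1}\,(1-p)\lambda e^{-\lambda (t_2 - t_1)} \cdots (1-p)\lambda e^{-\lambda (t_x - t_{x-1})}\,\{p + (1-p)e^{-\lambda (T - t_x)}\}$$
$$= p(1-p)^{x-1}\lambda^x e^{-\lambda t_x} + (1-p)^x \lambda^x e^{-\lambda T}.$$

As pointed out earlier for the Pareto/NBD, note that information on the timing of the $x$ transactions is not required; a sufficient summary of the customer's purchase history is $(X = x, t_x, T)$.

Similar to SMC, we assume that all customers are active at the beginning of the observation period; therefore, the likelihood function for a customer making 0 purchases in the interval $(0, T]$ is the standard exponential survival function:
$$L(\lambda \mid X = 0, T) = e^{-\lambda T}.$$
Thus, we can write the individual-level likelihood function as
$$L(\lambda, p \mid X = x, T) = (1-p)^x \lambda^x e^{-\lambda T} + \delta_{x>0}\, p(1-p)^{x-1}\lambda^x e^{-\lambda t_x}, \tag{3}$$
where $\delta_{x>0} = 1$ if $x > 0$, 0 otherwise.

4.2. Derivation of P(X(t) = x)

Let the random variable $X(t)$ denote the number of transactions occurring in a time period of length $t$ (with a time origin of 0). To derive an expression for $P(X(t) = x)$, we recall the fundamental relationship between interevent times and the number of events: $X(t) \ge x \Leftrightarrow T_x \le t$, where $T_x$ is the random variable denoting the time of the $x$th transaction. Given our assumption regarding the nature of the dropout process,
$$P(X(t) = x) = P(\text{active after } x\text{th purchase}) \cdot P(T_x \le t \text{ and } T_{x+1} > t) + \delta_{x>0} \cdot P(\text{becomes inactive after } x\text{th purchase}) \cdot P(T_x \le t).$$
Given the assumption that the time between transactions is characterized by the exponential distribution, $P(T_x \le t \text{ and } T_{x+1} > t)$ is simply the Poisson probability that $X(t) = x$, and $P(T_x \le t)$ is the Erlang-$x$ cdf. Therefore,
$$P(X(t) = x \mid \lambda, p) = (1-p)^x \frac{(\lambda t)^x e^{-\lambda t}}{x!} + \delta_{x>0}\, p(1-p)^{x-1}\left[1 - e^{-\lambda t}\sum_{j=0}^{x-1}\frac{(\lambda t)^j}{j!}\right]. \tag{4}$$

4.3. Derivation of E(X(t))

Given that the number of transactions follows a Poisson process, $E(X(t))$ is simply $\lambda t$ if the customer is active at $t$. For a customer who becomes inactive at $\tau \le t$, the expected number of transactions in the period $(0, \tau]$ is $\lambda \tau$.

However, what is the likelihood that a customer becomes inactive at $\tau$? Conditional on $\lambda$ and $p$,
$$P(\tau > t) = P(\text{active at } t \mid \lambda, p) = \sum_{j=0}^{\infty} (1-p)^j \frac{(\lambda t)^j e^{-\lambda t}}{j!} = e^{-\lambda p t}.$$
This implies that the pdf of the dropout time is given by $g(\tau \mid \lambda, p) = \lambda p e^{-\lambda p \tau}$. (Note that this takes on an exponential form. However, it features an explicit association with the transaction rate $\lambda$, in contrast with the Pareto/NBD, which has an exponential dropout process that is independent of the transaction rate.) It follows that the expected number of transactions in a time period of length $t$ is given by
$$E(X(t) \mid \lambda, p) = \lambda t \cdot P(\tau > t) + \int_0^t \lambda \tau\, g(\tau \mid \lambda, p)\, d\tau = \frac{1}{p} - \frac{1}{p} e^{-\lambda p t}. \tag{5}$$

5. Model Development for a Randomly Chosen Individual

All the expressions developed above are conditional on the transaction rate $\lambda$ and the dropout probability $p$, both of which are unobserved. To derive the equivalent expressions for a randomly chosen customer, we take the expectation of the individual-level results over the mixing distributions for $\lambda$ and $p$, as given in (1) and (2). This yields the following results.

Taking the expectation of (3) over the distribution of $\lambda$ and $p$ results in the following expression for the likelihood function for a randomly chosen customer with purchase history $(X = x, t_x, T)$:
$$L(r, \alpha, a, b \mid X = x, t_x, T) = \frac{B(a, b+x)}{B(a, b)} \frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)\,(\alpha+T)^{r+x}} + \delta_{x>0} \frac{B(a+1, b+x-1)}{B(a, b)} \frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)\,(\alpha+t_x)^{r+x}}. \tag{6}$$

The four BG/NBD model parameters $(r, \alpha, a, b)$ can be estimated via the method of maximum likelihood in the following manner. Suppose we have a sample of $N$ customers, where customer $i$ had $X_i = x_i$ transactions in the period $(0, T_i]$, with the last transaction occurring at $t_{x_i}$. The sample log-likelihood function is given by
$$LL(r, \alpha, a, b) = \sum_{i=1}^{N} \ln\left[L(r, \alpha, a, b \mid X_i = x_i, t_{x_i}, T_i)\right]. \tag{7}$$
This can be maximized using standard numerical optimization routines.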
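As an illustration of how little machinery (6) and (7) need, here is a minimal sketch that uses only the log-gamma function from Python's standard library. The function names are hypothetical, and the two terms of (6) are combined with a log-sum-exp step for numerical stability, which is an implementation choice rather than something prescribed by the paper.

```python
import math


def log_beta(a, b):
    """ln B(a, b) via log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bgnbd_log_likelihood(x, t_x, T, r, alpha, a, b):
    """ln L(r, alpha, a, b | X = x, t_x, T), following equation (6)."""
    # Factor shared by both terms: Gamma(r + x) * alpha^r / Gamma(r)
    common = math.lgamma(r + x) - math.lgamma(r) + r * math.log(alpha)
    # First term of (6), involving (alpha + T)
    ln_first = log_beta(a, b + x) - log_beta(a, b) - (r + x) * math.log(alpha + T)
    if x == 0:
        return common + ln_first
    # Second term of (6), present only when x > 0, involving (alpha + t_x)
    ln_second = (log_beta(a + 1, b + x - 1) - log_beta(a, b)
                 - (r + x) * math.log(alpha + t_x))
    m = max(ln_first, ln_second)             # log-sum-exp
    return common + m + math.log(math.exp(ln_first - m) + math.exp(ln_second - m))


def sample_log_likelihood(customers, r, alpha, a, b):
    """Equation (7): sum over an iterable of (x, t_x, T) tuples."""
    return sum(bgnbd_log_likelihood(x, t_x, T, r, alpha, a, b)
               for x, t_x, T in customers)


# Two illustrative purchase histories (x, t_x, T) and trial parameter values
ll = sample_log_likelihood([(2, 30.43, 38.86), (0, 0.0, 38.86)],
                           r=0.243, alpha=4.414, a=0.793, b=2.426)
```

To estimate the parameters, one would pass the negative of `sample_log_likelihood` to a general-purpose optimizer (for example `scipy.optimize.minimize`, if SciPy is available), keeping all four parameters positive.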
Taking the expectation of (4) over the distribution of $\lambda$ and $p$ results in the following expression for the probability of observing $x$ purchases in a time period of length $t$:
$$P(X(t) = x \mid r, \alpha, a, b) = \frac{B(a, b+x)}{B(a, b)} \frac{\Gamma(r+x)}{\Gamma(r)\,x!} \left(\frac{\alpha}{\alpha+t}\right)^{r} \left(\frac{t}{\alpha+t}\right)^{x} + \delta_{x>0} \frac{B(a+1, b+x-1)}{B(a, b)} \left[1 - \left(\frac{\alpha}{\alpha+t}\right)^{r} \left\{\sum_{j=0}^{x-1} \frac{\Gamma(r+j)}{\Gamma(r)\,j!} \left(\frac{t}{\alpha+t}\right)^{j}\right\}\right]. \tag{8}$$

Finally, taking the expectation of (5) over the distribution of $\lambda$ and $p$ results in the following expression for the expected number of purchases in a time period of length $t$:
$$E(X(t) \mid r, \alpha, a, b) = \frac{a+b-1}{a-1}\left[1 - \left(\frac{\alpha}{\alpha+t}\right)^{r} {}_2F_1\!\left(r, b;\, a+b-1;\, \frac{t}{\alpha+t}\right)\right], \tag{9}$$
where ${}_2F_1(\cdot)$ is the Gaussian hypergeometric function. (See the appendix for details of the derivation.)

Note that this final expression requires a single evaluation of the Gaussian hypergeometric function, but it is important to emphasize that this expectation is only used after the likelihood function has been maximized. A single evaluation of the Gaussian hypergeometric function for a given set of parameters is relatively straightforward, and can be closely approximated with a polynomial series, even in a modeling environment such as Microsoft Excel.

In order for the BG/NBD model to be of use in a forward-looking customer-base analysis, we need to obtain an expression for the expected number of transactions in a future period of length $t$ for an individual with past observed behavior $(X = x, t_x, T)$.
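Before turning to that conditional expression, the remark about approximating the hypergeometric function with a polynomial series can be made concrete for (9). The sketch below sums the defining power series of ${}_2F_1$ directly; the function names, tolerance, and parameter values are illustrative, and it assumes $a \neq 1$, that $a + b - 1$ is not a nonpositive integer, and that the series argument lies strictly inside the unit interval (which holds here because $t/(\alpha+t) < 1$).

```python
def hyp2f1_series(a, b, c, z, tol=1e-12, max_terms=100000):
    """2F1(a, b; c; z) from its power series; valid for |z| < 1."""
    if not abs(z) < 1:
        raise ValueError("series converges only for |z| < 1")
    term, total = 1.0, 1.0
    for j in range(max_terms):
        # ratio of consecutive series terms
        term *= (a + j) * (b + j) / ((c + j) * (j + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def bgnbd_expected_transactions(t, r, alpha, a, b):
    """E[X(t) | r, alpha, a, b] following equation (9); assumes a != 1."""
    z = t / (alpha + t)
    f = hyp2f1_series(r, b, a + b - 1, z)
    return (a + b - 1) / (a - 1) * (1 - (alpha / (alpha + t)) ** r * f)


# Illustrative parameter values; t is in the same time unit as alpha
expected = [bgnbd_expected_transactions(t, 0.243, 4.414, 0.793, 2.426)
            for t in (13, 26, 39)]
```

Because dropout can only reduce purchasing, the result should never exceed the no-dropout NBD mean $rt/\alpha$, which gives a quick sanity check on any implementation.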

We provide a careful derivation in the appendix, but here is the key expression:
$$E(Y(t) \mid X = x, t_x, T, r, \alpha, a, b) = \frac{\dfrac{a+b+x-1}{a-1}\left[1 - \left(\dfrac{\alpha+T}{\alpha+T+t}\right)^{r+x} {}_2F_1\!\left(r+x, b+x;\, a+b+x-1;\, \dfrac{t}{\alpha+T+t}\right)\right]}{1 + \delta_{x>0}\, \dfrac{a}{b+x-1} \left(\dfrac{\alpha+T}{\alpha+t_x}\right)^{r+x}}. \tag{10}$$
Once again, this expectation requires a single evaluation of the Gaussian hypergeometric function for any customer of interest, but this is not a burdensome task. The remainder of the expression is simple arithmetic.

6. Simulation

While the underlying behavioral story associated with the proposed BG/NBD model is quite similar to that of the Pareto/NBD, we have not yet provided any assurance that the empirical performance of the two models will be closely aligned with each other. In this section, therefore, we discuss a comprehensive simulation study that provides a thorough understanding of when the BG/NBD can (and cannot) serve as a close proxy to the Pareto/NBD. More specifically, we create a wide variety of purchasing environments (by manipulating the four parameters of the Pareto/NBD model) to look for limiting conditions under which the BG/NBD model does a poor job of capturing the underlying purchasing process.

6.1. Simulation Design

To create these simulated purchasing environments, we chose three levels for each of the four Pareto/NBD parameters, then generated a full-factorial design of $3^4 = 81$ different "worlds." For the two shape parameters ($r$ and $s$) we used values of 0.25, 0.50, and 0.75; for each of the two scale parameters ($\alpha$ and $\beta$) we used values of 5, 10, and 15. When we translate these various combinations into meaningful summary statistics it becomes easy to see the wide variation across these simulated worlds. For instance, buyer penetration (i.e., the number of customers who make at least one purchase, or $1 - P(X(t) = 0)$) varies from a low of 13% to a high of 76%. Likewise, average purchase frequency (i.e., mean number of purchases among buyers, or $E[X(t)]/[1 - P(X(t) = 0)]$) ranges from 2.1 up to 8.2 purchases per period.
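The full-factorial design described above can be enumerated in a few lines. This is only a sketch of the grid itself (the variable names are illustrative); it does not reproduce the penetration or purchase-frequency figures, which require the Pareto/NBD expressions from SMC.

```python
from itertools import product

shape_levels = (0.25, 0.50, 0.75)   # levels used for r and s
scale_levels = (5, 10, 15)          # levels used for alpha and beta

# One dictionary of Pareto/NBD parameters per simulated "world"
worlds = [
    {"r": r, "alpha": alpha, "s": s, "beta": beta}
    for r, alpha, s, beta in product(shape_levels, scale_levels,
                                     shape_levels, scale_levels)
]
```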
It is worth noting that this broad range covers the observed values from the original Schmittlein and Peterson (1994) application as well as the actual dataset used in our empirical analysis (to be discussed in the next section).

For each of the 81 simulated worlds, we created a synthetic panel of 4,000 households, then simulated the Pareto/NBD purchase (and dropout) process for a period of 104 weeks. We then ran the BG/NBD model on the first 52 weeks for each of these datasets, and used the estimated parameters to generate forecasts for a holdout period covering the remaining 52 weeks. We evaluate the performance of the BG/NBD based on the mean absolute percent error (MAPE) calculated across this 52-week forecast sales trajectory. If the MAPE value is a low number (below, say, 5%), we have faith in the applicability of the BG/NBD for that particular set of underlying parameters; otherwise, we need to look more carefully to understand why the BG/NBD is not doing an adequate job of matching the Pareto/NBD sales projection.

6.2. Simulation Results

In general, the BG/NBD performed quite well in this holdout-forecasting task. The average value of the MAPE statistic was 2.68%, and the worst case across all 81 worlds was a reasonably acceptable 6.97%. However, upon closer inspection we noticed an interesting, systematic trend across the worlds with relatively high values of MAPE. In Table 1 we summarize the relevant summary statistics for the 10 worst simulated worlds in contrast with the remaining 71 worlds.

Notice that the BG/NBD forecasts tend to be relatively poor when penetration and/or purchase frequency are extremely low. Upon further reflection about the differences between the two model structures, this result makes sense. Under the Pareto/NBD model, dropout can occur at any time, even before a customer has made his first purchase after the start of the observation period. However, under the BG/NBD, a customer cannot become inactive before making his first purchase. If penetrations and/or buying rates are fairly high, then this difference becomes relatively inconsequential.
However, in a world where active buyers are either uncommon or very slow in making their purchases, the BG/NBD will not do such a good job of mimicking the Pareto/NBD.

Beyond this one source of deviation, there do not appear to be any other patterns associated with higher versus lower values of MAPE. For instance, the Pearson correlation between MAPE and penetration for the 71 worlds with "good" behavior is a modest -0.142. (In contrast, across all 81 worlds, this correlation is -0.379.) Therefore, when we set aside the worlds with sparse buying, the BG/NBD appears to be very robust.

It would be a simple matter to extend the BG/NBD model to allow for a segment of "hard core nonbuyers." This would require only one additional parameter and would likely overcome this problem completely, but we do not see the likelihood or severity of this problem to be extreme enough to warrant such an extension as part of the basic model. Nevertheless, we encourage managers to continually monitor summary statistics such as penetration and purchase frequency; for many firms this is already a routine practice.

Table 1. Summary of Simulation Results

                    MAPE (%)    Penetration    Average purchase frequency
Worst 10 worlds     5.29        26%            2.6
Other 71 worlds     2.32        43%            3.8

Having established the robustness (and an important limiting condition) of the BG/NBD, we now turn to a more thorough investigation of its performance (relative to the Pareto/NBD) in an actual dataset.

7. Empirical Analysis

We explore the performance of the BG/NBD model using data on the purchasing of CDs at the online retailer CDNOW. The full dataset focuses on a single cohort of new customers who made their first purchase at the CDNOW website in the first quarter of 1997. We have data covering their initial (trial) and subsequent (repeat) purchase occasions for the period January 1997 through June 1998, during which the 23,570 Q1/97 triers bought nearly 163,000 CDs after their initial purchase occasions. (See Fader and Hardie 2001 for further details about this dataset.)

For the purposes of this analysis, we take a 1/10th systematic sample of the customers. We calibrate the model using the repeat transaction data for the 2,357 sampled customers over the first half of the 78-week period and forecast their future purchasing over the remaining 39 weeks. For customer $i$ $(i = 1, \ldots, 2357)$, we know the length of the time period during which repeat transactions could have occurred ($T_i = 39 - \text{time of first purchase}$), the number of repeat transactions in this period ($x_i$), and the time of his last repeat transaction ($t_{x_i}$). (If $x_i = 0$, $t_{x_i} = 0$.) In contrast to Fader and Hardie (2001), we are focusing on the number of transactions, not the number of CDs purchased.

Maximum likelihood estimates of the model parameters $(r, \alpha, a, b)$ are obtained by maximizing the log-likelihood function given in (7) above. Standard numerical optimization methods are employed, using the Solver tool in Microsoft Excel, to obtain the parameter estimates.
(Identical estimates are obtained using the more sophisticated MATLAB programming language.) To implement the model in Excel, we rewrite the log-likelihood function, (6), as
$$L(r, \alpha, a, b \mid X = x, t_x, T) = A_1 \cdot A_2 \cdot (A_3 + \delta_{x>0} A_4),$$
where
$$A_1 = \frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)}, \qquad A_2 = \frac{\Gamma(a+b)\,\Gamma(b+x)}{\Gamma(b)\,\Gamma(a+b+x)},$$
$$A_3 = \left(\frac{1}{\alpha+T}\right)^{r+x}, \qquad A_4 = \left(\frac{a}{b+x-1}\right)\left(\frac{1}{\alpha+t_x}\right)^{r+x}.$$
This is very easy to code in Excel; see Figure 1 for complete details. (A note on how to implement the model in Excel, along with a copy of the complete spreadsheet, can be found at http://brucehardie.com/notes/004/.)

Figure 1. Screenshot of Excel Worksheet for Parameter Estimation
[Screenshot. Cells B1:B4 hold the parameters (r = 0.243, alpha = 4.414, a = 0.793, b = 2.426); the LL cell holds =SUM(E8:E2364), which evaluates to -9582.4. Rows 8 to 2364 hold one customer each, in columns ID, x, t_x, T, ln(.), ln(A_1), ln(A_2), ln(A_3), ln(A_4). The cell formulas shown for row 8 are:
ln(A_1): =GAMMALN(B$1+B8)-GAMMALN(B$1)+B$1*LN(B$2)
ln(A_2): =GAMMALN(B$3+B$4)+GAMMALN(B$4+B8)-GAMMALN(B$4)-GAMMALN(B$3+B$4+B8)
ln(A_3): =-(B$1+B8)*LN(B$2+D8)
ln(A_4): =IF(B8>0,LN(B$3)-LN(B$4+B8-1)-(B$1+B8)*LN(B$2+C8),0)
ln(.): =F8+G8+LN(EXP(H8)+(B8>0)*EXP(I8))
The first three data rows read:
ID   x   t_x     T       ln(.)      ln(A_1)    ln(A_2)    ln(A_3)    ln(A_4)
1    2   30.43   38.86   -9.4596    -0.8390    -0.4910    -8.4489    -9.4265
2    1   1.71    38.86   -4.4711    -1.0562    -0.2828    -4.6814    -3.3709
3    0   0.00    38.86   -0.5538    0.3602     0.0000     -0.9140    0.0000]

The parameters of the Pareto/NBD model are also obtained via MLE, but this task could be performed only in MATLAB due to the computational demands of the model. The parameter estimates and corresponding log-likelihood function values for the two models are reported in Table 2. Looking at the log-likelihood function values, we observe that the BG/NBD model provides a better fit to the data.

Table 2. Model Estimation Results

          BG/NBD      Pareto/NBD
r         0.243       0.553
α         4.414       10.578
a         0.793
b         2.426
s                     0.606
β                     11.669
LL        -9,582.4    -9,595.0

In Figure 2, we examine the fit of these models visually: The expected numbers of people making 0, 1, ..., 7+ repeat purchases in the 39-week model calibration period from the two models are compared to the actual frequency distribution. The fits of the two models are very close. On the basis of the chi-square goodness-of-fit test, we note that the BG/NBD model provides a better fit to the data ($\chi^2_3 = 4.82$, $p = 0.19$) than the Pareto/NBD ($\chi^2_3 = 11.99$, $p = 0.007$).

Figure 2. Predicted Versus Actual Frequency of Repeat Transactions
[Chart: frequency (0 to 1,500) by number of repeat transactions (0, 1, ..., 7+) for Actual, BG/NBD, and Pareto/NBD.]

The performance of these models becomes more apparent when we consider how well the models track the actual number of (total) repeat transactions over time. During the 39-week calibration period, the tracking performance of the BG/NBD and Pareto/NBD models is practically identical. In the subsequent 39-week forecast period, both models track the actual (cumulative) sales trajectory, with the Pareto/NBD performing slightly better than the BG/NBD (underforecasting by 2% versus 4%), but both models demonstrate superb tracking/forecasting capabilities.

Our final, and perhaps most critical, examination of the relative performance of the two models focuses on the quality of the predictions of individual-level transactions in the forecast period (Weeks 40–78) conditional on the number of observed transactions in the model calibration period. For the BG/NBD model, these are computed using (10). For the Pareto/NBD, as noted earlier, the equivalent expression is represented by Equation (22) in SMC.

In Figure 3, we report these conditional expectations along with the average of the actual number of transactions that took place in the forecast period, broken down by the number of calibration-period repeat transactions. (For each $x$, we are averaging over customers with different values of $t_x$.) Both the BG/NBD and Pareto/NBD models provide excellent predictions of the expected number of transactions in the holdout period.

Figure 3. Conditional Expectations
[Chart: expected number of transactions in Weeks 40–78 (0 to 7) by number of transactions in Weeks 1–39 (0, 1, ..., 7+) for Actual, BG/NBD, and Pareto/NBD.]
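For readers who want to reproduce individual-level BG/NBD predictions of this kind, the sketch below evaluates the conditional expectation in (10) for a single customer. It is an illustration rather than the authors' implementation: the function names are hypothetical, the hypergeometric function is summed from its power series (valid because its argument $t/(\alpha+T+t)$ is below 1), and $a \neq 1$ is assumed.

```python
def hyp2f1_series(a, b, c, z, tol=1e-12, max_terms=100000):
    """2F1(a, b; c; z) from its power series; valid for |z| < 1."""
    if not abs(z) < 1:
        raise ValueError("series converges only for |z| < 1")
    term, total = 1.0, 1.0
    for j in range(max_terms):
        term *= (a + j) * (b + j) / ((c + j) * (j + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def bgnbd_conditional_expectation(t, x, t_x, T, r, alpha, a, b):
    """E[Y(t) | X = x, t_x, T, r, alpha, a, b] following equation (10)."""
    z = t / (alpha + T + t)
    f = hyp2f1_series(r + x, b + x, a + b + x - 1, z)
    numerator = (a + b + x - 1) / (a - 1) * (
        1 - ((alpha + T) / (alpha + T + t)) ** (r + x) * f)
    denominator = 1.0
    if x > 0:
        # extra term applies only to customers with at least one repeat purchase
        denominator += a / (b + x - 1) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return numerator / denominator


# 39-week-ahead expectations for two illustrative purchase histories,
# using the BG/NBD estimates reported in Table 2
params = dict(r=0.243, alpha=4.414, a=0.793, b=2.426)
no_repeats = bgnbd_conditional_expectation(39, 0, 0.0, 38.86, **params)
two_repeats = bgnbd_conditional_expectation(39, 2, 30.43, 38.86, **params)
```

A customer with no repeat purchases gets a small but nonzero expectation, while a customer with two repeat purchases, the last one fairly recent, gets a noticeably larger one.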
It appears that the Pareto/NBD offers slightly better predictions than the BG/NBD, but it is important to keep in mind that the groups towards the right of the figure (i.e., buyers with larger values of $x$ in the calibration period) are extremely small. An important aspect that is hard to discern from the figure is the relative performance for the very large zero class (i.e., the 1,411 people who made no repeat purchases in the first 39 weeks). This group makes a total of 334 transactions in Weeks 40–78, which comprises 18% of all of the forecast period transactions. (This is second only to the 7+ group, which accounts for 22% of the forecast period transactions.) The BG/NBD conditional expectation for the zero class is 0.23, which is much closer to the actual average (334/1,411 = 0.24) than that predicted by the Pareto/NBD (0.14).

Nevertheless, these differences are not necessarily meaningful. Taken as a whole across the full set of 2,357 customers, the predictions for the BG/NBD and Pareto/NBD models are indistinguishable from each other and from the actual transaction numbers. This is confirmed by a three-group ANOVA ($F_{2,7068} = 2.65$), which is not significant at the usual 5% level.

The means reported in Figure 3 mask the variability in the individual-level numbers. Consider, for example, the 100 customers who made three repeat transactions in the calibration period. In the course of the 39-week forecast period, this group of customers made anywhere between 0 and 10 repeat transactions, with an average of 1.56 transactions. The individual-level BG/NBD conditional expectations vary from 0.04 to 2.57, with an average of 1.52; the Pareto/NBD conditional expectations vary from 0.09 to 2.84, with an average of 1.71.

Table 3 reports the correlations between the actual number of repeat transactions and the BG/NBD and Pareto/NBD conditional expectations, computed across all 2,357 customers.

Table 3. Correlations Between Forecast Period Transaction Numbers
              Actual    BG/NBD    Pareto/NBD
Actual         1.000
BG/NBD         0.626     1.000
Pareto/NBD     0.630     0.996     1.000

We observe that the correlation between the actual number of forecast period transactions and the associated BG/NBD conditional expectations is 0.626. Is this high or low? To the best of our knowledge, no other researchers have reported such measures of individual-level predictive performance, which makes it difficult for us to assess whether this correlation is good or bad. (We hope that future research will shed light on this issue.) Given the objectives of this research, it is of greater interest to compare the BG/NBD predictions with those of the Pareto/NBD model. The differences are negligible: The correlation between these two sets of numbers is an impressive 0.996.

This analysis demonstrates the high degree of validity of both models, particularly for the purposes of forecasting a customer's future purchasing, conditional on his past buying behavior. Furthermore, it demonstrates that the performance of the BG/NBD model mirrors that of the Pareto/NBD model.

8. Discussion

Many researchers have praised the Pareto/NBD model for its sensible behavioral story, its excellent empirical performance, and the useful managerial diagnostics that arise quite naturally from its formulation. We fully agree with these positive assessments and have no misgivings about the model whatsoever, besides its computational complexity. It is simply our intention to make this type of modeling framework more broadly accessible so that many researchers and practitioners can benefit from the original ideas of SMC.

The BG/NBD model arises by making a small, relatively inconsequential, change to the Pareto/NBD assumptions.
The transition from an exponential distribution to a geometric process (to capture customer dropout) does not require any different psychological theories, nor does it have any noteworthy managerial implications. When we evaluate the two models on their primary outcomes (i.e., their ability to fit and predict repeat transaction behavior), they are effectively indistinguishable from each other.

As Albers (2000) notes, the use of marketing models in actual practice is becoming less of an exception, and more of a rule, because of spreadsheet software. It is our hope that the ease with which the BG/NBD model can be implemented in a familiar modeling environment will encourage more firms to take better advantage of the information already contained in their customer transaction databases. Furthermore, as key personnel become comfortable with this type of model, we can expect to see growing demand for more complete (and complex) models and more willingness to commit resources to them.

Beyond the purely technical aspects involved in deriving the BG/NBD model and comparing it to the Pareto/NBD, we have attempted to highlight some important managerial aspects associated with this kind of modeling exercise. For instance, to the best of our knowledge, this is only the second empirical validation of the Pareto/NBD model, the first being Schmittlein and Peterson (1994). (Other researchers, e.g., Reinartz and Kumar 2000, 2003; Wu and Chen 2000, have employed the model extensively, but do not report on its performance in a holdout period.) We find that both models yield very accurate forecasts of future purchasing, both at the aggregate level as well as at the level of the individual (conditional on past purchasing). Besides using these empirical tests as a basis to compare models, we also want to call more attention to these analyses, with particular emphasis on conditional expectations, as the proper yardsticks that all researchers should use when judging the absolute performance of other forecasting models for CLV-related applications.
It is important for a model to be able to accurately project the future purchasing behavior of a broad range of past customers, and its performance for the zero class is especially critical, given the typical size of that silent group.

In using this model, there are several implementation issues to consider. First, the model should be applied separately to customer cohorts defined by the time (e.g., quarter) of acquisition, acquisition channel, etc. (Blattberg et al. 2001). (For a very mature customer base, the model could be applied to coarse RFM-based segments.) Second, if we are using one cohort's parameters as the basis for, say, another cohort's conditional expectation calculations, we must be confident that the two cohorts are comparable. Third, we must acknowledge an implicit assumption when using the forecasts generated using a model such as that developed in the paper: We are assuming that future marketing activities targeted at the group of customers will basically be the same as those observed in the past. (Of course, such models can be used to provide a baseline against which we can examine the impact of changes in marketing activity.) Finally, as with the Pareto/NBD, the BG/NBD must be augmented by a model of purchase amount before it can be used as the basis for CLV calculations. Two candidate models are the normal-normal mixture (Schmittlein and Peterson 1994) and the gamma-gamma mixture (Colombo and Jiang 1999). A natural starting point for any such extension would be to assume that purchase amount is independent of purchase timing (Schmittlein and Peterson 1994).

The BG/NBD easily lends itself to relevant generalizations, such as the inclusion of demographics or measures of marketing activity. (In fact, some potential end users of models such as the BG/NBD and the Pareto/NBD may view the inclusion of such variables as a necessary condition for implementation.) However, great care must be exercised when undertaking such extensions: To the extent that customer segments have been formed on the basis of past behavior (e.g., using the RFM framework) and these segments have been targeted with different marketing activities (Elsner et al. 2004), we must be aware of econometric issues such as endogeneity bias (Shugan 2004) and sample selection bias. If such extensions are undertaken, the BG/NBD in its basic form would still serve as an appropriate (and hard-to-beat) benchmark model and should be viewed as the right starting point for any customer-base analysis exercise in a noncontractual setting (i.e., where the opportunities for transactions are continuous and the time at which customers become inactive is unobserved).

Acknowledgments

The first author acknowledges the support of the Wharton-SMU (Singapore Management University) Research Center. The second author acknowledges the support of ESRC Grant R000223742 and the London Business School Centre for Marketing.

Appendix

In this appendix, we derive the expressions for $E(X(t))$ and $E(Y(t) \mid X = x, t_x, T)$.
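Central to these derivations is Euler's integral for the Gaussian hypergeometric function:

$${}_2F_1(a, b; c; z) = \frac{1}{B(b, c-b)} \int_0^1 t^{b-1}(1-t)^{c-b-1}(1-zt)^{-a}\,dt, \qquad c > b.$$

Derivation of $E(X(t))$

To arrive at an expression for $E(X(t))$ for a randomly chosen customer, we need to take the expectation of (5) over the distribution of $\lambda$ and $p$. First we take the expectation with respect to $\lambda$, giving us

$$E(X(t) \mid r, \alpha, p) = \frac{1}{p} - \frac{\alpha^r}{p(\alpha + pt)^r}.$$

The next step is to take the expectation of this over the distribution of $p$. We first evaluate

$$\int_0^1 \frac{1}{p}\,\frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}\,dp = \frac{a+b-1}{a-1}.$$

Next, we evaluate

$$\int_0^1 \frac{\alpha^r}{p(\alpha+pt)^r}\,\frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}\,dp = \frac{\alpha^r}{B(a,b)} \int_0^1 p^{a-2}(1-p)^{b-1}(\alpha+pt)^{-r}\,dp;$$

letting $q = 1-p$ (which implies $dp = -dq$),

$$= \left(\frac{\alpha}{\alpha+t}\right)^r \frac{1}{B(a,b)} \int_0^1 q^{b-1}(1-q)^{a-2}\left(1 - \frac{t}{\alpha+t}\,q\right)^{-r} dq,$$

which, recalling Euler's integral for the Gaussian hypergeometric function,

$$= \left(\frac{\alpha}{\alpha+t}\right)^r \frac{B(a-1,b)}{B(a,b)}\;{}_2F_1\!\left(r, b; a+b-1; \frac{t}{\alpha+t}\right).$$

It follows that

$$E(X(t) \mid r, \alpha, a, b) = \frac{a+b-1}{a-1}\left[1 - \left(\frac{\alpha}{\alpha+t}\right)^r {}_2F_1\!\left(r, b; a+b-1; \frac{t}{\alpha+t}\right)\right].$$

Derivation of $E(Y(t) \mid X = x, t_x, T)$

Let the random variable $Y(t)$ denote the number of purchases made in the period $(T, T+t]$. We are interested in computing the conditional expectation $E(Y(t) \mid X = x, t_x, T)$, the expected number of purchases in the period $(T, T+t]$ for a customer with purchase history $X = x, t_x, T$. If the customer is active at $T$, it follows from (5) that

$$E(Y(t) \mid \lambda, p) = \frac{1}{p} - \frac{1}{p}\,e^{-\lambda p t}. \tag{A1}$$

What is the probability that a customer is active at $T$? Given our assumption that all customers are active at the beginning of the initial observation period, a customer cannot drop out before he has made any transactions; therefore,

$$P(\text{active at } T \mid X = 0, T, \lambda, p) = 1.$$

For the case where purchases were made in the period $(0, T]$, the probability that a customer with purchase history $X = x, t_x, T$ is still active at $T$, conditional on $\lambda$ and $p$, is simply the probability that he did not drop out at $t_x$ and made no purchase in $(t_x, T]$, divided by the probability of making no purchases in this same period. Recalling that this second probability is simply the probability that the customer became inactive at $t_x$, plus the probability he remained active but made no purchases in this interval, we have

$$P(\text{active at } T \mid X = x, t_x, T, \lambda, p) = \frac{(1-p)\,e^{-\lambda(T-t_x)}}{p + (1-p)\,e^{-\lambda(T-t_x)}}.$$

Multiplying this by $(1-p)^{x-1}\lambda^x e^{-\lambda t_x} \big/ (1-p)^{x-1}\lambda^x e^{-\lambda t_x}$ gives us

$$P(\text{active at } T \mid X = x, t_x, T, \lambda, p) = \frac{(1-p)^x \lambda^x e^{-\lambda T}}{L(\lambda, p \mid X = x, t_x, T)}, \tag{A2}$$

where the expression for $L(\lambda, p \mid X = x, t_x, T)$ is given in (3). (Note that when $x = 0$, the expression given in (A2) equals 1.)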
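The two closed forms obtained so far, $E(X(t) \mid r, \alpha, a, b)$ and the active-customer probability behind (A2), can be evaluated with a few lines of Python. What follows is a rough sketch under stated assumptions, not the authors' code: the function names are hypothetical, ${}_2F_1$ is summed from its power series (which converges because the argument $t/(\alpha+t)$ is below 1), `prob_active` uses the form of (A2) before it is multiplied through, and the values of $\lambda$ and $p$ passed to it are purely illustrative because both are unobserved in practice.

```python
import math


def hyp2f1(a, b, c, z, tol=1e-12, max_terms=100_000):
    """2F1(a, b; c; z) summed from its power series; assumes |z| < 1."""
    term = total = 1.0
    for j in range(max_terms):
        term *= (a + j) * (b + j) / ((c + j) * (j + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def expected_purchases(t, r, alpha, a, b):
    """E[X(t) | r, alpha, a, b] for a randomly chosen customer (needs a != 1)."""
    z = t / (alpha + t)
    return (a + b - 1) / (a - 1) * (
        1 - (alpha / (alpha + t)) ** r * hyp2f1(r, b, a + b - 1, z))


def prob_active(lam, p, x, t_x, T):
    """P(active at T | X = x, t_x, T, lambda, p), the probability in (A2)."""
    if x == 0:
        return 1.0  # no transactions yet, so no opportunity to drop out
    stay = (1 - p) * math.exp(-lam * (T - t_x))
    return stay / (p + stay)


# Rounded BG/NBD estimates reported in the paper; lambda and p are made up.
r, alpha, a, b = 0.243, 4.414, 0.793, 2.426
for weeks in (13.0, 39.0, 78.0):
    print(weeks, round(expected_purchases(weeks, r, alpha, a, b), 3))
print(prob_active(lam=0.1, p=0.2, x=3, t_x=30.0, T=38.86))
```

The power series slows down as $t/(\alpha+t)$ approaches 1, so for long horizons a library implementation of the hypergeometric function would be the safer choice.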

Multiplying (A1) and (A2) yields

$$E(Y(t) \mid X = x, t_x, T, \lambda, p) = \frac{(1-p)^x \lambda^x e^{-\lambda T}\left(1/p - (1/p)\,e^{-\lambda p t}\right)}{L(\lambda, p \mid X = x, t_x, T)} = \frac{p^{-1}(1-p)^x \lambda^x e^{-\lambda T} - p^{-1}(1-p)^x \lambda^x e^{-\lambda(T+pt)}}{L(\lambda, p \mid X = x, t_x, T)}. \tag{A3}$$

(Note that this reduces to (A1) when $x = 0$, which follows from the result that a customer who made zero purchases in the time period $(0, T]$ must be assumed to be active at time $T$.)

As the transaction rate $\lambda$ and dropout probability $p$ are unobserved, we compute $E(Y(t) \mid X = x, t_x, T)$ for a randomly chosen customer by taking the expectation of (A3) over the distribution of $\lambda$ and $p$, updated to take into account the information $X = x, t_x, T$:

$$E(Y(t) \mid X = x, t_x, T, r, \alpha, a, b) = \int_0^1\!\!\int_0^\infty E(Y(t) \mid X = x, t_x, T, \lambda, p)\, f(\lambda, p \mid r, \alpha, a, b, X = x, t_x, T)\,d\lambda\,dp. \tag{A4}$$

By Bayes' theorem, the joint posterior distribution of $\lambda$ and $p$ is given by

$$f(\lambda, p \mid r, \alpha, a, b, X = x, t_x, T) = \frac{L(\lambda, p \mid X = x, t_x, T)\, f(\lambda \mid r, \alpha)\, f(p \mid a, b)}{L(r, \alpha, a, b \mid X = x, t_x, T)}. \tag{A5}$$

Substituting (A3) and (A5) in (A4), we get

$$E(Y(t) \mid X = x, t_x, T, r, \alpha, a, b) = \frac{\mathsf{A} - \mathsf{B}}{L(r, \alpha, a, b \mid X = x, t_x, T)}, \tag{A6}$$

where

$$\mathsf{A} = \int_0^1\!\!\int_0^\infty p^{-1}(1-p)^x \lambda^x e^{-\lambda T} f(\lambda \mid r, \alpha)\, f(p \mid a, b)\,d\lambda\,dp = \frac{B(a-1, b+x)}{B(a,b)}\,\frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)\,(\alpha+T)^{r+x}} \tag{A7}$$

and

$$\mathsf{B} = \int_0^1\!\!\int_0^\infty p^{-1}(1-p)^x \lambda^x e^{-\lambda(T+pt)} f(\lambda \mid r, \alpha)\, f(p \mid a, b)\,d\lambda\,dp$$

$$= \int_0^1 \left\{\frac{p^{a-2}(1-p)^{b+x-1}}{B(a,b)}\right\}\left\{\int_0^\infty \frac{\alpha^r \lambda^{r+x-1} e^{-\lambda(\alpha+T+pt)}}{\Gamma(r)}\,d\lambda\right\} dp$$

$$= \frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)\,B(a,b)} \int_0^1 p^{a-2}(1-p)^{b+x-1}(\alpha+T+pt)^{-(r+x)}\,dp;$$

letting $q = 1-p$ (which implies $dp = -dq$),

$$= \frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)\,B(a,b)\,(\alpha+T+t)^{r+x}} \int_0^1 q^{b+x-1}(1-q)^{a-2}\left(1 - \frac{t}{\alpha+T+t}\,q\right)^{-(r+x)} dq,$$

which, recalling Euler's integral for the Gaussian hypergeometric function,

$$= \frac{B(a-1, b+x)}{B(a,b)}\,\frac{\Gamma(r+x)\,\alpha^r}{\Gamma(r)\,(\alpha+T+t)^{r+x}}\;{}_2F_1\!\left(r+x, b+x; a+b+x-1; \frac{t}{\alpha+T+t}\right). \tag{A8}$$

Substituting (6), (A7), and (A8) in (A6) and simplifying, we get

$$E(Y(t) \mid X = x, t_x, T, r, \alpha, a, b) = \frac{\dfrac{a+b+x-1}{a-1}\left[1 - \left(\dfrac{\alpha+T}{\alpha+T+t}\right)^{r+x} {}_2F_1\!\left(r+x, b+x; a+b+x-1; \dfrac{t}{\alpha+T+t}\right)\right]}{1 + \delta_{x>0}\,\dfrac{a}{b+x-1}\left(\dfrac{\alpha+T}{\alpha+t_x}\right)^{r+x}}.$$

References

Albers, Sönke. 2000. Impact of types of functional relationships, decisions, and solutions on the applicability of marketing models. Internat. J. Res. Marketing 17(2–3) 169–175.
Balasubramanian, S., S. Gupta, W. Kamakura, M. Wedel. 1998. Modeling large datasets in marketing. Statistica Neerlandica 52(3) 303–323.
Blattberg, Robert C., Gary Getz, Jacquelyn S. Thomas. 2001. Customer Equity. Harvard Business School Press, Boston, MA.
Colombo, Richard, Weina Jiang. 1999. A stochastic RFM model. J. Interactive Marketing 13(Summer) 2–12.
Elsner, Ralf, Manfred Krafft, Arnd Huchzermeier. 2004. Optimizing Rhenania's direct marketing business through dynamic multilevel modeling (DMLM) in a multicatalog-brand environment. Marketing Sci. 23(2) 192–206.
Fader, Peter S., Bruce G. S. Hardie. 2001. Forecasting repeat sales at CDNOW: A case study. Part 2 of 2. Interfaces 31(May–June) S94–S107.
Jain, Dipak, Siddhartha S. Singh. 2002. Customer lifetime value research in marketing: A review and future directions. J. Interactive Marketing 16(Spring) 34–46.
Lozier, D. W., F. W. J. Olver. 1995. Numerical evaluation of special functions. Walter Gautschi, ed. Mathematics of Computation 1943–1993: A Half-Century of Computational Mathematics. Proc. Sympos. Appl. Math. American Mathematical Society, Providence, RI.
Mulhern, Francis J. 1999. Customer profitability analysis: Measurement, concentration, and research directions. J. Interactive Marketing 13(Winter) 25–40.
Niraj, Rakesh, Mahendra Gupta, Chakravarthi Narasimhan. 2001. Customer profitability in a supply chain. J. Marketing 65(July) 1–16.
Reinartz, Werner, V. Kumar. 2000. On the profitability of long-life customers in a noncontractual setting: An empirical investigation and implications for marketing. J. Marketing 64(October) 17–35.
Reinartz, Werner, V. Kumar. 2003. The impact of customer relationship characteristics on profitable lifetime duration. J. Marketing 67(January) 77–99.
Schmittlein, David C., Robert A. Peterson. 1994. Customer base analysis: An industrial purchase process application. Marketing Sci. 13(Winter) 41–67.
Schmittlein, David C., Donald G. Morrison, Richard Colombo. 1987. Counting your customers: Who are they and what will they do next? Management Sci. 33(January) 1–24.
Shugan, Steven M. 2004. Endogeneity in marketing decision models. Marketing Sci. 23(1) 1–3.
Wu, Couchen, Hsiu-Li Chen. 2000. Counting your customers: Compounding customer's in-store decisions, interpurchase time, and repurchasing behavior. Eur. J. Oper. Res. 127(1) 109–119.