MARKET SEGMENTATION, CUSTOMER LIFETIME VALUE, AND CUSTOMER ATTRITION IN HEALTH INSURANCE: A SINGLE ENDEAVOR THROUGH DATA MINING

Illya Mowerman
WellPoint, Inc.
370 Bassett Road
North Haven, CT 06473
(203) 654-3188
Illya.mowerman@wellpoint.com

ABSTRACT

In today's business world, with finite financial resources and ever more competition, firms must focus their efforts on both acquiring new customers and retaining the existing ones that produce the highest profits. Therefore, profits as well as acquisition and retention costs must be viewed on a long-term basis, that is, over the life of the customer with the firm. This paper proposes a methodology for calculating the lifetime value of a customer in the health care insurance industry. With the aid of survival analysis, the methodology also renders a customer segmentation based on customers' expected life with the company and their profitability.

INTRODUCTION

Theophrastus, the immediate successor of Aristotle, was among other things a student of character. His pioneering character sketches captured the profound insight that an individual's behavior across seemingly unrelated domains is often highly correlated [4]. It is not uncommon for a person to ask "what kind of a person is he?" in an attempt to predict a future behavior or reaction. Likewise, in many businesses marketing campaigns are built on studies that capture measures of loyalty, specifically the loyalty of profitable customers.

It is by now well documented that individuals exhibit consistent behavioral tendencies across a range of contexts. The associations are wide ranging and sometimes surprising. Recently, psychologists have found strong linkages between personality measures and how a person walks, how often they smile, what kind of music they like, and how they dress [4]. For businesses, the interest lies in being able to identify groups that are loyal to their brand.
In the health care insurance industry there is a fixation on targeting healthy customers rather than loyal ones. "Healthy" refers not only to the customer's current state of health but also to their future health. The ability to predict future health costs has been well studied and is taken into account within the underwriting department. The customer is then given a premium commensurate with their risk, and in some cases denied coverage. Predicting which customers will incur very high costs, usually due to rare illnesses or physical accidents, is for the most part not possible without genetic testing or a crystal ball, neither of which is an option for now. Therefore, the fixation on targeting customers that will never become "train wrecks" is an exercise in futility. Furthermore, given the law of large numbers, if an insurance product is priced correctly, commensurate with the benefits offered, and has enough customers (referred to as members in the health insurance industry), its overall profitability will be positive.

In the health insurance industry, target marketing and customer retention campaigns should be based on loyalty and profitability, as well as other dimensions. The need then arises to group existing and potential members into like groups in order to develop more responsive campaigns. Traditionally, when customer segmentation is mentioned one immediately thinks of cluster analysis, which groups observations with similar traits. The limitation of this approach in the health insurance industry is that there are time-dependent covariates that affect loyalty, and these cannot easily be included in a cluster analysis except through a myriad of indicator variables. After creating a segmentation, the market researcher may then formulate a second analysis with a binary outcome, churn versus no churn, in order to predict who is most likely to churn within pairs of segments found in the cluster analysis. At least two segments must be modeled together because members within a single segment have very similar durations. The difficulty then becomes choosing which pairs of segments to model together to produce meaningful contrasts.

Conversely, the market researcher may decide to model churn without taking a segmentation into account, using all the observations, but then she is once again left without the time-dependent variables. For this analysis logistic regression, neural networks, and other algorithms can be employed.

The health insurance industry has special considerations when compared to other industries. In this industry, as in other insurance industries, premiums are paid monthly, while customer benefits are for the most part irregular and sparse. Other types of insurance, such as automotive and home insurance, are mandatory, required either by the State or by the bank that issued the loan for the asset, while health insurance in most states is voluntary. When members pay their premium they receive peace of mind in return. Only when they receive a health care service or product do members actually receive a tangible benefit from their premium. This difference between the value proposition of health insurance companies and that of other businesses creates the need for special considerations: there are dimensions and metrics that are specific to the industry.
This paper presents a framework for creating a customer market segmentation and calculating lifetime value in the health insurance industry. The literature review, from which the methodology was derived, comes next. The proposed methodology, with a detailed account of the metrics required, follows. The paper closes with a discussion.

LITERATURE REVIEW

Calculating lifetime value and creating a segmentation based on profitability and duration can be viewed as both a dichotomous outcome problem, churn versus no churn, and a time series problem. Survival analysis addresses both, as it models the time to an event. Nonetheless, traditional survival analysis models are not well suited to this research. Conventional survival models were developed for small data sets from designed experiments where the purpose of the analysis is to guide scientifically sound conclusions. These methods are often awkward for large databases where the purpose is to guide profitable business decisions [6].

The algorithm to be used is the discrete-time logistic hazard model, which was first introduced by C. Brown in 1975. Hazard models based on logistic regression are well suited to the challenging features of survival data mining problems, such as discrete time, dependent competing risks, truncated data, time-dependent covariates, time-varying effects of the covariates, and irregular non-linear hazards [6]. Traditional survival algorithms cannot deal with all of these conditions together. Hazard models based on logistic regression originated in the field of biostatistics [3], but have rarely been used in medical applications. They are better established in the social sciences [1], [2].

METHODOLOGY

This is a data mining endeavor, and the established data mining steps are applied: define the research question, prepare and explore the data, apply data mining algorithms, interpret and analyze the results, and disseminate knowledge [5].
The research question is: who are our most loyal and our least loyal customers, and how do they break out by gross margin? The data used are transactional data on claims and premiums and demographic data from the health insurance company, together with psychographic and financial data from a vendor of this type of data. What follows is the methodology for preparing the data in order to build the model that will answer the research question. The considerations required to apply the data mining algorithm successfully determine how the data must be prepared. The contribution of this paper lies in the preparation of the time-dependent covariates and in the introduction and proposed use of regression splines.
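For orientation, the discrete-time logistic hazard model named in the literature review is commonly written as below. The notation is an assumption of this sketch rather than the paper's own: T_i is the month in which policy i lapses, x_i(t) holds its covariates in month t, alpha(t) is a baseline function of time (the role the regression splines play in this methodology), and S_i(t) is the probability that the policy is still in force after month t.

```latex
h_i(t) = \Pr\left(T_i = t \mid T_i \ge t,\ \mathbf{x}_i(t)\right), \qquad
\log\frac{h_i(t)}{1 - h_i(t)} = \alpha(t) + \boldsymbol{\beta}^{\top}\mathbf{x}_i(t), \qquad
S_i(t) = \prod_{s=1}^{t}\bigl(1 - h_i(s)\bigr)
```

Under this formulation each policy-month contributes one binary observation (lapsed or not), which is why ordinary logistic regression software can be used to fit the model.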
The algorithm requires multiple observations per subject, in this case the insurance policy subscriber, one for each discrete time interval. For this analysis the time interval will be one month, because the industry functions, in many ways, on a monthly basis. Premiums, although paid at different intervals at the choice of the subscriber, are calculated monthly, along with other claims-related metrics, which are calculated per member per month (PMPM).

The question now is which statistic to use for each time-dependent covariate. For the age of the subscriber the choice is evident: the actual age at each time interval. For other covariates the appropriate statistic is not so easily discerned. Gross margin, for instance, should be a cumulative statistic reported on a PMPM basis, which normalizes for multiple members in a policy, eliminates the confounding effect of time, and allows for a broader understanding of the policy's financial health.

Apart from changes of product or of the member count within the policy, premiums change on a 12-month anniversary cycle. Exploratory data analysis has shown that the impact of a rate change lasts for three months after the change. Therefore, a field holding the nominal dollar value of the rate change will keep that value for three time intervals, starting from the first month the new rate is in effect. This will allow the model to capture churn, also referred to as lapses in the health insurance industry, due to rate increases. Another field, similar in nature, could be calculated as the percentage of the rate change. Nonetheless, these two variables are clearly confounded with each other, and one of them should ultimately be taken out.

Product changes signal a change in the perceived need of the subscriber. A downgrade in product may signal either a realization by the subscriber that his product is too rich in benefits, or perhaps a downturn in his income. Conversely, an upgrade in product may signal either a perceived future need for more benefits, or an increase in the subscriber's income. In either case, a change in product signals that the subscriber is engaging with the firm to modify the contract, with the ultimate goal of improving the value he perceives from the health insurance company. Therefore, a cumulative variable is to be created that counts the number of product changes. It will also be useful to create an additional variable that indicates whether the change was an upgrade or a downgrade.

Policies with no claims have been found to be more likely to lapse than policies with at least one claim. This finding was encountered in the exploratory data analysis. The variable that indicates whether a policy has had no claims is cumulative in the sense that it is set to true until the month in which the first claim is encountered, and to false from then on.
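The covariate preparation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the input layout, every field name, and the function name are hypothetical; months within a policy are assumed to be consecutive; the rate-change field is assumed to be zero outside the three-month window; and cumulative gross margin PMPM is read here as cumulative premium minus claims divided by cumulative member months.

```python
from itertools import groupby
from operator import itemgetter


def add_time_dependent_covariates(rows, hold_months=3):
    """Return policy-month rows with the time-dependent covariates added.

    `rows` is a list of dicts, one per policy per month, using the
    hypothetical keys: policy_id, month, members (at least 1), premium,
    claims_paid, claim_count, rate_change_usd (non-zero only in the month
    a new rate takes effect), product_changed (0 or 1), lapsed (0 or 1).
    """
    out = []
    ordered = sorted(rows, key=itemgetter("policy_id", "month"))
    for _, policy_rows in groupby(ordered, key=itemgetter("policy_id")):
        cum_margin = 0.0    # running premium minus claims
        cum_members = 0     # running member months
        n_changes = 0       # running count of product changes
        seen_claim = False  # turns True in the month of the first claim
        held_value = 0.0    # rate change currently carried forward
        months_left = 0     # months the carried value still applies
        for row in policy_rows:
            cum_margin += row["premium"] - row["claims_paid"]
            cum_members += row["members"]
            n_changes += row["product_changed"]
            seen_claim = seen_claim or row["claim_count"] > 0
            if row["rate_change_usd"] != 0:
                # A new rate takes effect: hold its dollar value for
                # `hold_months` intervals, starting with this month.
                held_value = row["rate_change_usd"]
                months_left = hold_months
            rate_change_held = held_value if months_left > 0 else 0.0
            months_left = max(months_left - 1, 0)
            out.append({
                **row,
                "cum_margin_pmpm": cum_margin / cum_members,
                "rate_change_held": rate_change_held,
                "n_product_changes": n_changes,
                "no_claims": not seen_claim,
            })
    return out


# Made-up policy: a 25 dollar rate increase in month 2, a first claim and a
# product change in month 3, and a lapse in month 5.
example = [
    {"policy_id": "A", "month": m, "members": 2, "premium": prem,
     "claims_paid": paid, "claim_count": n, "rate_change_usd": rc,
     "product_changed": pc, "lapsed": lapsed}
    for m, prem, paid, n, rc, pc, lapsed in [
        (1, 400.0, 0.0, 0, 0.0, 0, 0),
        (2, 425.0, 0.0, 0, 25.0, 0, 0),
        (3, 425.0, 300.0, 1, 0.0, 1, 0),
        (4, 425.0, 0.0, 0, 0.0, 0, 0),
        (5, 425.0, 0.0, 0, 0.0, 0, 1),
    ]
]
person_months = add_time_dependent_covariates(example)
```

In this made-up example the held rate-change field is non-zero in months 2 through 4 only, and the no-claims indicator turns false in month 3 and stays false. The resulting rows, with `lapsed` as the binary target, are in the one-row-per-policy-per-month layout that the algorithm requires.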
Regression splines are segmented functions composed of polynomials. The joins between the segments are called knots. A regression spline suitable for hazard functions is composed of several cubic segments and a linear end segment, joined smoothly to each other. The function can be parameterized as a linear combination of time and a set of cubic spline basis functions. Several cubic spline basis functions are introduced into the model as covariates, with knots at equally spaced time intervals. For example, a spline may be inserted into the model at every third month. When the algorithm is run, the selection method will determine which splines are significant.

The splines that are found significant are then used to segment the population. For example, if the splines at months six, fifteen, and twenty four were found significant, we would interpret these results as four macro segments: policies that last up to six months, policies that lapse between seven and fifteen months, policies that lapse between sixteen and twenty four months, and policies that last twenty five months or more. We cannot conclude that policies lasting twenty five months or more do not lapse, because the data are right-censored.

With the model built and the macro segments defined, the lifetime value of the policy can be calculated. The calculation of lifetime value, which is well documented, is the net present value of the future returns.

DISCUSSION

In this paper a single data mining endeavor is proposed to satisfy the goal of segmenting and profiling the customer base of a health insurance company, with the aim of understanding churn behavior and, ultimately, long-term profitability. A novel approach to segmentation is presented, based on the interpretation of the splines that are found significant. Last, a discussion of covariates and their proper statistics, specific to the health insurance industry, was presented. The methodology proposed here is tailored specifically to the health insurance industry.
However, this does not restrict the proposed methodology to that industry. Specifically, the use of splines to derive segments based on churn behavior is applicable to many industries unrelated to health insurance and to insurance in general.

REFERENCES

[1] Allison, P. D. Discrete-Time Methods for the Analysis of Event Histories. Sociological Methodology, 1982, Jossey-Bass.
[2] Allison, P. D. Survival Analysis Using the SAS System. SAS Institute, Inc., 1995.
[3] Brown, C. C. On the Use of Indicator Variables for Studying the Time-Dependence of Parameters in a Response-Time Model. Biometrics, 1975, Vol. 31, 862-872.
[4] Gosling, S. Snoop: What Your Stuff Says about You. New York: Basic Books, 2008.
[5] Hand, D. J., Mannila, H., Smyth, P. Principles of Data Mining. The MIT Press, 2001.
[6] Potts, W. Survival Data Mining: Predictive Hazard Modeling for Customer History Data. SAS Institute, Inc., 2004.