Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA
ABSTRACT

An online insurance agency has built a base of names that responded to different offers from various carriers; some of these individuals have purchased one or more insurance products. The goal of this project is to analyze response and sales data for a single carrier in order to understand the relationships among products, offers, and payments, and to deploy predictive models and segmentation schemes that target the best prospects for acquisition or cross-sell through a multi-channel communication strategy. Individual information was overlaid with demographic and psychographic data. A logistic regression model was developed to score individuals on their propensity to convert (i.e., pay for the issued policy). These records were then segmented using a clustering technique, producing distinct, descriptive groups that can be used to tailor marketing language. A cross-sell matrix was also developed to identify other products that the same carrier may offer to existing customers.

INTRODUCTION

A single carrier's (referred to here as "carrier") Accidental Death and Dismemberment (ADD) product is sold through two different online portals (URLs). Each portal is distinct, even though the product is identical. The sites differ aesthetically; more significantly, the offers vary. Portal A features a deviated premium offer ($1 for the first month of coverage), while Portal B features a bonus offer (a free $10,000 policy for a limited time).

The carrier is having a problem with nonpayment of premium. The portals are reaching their traffic and signup goals, but the carrier is disappointed with the proportion of individuals who make their first payment (i.e., conversion). The goal of this project is to identify those most likely to convert and compare them to those least likely to convert.
By utilizing this information, the carrier may adjust its creative and placement strategy as well as its offerings.

APPROACH

After data hygiene, application and purchase files were matched to yield a single master file of 16,456 records. Individuals who were denied coverage were removed from the analysis. A total of 155 demographic and psychographic variables were matched to the file, which was then loaded into SAS for analysis. The seven most predictive variables that may be gathered during the online application process were used.

Since every record represents an individual who visited a site and requested a policy, a logistic regression model was built to predict which records are most likely to pay for the policy after issue. A series of models was built with a variety of variables, and the best performing one was selected. The most predictive fields were: face value of the policy, payment method, age, types of credit cards owned, household income, home ownership, and length of residence. Other variables did show some predictive power, but the model was reduced for parsimony.

MODEL CODE

The code below shows how the final logistic regression model was built. The various iterations of the model building process are omitted here.

    ODS HTML;
    ODS GRAPHICS ON;

    /* Fit the model on issued, non-free policies only */
    PROC LOGISTIC DATA = client.carrier_final;
      WHERE issue = 1 & free = "N";
      /* Nominal predictors: reference coding, first level as reference,
         missing values treated as a valid level */
      CLASS homeowner cc payment / PARAM = REF REF = FIRST MISSING;
      /* event = '1' models the probability that the issued policy was paid */
      MODEL paid(event = '1') = faceamount payment agecode cc medianhhincome lor*homeowner
            / RSQUARE IPLOTS CLPARM = BOTH LACKFIT CORRB COVB NODUMMYPRINT STB;
      /* Scored output dataset */
      OUTPUT OUT = client.carrierlogit1 PREDICTED = pred1 PREDPROBS = individual;
      GRAPHICS DFBETAS ROC ESTPROB;
    RUN;

    ODS GRAPHICS OFF;
    ODS HTML CLOSE;

    /* Print the scored records to an HTML file */
    ODS HTML file='c:\documents and Settings\msherlock\My Documents\My SAS Files\CLIENT\CLIENT_OUT\carrierlogit1.html';
    PROC PRINT DATA = client.carrierlogit1;
      VAR ID issue free paid faceamount payment agecode cc medianhhincome lor homeowner
          _FROM_ _INTO_ IP_0 IP_1 _LEVEL_ pred1;
    RUN;
    ODS HTML CLOSE;

This code builds the logistic regression model to predict paid = 1; that is, the issued policy was converted into a paid policy. By default, PROC LOGISTIC models the lowest ordered response level (in this case, 0), so the software must be told explicitly to model the 1. Nominal variables (homeowner, credit card type, and payment method) are listed in the CLASS statement so that SAS produces the dummy variables automatically. An interaction term (lor*homeowner) was used so that length of residence enters the model only for homeowners. The OUT= option of the OUTPUT statement produces a scored output dataset. An OUTMODEL= option was also included originally to score a hold-out dataset and confirm model validity. The variables used in this model were selected after multiple iterations of running the LOGISTIC procedure and choosing the best performing model.

MODELING CONCLUSIONS

Method of payment is by far the most predictive element of the model. Those paying by credit card are four times more likely to complete the transaction. Those paying by bank draft (a.k.a. electronic funds transfer, or EFT) are twice as likely. Those requesting that a bill be sent are 53% less likely to pay.

Although method of payment is the most predictive factor, it is not the only one. Moreover, the method of payment is not known until the applicant selects it. This model seeks to identify those most likely to pay before the payment option is selected, thereby identifying those who may need more incentive to pay.

Mining the data showed that 23% of those who requested a bill and subsequently did not pay it do possess a bank-issued credit card. Likewise, 26% of those who offered an EFT method of payment and did not pay their bill also have a bank-issued credit card.
Of those who select the credit card payment option, 78% complete the transaction. Nearly one out of every four invoice and EFT non-payers holds a credit card and so could have paid by card. If these applicants were guided into this payment method, the number of completed transactions would increase noticeably.

Over one third (34%) of those selecting EFT do not pay, and 26% of those non-payers have a credit card. Given that 78% of credit card users pay, encouraging EFT users toward credit card payment may convert roughly 7% of EFT applicants into paid transactions (34% * 26% * 78% = 7%).

Nearly three-quarters (71%) of invoices go unpaid, and 23% of these non-payers possess a credit card. Again, 78% of credit card users pay, so encouraging invoice requestors to use a credit card may convert roughly 13% of invoice applicants into paid transactions (71% * 23% * 78% = 13%).
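The two uplift estimates above are simple products of three rates, and can be checked with a few lines of Python (used here instead of SAS only so the arithmetic is easy to run anywhere). The function name `uplift` and the constant name are hypothetical; the sketch also carries over the paper's implicit assumption that redirected non-payers would pay at the same 78% rate as applicants who chose a credit card on their own.

```python
def uplift(share_not_paying, share_with_card, card_pay_rate):
    """Share of a payment-method group that might be converted to paid
    if its card-holding non-payers were steered to credit card payment."""
    return share_not_paying * share_with_card * card_pay_rate


CARD_PAY_RATE = 0.78  # credit card selectors who complete the transaction

# EFT: 34% do not pay, 26% of those non-payers hold a credit card
eft_gain = uplift(0.34, 0.26, CARD_PAY_RATE)

# Invoice: 71% do not pay, 23% of those non-payers hold a credit card
invoice_gain = uplift(0.71, 0.23, CARD_PAY_RATE)

print(f"EFT applicants converted:     {eft_gain:.1%}")
print(f"Invoice applicants converted: {invoice_gain:.1%}")
```

Both results are shares of the respective payment-method group, not of the whole file; translating them into an overall gain would also require the share of applicants choosing each method.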
MODEL OUTPUT

Most numeric entries of the output tables are missing in this copy; table titles, row labels, and the surviving p-values are retained.

Model Fit Statistics
    Criteria (Intercept Only versus Intercept and Covariates): AIC, SC, -2 Log L
    R-Square and Max-rescaled R-Square: [missing]

Testing Global Null Hypothesis: BETA=0
    Test                Pr > ChiSq
    Likelihood Ratio    <.0001
    Score               <.0001
    Wald                <.0001

Type 3 Analysis of Effects
    Effect              Pr > ChiSq
    faceamount          <.0001
    payment             <.0001
    agecode             <.0001
    cc                  <.0001
    medianhhincome      <.0001
    LOR*homeowner       [missing]

Analysis of Maximum Likelihood Estimates
    Parameter               Pr > ChiSq
    Intercept               <.0001
    faceamount              [missing]
    Payment (EFT) b         <.0001
    Payment (Credit) c      <.0001
    Payment (Invoice) i     <.0001
    agecode                 [missing]
    cc (seven levels)       <.0001 for two of the seven levels; others [missing]
    medianhhincome          [missing]
    LOR*homeowner (two)     [missing]
Odds Ratio Estimates
    Columns: Effect, Point Estimate, 95% Wald Confidence Limits
    Effects: faceamount; payment b, c, and i (each versus the reference level); agecode; cc 1 through cc 7 (each versus the reference level); medianhhincome
    Values: [missing]

Association of Predicted Probabilities and Observed Responses
    Percent Concordant    70.6
    Percent Discordant    28.8
    Percent Tied           0.6
    Somers' D, Gamma, Tau-a, Pairs, c: [missing]

The ROC curve plots sensitivity (the true positive rate) against 1 - specificity (the false positive rate). On this curve, the rapid climb on the left-hand side shows that the model is predicting policy conversion well. The estimated area under the curve (c) is approximately 0.71, which is the value implied by the concordance figures above (70.6% concordant plus half of the 0.6% tied). If it were 0.5, the curve would be a straight diagonal line, meaning the model would discriminate between payers and non-payers no better than chance.
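The rank-association statistics that PROC LOGISTIC reports alongside the concordance percentages are functions of those percentages, so the figures that survive (70.6% concordant, 28.8% discordant, 0.6% tied) are enough to recover c, Somers' D, and Gamma. The Python sketch below applies the standard definitions; the function name is hypothetical, and Tau-a is left out because it also needs the number of observations, which is not available here.

```python
def association_stats(pct_concordant, pct_discordant, pct_tied):
    """Recover c, Somers' D and Gamma from concordance percentages.

    c        = (concordant + 0.5 * tied) / all pairs
    Somers D = (concordant - discordant) / all pairs
    Gamma    = (concordant - discordant) / (concordant + discordant)
    """
    nc = pct_concordant / 100.0
    nd = pct_discordant / 100.0
    nt = pct_tied / 100.0
    return {
        "c": nc + 0.5 * nt,
        "somers_d": nc - nd,
        "gamma": (nc - nd) / (nc + nd),
    }


stats = association_stats(70.6, 28.8, 0.6)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The c statistic equals the area under the ROC curve, and Somers' D is related to it by D = 2c - 1.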
DECILE RESULTS

The file was deciled by applying the scoring algorithm to all records. The ranked file was then split into ten portions of equal size to gauge the lift realized by applying the model.

                 Payment Rate             Cumulative
                 % paid    % of file      % paid    % of file
    Decile 1     20%       10%            20%       10%
    Decile 2     18%       10%            38%       20%
    Decile 3     12%       10%            50%       30%
    Decile 4      9%       10%            59%       40%
    Decile 5      8%       10%            67%       50%
    Decile 6      7%       10%            75%       60%
    Decile 7      7%       10%            82%       70%
    Decile 8      7%       10%            88%       80%
    Decile 9      6%       10%            95%       90%
    Decile 10     5%       10%           100%      100%

The graph below further illustrates how the predicted payment changes as one goes deeper into the file by decile:

[Figure: Predicted Payment by decile; vertical axis 0% to 90%, horizontal axis Decile 1 through Decile 10]

The model output and analysis above are included to illustrate the validity of the model. Based on demographic information as well as payment type, one may predict the likelihood of payment. Using this information alone, the carrier has a general idea of the prospects that will most likely result in payment. To further hone the carrier's strategy, a segmentation analysis was produced to group prospects into those most likely to pay and those least likely to pay. This analysis follows.
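The decile table can be read as a gains table: each decile holds 10% of the file, so dividing its share of paid policies by 10% gives the lift over a random selection. The short Python sketch below does this for the published figures; the variable names are illustrative, and the reported cumulative column is used as-is because the per-decile percentages are rounded and do not sum exactly to it.

```python
# Share of all paid policies captured by each decile (from the table above)
pct_paid = [20, 18, 12, 9, 8, 7, 7, 7, 6, 5]
# Reported cumulative share of paid policies (from the table above)
cum_paid = [20, 38, 50, 59, 67, 75, 82, 88, 95, 100]
PCT_FILE_PER_DECILE = 10  # each decile is 10% of the file

# Lift of each decile relative to a random 10% of the file
decile_lift = [p / PCT_FILE_PER_DECILE for p in pct_paid]

# Cumulative lift when mailing the top 1, 2, ..., 10 deciles
cum_lift = [c / (PCT_FILE_PER_DECILE * (i + 1)) for i, c in enumerate(cum_paid)]

for i, (d, c) in enumerate(zip(decile_lift, cum_lift), start=1):
    print(f"Decile {i:2d}: lift {d:.2f}, cumulative lift {c:.2f}")
```

Read this way, the top decile pays at twice the file average, and the top three deciles capture half of all paid policies from 30% of the file.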
SEGMENTATION

After the predictive model was built, the scored records were grouped to identify key clusters of "good pay" and "bad pay" individuals. The process produced twelve distinct clusters in which records perform similarly to one another within a cluster and dissimilarly to records in other clusters. Interestingly, only 24% of the records in the final file were homeowners; however, three of the four best-pay clusters were 100% homeowners.

SEGMENTATION CODE

Segmentation is a multi-step process. A number of iterations were performed to produce the best fitting segments. The procedures that follow are the result of the best group of variables.

    /*Step 1 Approximate Covariance Estimation (ACE) for Clustering*/
    PROC ACECLUS DATA = client.carriersegment
                 OUT = client.carrierlogit1aceclus
                 OUTSTAT = client.carrierlogit1aceclusstat
                 PROPORTION = .03 PP;
      VAR pred1 medianhhincome agecode renter;
    RUN;

    /*Step 2 Clustering*/
    PROC FASTCLUS DATA = client.carrierlogit1aceclus
                  MAXCLUSTERS = 12 MAXITER = 10
                  OUT = client.carrierlogit1fastclus;
      VAR Can1 Can2 Can3 Can4;
    RUN;

    /*Step 4 Descriptives*/
    ODS HTML file='c:\documents and Settings\msherlock\My Documents\My SAS Files\CLIENT\CLIENT_OUT\Carrier_clusters1.html';
    PROC UNIVARIATE DATA = client.carrierlogit1fastclus PLOTS;
      VAR pred1 paid medianhhincome agecode renter bankcard retail;
      CLASS cluster (ORDER = INTERNAL);
    RUN;
    ODS HTML CLOSE;

    /*Additional Descriptives*/
    ODS HTML file='c:\documents and Settings\msherlock\My Documents\My SAS Files\CLIENT\CLIENT_OUT\Carrier_clusters2.html';
    PROC UNIVARIATE DATA = client.carrierlogit1fastclus PLOTS;
      VAR bankdraft creditcard invoice faceamount lor sqinsurl psurl;
      CLASS cluster (ORDER = INTERNAL);
    RUN;
    ODS HTML CLOSE;

The ACECLUS procedure produces a series of canonical variables based on the input variables. The resulting dataset is then run through the FASTCLUS procedure, which groups the records together based on those canonical variables.
Once this task is complete, the clusters are analyzed by the original input variables as well as by other variables to give a full view of what these records look like. The procedure output is omitted here; what follows is the analysis of the UNIVARIATE output by cluster, with each cluster compared to a similar cluster.
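For readers unfamiliar with the grouping step, PROC FASTCLUS belongs to the k-means family: records are assigned to the nearest cluster mean, the means are recomputed, and the cycle repeats up to MAXITER times. The Python sketch below shows that loop in its plainest form. It is not a reimplementation of FASTCLUS (which has its own seed-selection logic) or of ACECLUS; the function `kmeans`, the hand-picked starting centroids, and the four two-dimensional toy points are all hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=10):
    """Plain k-means: assign each point to its nearest centroid,
    recompute each centroid as the mean of its members, and stop
    when assignments no longer change or max_iter is reached."""
    centroids = [tuple(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no record changed cluster
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


# Toy records with two made-up canonical scores each
points = [(0.70, 1.0), (0.75, 1.2), (0.25, 1.1), (0.30, 0.9)]
labels, centers = kmeans(points, centroids=[points[0], points[2]])
print(labels)
print(centers)
```

In the paper's workflow the inputs would be the canonical variables Can1 through Can4 and the number of clusters would be twelve; the toy example uses two clusters only to keep the output readable.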
THE RENTERS

                             Cluster 2        Cluster 3        Overall
    Predicted Payment        71%              26%              40%
    Income                   $30-$39,999      $30-$39,999      $30-$39,999
    Age                      [missing]        25-29            [missing]
    Home Owner               100% renters     100% renters     24% renters
    Gender                   53% male         38% male         54% male
    Marital Status           34% married      30% married      68% married
    Length of Residence      Under 5 years    Under 5 years    Under 12 years
    Face Value of Policy     $118,100         $136,400         $132,700
    Dev. Prem / Bonus Off    16% / 67%        12% / 74%        13% / 66%
    EFT / Credit Card        35% / 62%        0% / 0%          7% / 14%

Cluster 2 was nearly three times more likely to pay than cluster 3, even though the two have much in common. Both clusters have similar household income, both rent, and both have lived in the same domicile for up to five years. The good-pay group, cluster 2, had more men than cluster 3, a proportion more on par with the whole sample. Cluster 2 was somewhat older than cluster 3 ([missing] versus 25-29, respectively). Cluster 2 also tended to choose policies with lower face values than cluster 3, resulting in lower monthly premiums.

THE HOMEOWNERS

                             Cluster 5        Cluster 7        Overall
    Predicted Payment        80%              32%              40%
    Income                   $40-$49,999      $40-$49,999      $30-$39,999
    Age                      [missing]        [missing]        [missing]
    Home Owner               100% own         100% own         76% own
    Gender                   68% male         60% male         54% male
    Marital Status           76% married      53% married      68% married
    Length of Residence      6+ years         2-11 years       Under 12 years
    Face Value of Policy     $124,500         $139,300         $132,700
    Dev. Prem / Bonus Off    18% / 60%        13% / 65%        13% / 66%
    EFT / Credit Card        24% / 73%        0% / 0%          7% / 14%

Both of these groups own their homes and have an annual household income of $40,000-$49,999. Both tend to be married males. Cluster 5, which is more than twice as likely to pay, is significantly more likely than cluster 7 to contain older, married men. Cluster 5 also tends towards less expensive policies.
THE UPPER-MIDDLE

                             Cluster 12       Cluster 10       Overall
    Predicted Payment        77%              38%              40%
    Income                   $50-$74,999      $50-$74,999      $30-$39,999
    Age                      [missing]        [missing]        [missing]
    Home Owner               100% own         100% own         76% own
    Gender                   66% male         66% male         54% male
    Marital Status           62% married      68% married      68% married
    Length of Residence      3-13 years       5-18 years       Under 12 years
    Face Value of Policy     $140,000         $140,400         $132,700
    Dev. Prem / Bonus Off    24% / 53%        13% / 63%        13% / 66%
    EFT / Credit Card        29% / 65%        0% / 0%          7% / 14%

One may expect the most affluent group to be the best pay of all the clusters. However, there is a distinct difference between the two most affluent groups: one pays well, the other does not. Key differences are found in age, offer, and payment method. These two clusters contain the records with the highest household income in the sample. Both tend to contain married men who own their homes and seek policies of around $140,000. These commonalities aside, cluster 12 (ages [missing]) is twice as likely to pay its bill as cluster 10 (ages [missing]).

THE POWER OF THE OFFER

                             Cluster 1        Cluster 8        Overall
    Predicted Payment        72%              30%              40%
    Income                   $20-$29,999      $20-$29,999      $30-$39,999
    Age                      [missing]        [missing]        [missing]
    Home Owner               100% own         100% own         76% own
    Gender                   63% male         59% male         54% male
    Marital Status           64% married      62% married      68% married
    Length of Residence      3-13 years       3-15 years       Under 12 years
    Face Value of Policy     $126,500         $128,800         $132,700
    Dev. Prem / Bonus Off    19% / 61%        10% / 70%        13% / 66%
    EFT / Credit Card        34% / 58%        0% / 0%          7% / 14%

Clusters 1 and 8 are nearly identical in their demographics, yet they pay at very different rates. The main difference between the two groups is that no one in cluster 8 volunteered a credit card or bank draft as a method of payment; arguably, those in cluster 8 simply do not trust the online channel with sensitive banking information. In addition, those in cluster 1, the better-pay group, are more likely to have come through a deviated premium offer than those in cluster 8, who tend towards the bonus offer.
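The four paired comparisons above describe the better-pay cluster as "nearly three times", "more than twice", or "twice" as likely to pay as its look-alike. Those phrases are ratios of the predicted payment rates in the tables, which the Python sketch below makes explicit. The dictionary keys are labels chosen here for illustration; the rates are taken from the Predicted Payment rows.

```python
# (better-pay cluster rate, look-alike cluster rate) from the tables above
pairs = {
    "renters: cluster 2 vs 3": (0.71, 0.26),
    "homeowners: cluster 5 vs 7": (0.80, 0.32),
    "upper-middle: cluster 12 vs 10": (0.77, 0.38),
    "offer: cluster 1 vs 8": (0.72, 0.30),
}

# Relative likelihood of payment within each matched pair
ratios = {name: good / bad for name, (good, bad) in pairs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

These are ratios of predicted probabilities, not odds ratios, so they are not directly comparable to the odds ratio estimates reported for the logistic model.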
OTHER LOW-PROBABILITY TO PAY CLUSTERS

                             Cluster 6      Cluster 9       Cluster 11      Cluster 4
    Predicted Payment        37%            31%             24%             26%
    Income                   [missing]      [missing]       [missing]       [missing]
    Age                      [missing]      [missing]       [missing]       [missing]
    Home Owner               100% own       100% renters    100% renters    100% renters
    Gender                   59% male       46% male        37% male        46% male
    Marital Status           66% married    38% married     22% married     49% married
    Length of Residence      6+ years       1-9 years       Under 5 years   Under 5 years
    Face Value of Policy     $126,500       $128,100        $134,200        $139,400
    Dev. Prem / Bonus Off    11% / 65%      11% / 72%       11% / 73%       10% / 76%
    EFT / Credit Card        1% / 0%        4% / 3%         2% / 0%         1% / 0%

Cluster 6 is the oldest. Most of the file is 40-44; this cluster is [missing]. Its predicted payment is 37%.

Clusters 9 and 11 are the poorest. Cluster 11's household income of $15,000-$19,999 is well below the $30,000-$39,999 seen for most of the file, and its predicted payment is 24%. Cluster 9's household income is $20,000-$29,999, and its predicted payment is 31%.

Cluster 4 is anomalous in that it contains relatively affluent renters who are slightly older than the sample mean, and yet it is considered unlikely to pay. Besides being highly likely to request a bill rather than use a credit card or EFT, cluster 4 is the cluster most likely to have come through a bonus offer rather than a deviated premium offer. The predicted payment of cluster 4 is 26%.

CROSS-SELLING RESULTS

Since multiple lists from a variety of providers were available, a cross-selling matrix was developed to identify commonalities among groups that investigate multiple providers. The greatest cross-over existed between the ADD policies (analyzed above) and the adult term-life product from the same carrier. Six clusters were produced to examine interactions among variables and interest in both products, or lack thereof.

Only homeowners were interested in both the ADD and adult term-life products. Women ages [missing] were the primary group interested in both products; 65% of these women were married. The only group of men interested in both products is ages 45-49 and 84% married. These two groups combined represent 82% of all the records interested in both products.
CONCLUSION

By applying this segmentation scheme to the online portals, the carrier may identify the likelihood that a policy will be paid for. Since the scheme is not based solely on payment type, the carrier may intelligently steer those who most need it towards more immediate payment options. The model may be applied not only online but also in an offline re-contact strategy to ensure payment of policies. In addition, the cross-selling opportunities identified here may be used to increase the book of business. When cross-selling, one may choose to approach only those in the most-likely-to-pay clusters to create greater efficiencies.

Although the EFT method of payment is preferred in the industry, one must realize that, because of public concern about identity theft, it is quite difficult to get consumers to volunteer checking account routing numbers online. The industry is opposed to accepting credit cards, primarily because of the increased transaction cost, but credit cards are the universal currency of the online space.

There are also quite a few opportunities for further analysis, such as the following:

- Include marketing costs and upstream marketing source data in the model, so that cost-per-lead and cost-per-policy analysis can identify the most valuable media type.
- Include profitability data to quantify the earnings potential of switching policies to credit card payment after deducting the card transaction fee.
- Include attrition data to derive lifetime value for use in acquisition and retention marketing.
- With longitudinal data, build a frequency model to determine the proper timing of online and/or offline solicitations to cross-sell other products.

ACKNOWLEDGEMENTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Michael Sherlock
Transcontinental Direct
75 Hawk Road
Warminster, PA
mjsherlock@gmail.com
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
More informationModeling Lifetime Value in the Insurance Industry
Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting
More informationCool Tools for PROC LOGISTIC
Cool Tools for PROC LOGISTIC Paul D. Allison Statistical Horizons LLC and the University of Pennsylvania March 2013 www.statisticalhorizons.com 1 New Features in LOGISTIC ODDSRATIO statement EFFECTPLOT
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationStatistics, Data Analysis & Econometrics
Using the LOGISTIC Procedure to Model Responses to Financial Services Direct Marketing David Marsh, Senior Credit Risk Modeler, Canadian Tire Financial Services, Welland, Ontario ABSTRACT It is more important
More informationDeveloping Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@
Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,
More informationln(p/(1-p)) = α +β*age35plus, where p is the probability or odds of drinking
Dummy Coding for Dummies Kathryn Martin, Maternal, Child and Adolescent Health Program, California Department of Public Health ABSTRACT There are a number of ways to incorporate categorical variables into
More informationUsing An Ordered Logistic Regression Model with SAS Vartanian: SW 541
Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 libname in1 >c:\=; Data first; Set in1.extract; A=1; PROC LOGIST OUTEST=DD MAXITER=100 ORDER=DATA; OUTPUT OUT=CC XBETA=XB P=PROB; MODEL
More informationBasic Statistical and Modeling Procedures Using SAS
Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom
More informationCustomer Profiling for Marketing Strategies in a Healthcare Environment MaryAnne DePesquo, Phoenix, Arizona
Paper 1285-2014 Customer Profiling for Marketing Strategies in a Healthcare Environment MaryAnne DePesquo, Phoenix, Arizona ABSTRACT In this new era of healthcare reform, health insurance companies have
More informationSUGI 29 Statistics and Data Analysis
Paper 194-29 Head of the CLASS: Impress your colleagues with a superior understanding of the CLASS statement in PROC LOGISTIC Michelle L. Pritchard and David J. Pasta Ovation Research Group, San Francisco,
More informationPredicting Customer Churn in the Telecommunications Industry An Application of Survival Analysis Modeling Using SAS
Paper 114-27 Predicting Customer in the Telecommunications Industry An Application of Survival Analysis Modeling Using SAS Junxiang Lu, Ph.D. Sprint Communications Company Overland Park, Kansas ABSTRACT
More informationPROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY
PROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY ABSTRACT Keywords: Logistic. INTRODUCTION This paper covers some gotchas in SAS R PROC LOGISTIC. A gotcha
More informationModeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry
Paper 12028 Modeling Customer Lifetime Value Using Survival Analysis An Application in the Telecommunications Industry Junxiang Lu, Ph.D. Overland Park, Kansas ABSTRACT Increasingly, companies are viewing
More informationOnline Appendix to Are Risk Preferences Stable Across Contexts? Evidence from Insurance Data
Online Appendix to Are Risk Preferences Stable Across Contexts? Evidence from Insurance Data By LEVON BARSEGHYAN, JEFFREY PRINCE, AND JOSHUA C. TEITELBAUM I. Empty Test Intervals Here we discuss the conditions
More informationMethods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL
Paper SA01-2012 Methods for Interaction Detection in Predictive Modeling Using SAS Doug Thompson, PhD, Blue Cross Blue Shield of IL, NM, OK & TX, Chicago, IL ABSTRACT Analysts typically consider combinations
More informationThis can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.
One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.
More informationA Basic Guide to Modeling Techniques for All Direct Marketing Challenges
A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview
More informationAnalysis of Survey Data Using the SAS SURVEY Procedures: A Primer
Analysis of Survey Data Using the SAS SURVEY Procedures: A Primer Patricia A. Berglund, Institute for Social Research - University of Michigan Wisconsin and Illinois SAS User s Group June 25, 2014 1 Overview
More informationCustomer Life Time Value
Customer Life Time Value Tomer Kalimi, Jacob Zahavi and Ronen Meiri Contents Introduction... 2 So what is the LTV?... 2 LTV in the Gaming Industry... 3 The Modeling Process... 4 Data Modeling... 5 The
More informationVI. Introduction to Logistic Regression
VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models
More informationAn Application of the Cox Proportional Hazards Model to the Construction of Objective Vintages for Credit in Financial Institutions, Using PROC PHREG
Paper 3140-2015 An Application of the Cox Proportional Hazards Model to the Construction of Objective Vintages for Credit in Financial Institutions, Using PROC PHREG Iván Darío Atehortua Rojas, Banco Colpatria
More informationImproved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC
Paper AA08-2013 Improved Interaction Interpretation: Application of the EFFECTPLOT statement and other useful features in PROC LOGISTIC Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT
More informationSolving Insurance Business Problems Using Statistical Methods Anup Cheriyan
Solving Insurance Business Problems Using Statistical Methods Anup Cheriyan Ibexi Solutions Page 1 Table of Contents Executive Summary...3 About the Author...3 Introduction...4 Common statistical methods...4
More informationEasily Identify Your Best Customers
IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationA LOGISTIC REGRESSION MODEL TO PREDICT FRESHMEN ENROLLMENTS Vijayalakshmi Sampath, Andrew Flagel, Carolina Figueroa
A LOGISTIC REGRESSION MODEL TO PREDICT FRESHMEN ENROLLMENTS Vijayalakshmi Sampath, Andrew Flagel, Carolina Figueroa ABSTRACT Predictive modeling is the technique of using historical information on a certain
More informationCredit Risk Analysis Using Logistic Regression Modeling
Credit Risk Analysis Using Logistic Regression Modeling Introduction A loan officer at a bank wants to be able to identify characteristics that are indicative of people who are likely to default on loans,
More informationLogistic (RLOGIST) Example #3
Logistic (RLOGIST) Example #3 SUDAAN Statements and Results Illustrated PREDMARG (predicted marginal proportion) CONDMARG (conditional marginal proportion) PRED_EFF pairwise comparison COND_EFF pairwise
More informationData Mining Techniques Chapter 4: Data Mining Applications in Marketing and Customer Relationship Management
Data Mining Techniques Chapter 4: Data Mining Applications in Marketing and Customer Relationship Management Prospecting........................................................... 2 DM to choose the right
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationDetecting Email Spam. MGS 8040, Data Mining. Audrey Gies Matt Labbe Tatiana Restrepo
Detecting Email Spam MGS 8040, Data Mining Audrey Gies Matt Labbe Tatiana Restrepo 5 December 2011 INTRODUCTION This report describes a model that may be used to improve likelihood of recognizing undesirable
More informationPredicting Recovery Rates for Defaulting Credit Card Debt
Predicting Recovery Rates for Defaulting Credit Card Debt Angela Moore Quantitative Financial Risk Management Centre School of Management University of Southampton Abstract Defaulting credit card debt
More information2013 CRS Research Report MOTORCYCLE SAFETY AND DRIVING UNDER INFLUENCE OF ALCOHOL
2013 CRS Research Report MOTORCYCLE SAFETY AND DRIVING UNDER INFLUENCE OF ALCOHOL Final Report by Andrew P. Tarko, Ph.D. Jose Thomaz CENTER FOR ROAD SAFETY SCHOOL OF CIVIL ENGINEERING PURDUE UNIVERSITY
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationAssessing Course Effectiveness Learning Analytics. Jere Turner,PhD Director of Institutional Research Manchester Community College
Assessing Course Effectiveness Learning Analytics Jere Turner,PhD Director of Institutional Research Manchester Community College Background MCC is one of seven CC in NH Data warehouse built in 2004 DI
More informationAbbas S. Tavakoli, DrPH, MPH, ME 1 ; Nikki R. Wooten, PhD, LISW-CP 2,3, Jordan Brittingham, MSPH 4
1 Paper 1680-2016 Using GENMOD to Analyze Correlated Data on Military System Beneficiaries Receiving Inpatient Behavioral Care in South Carolina Care Systems Abbas S. Tavakoli, DrPH, MPH, ME 1 ; Nikki
More informationABSTRACT INTRODUCTION
Paper SP03-2009 Illustrative Logistic Regression Examples using PROC LOGISTIC: New Features in SAS/STAT 9.2 Robert G. Downer, Grand Valley State University, Allendale, MI Patrick J. Richardson, Van Andel
More informationChina s Middle Market for Life Insurance
China s Middle Market for Life Insurance May 2014 Sponsored by: SOA International Section SOA Marketing & Distribution Section SOA Research Expanding Boundaries Pool The opinions expressed and conclusions
More informationCOMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
More informationAn Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA
ABSTRACT An Introduction to Statistical Tests for the SAS Programmer Sara Beck, Fred Hutchinson Cancer Research Center, Seattle, WA Often SAS Programmers find themselves in situations where performing
Adobe Analytics Premium Customer 360
Adobe Analytics Premium: Customer 360 Get a holistic view of your
Product recommendations and promotions (couponing and discounts) Cross-sell and Upsell strategies
WHITEPAPER Today, leading companies are looking to improve business performance via faster, better decision making by applying advanced predictive modeling to their vast and growing volumes of data. Business
SUMAN DUVVURU STAT 567 PROJECT REPORT
SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.
Statistics and Data Analysis
NESUG 27 PROC LOGISTIC: The Logistics Behind Interpreting Categorical Variable Effects Taylor Lewis, U.S. Office of Personnel Management, Washington, DC ABSTRACT The goal of this paper is to demystify how SAS models
Paper 45-2010 Evaluation of methods to determine optimal cutpoints for predicting mortgage default Abstract Introduction
Paper 45-2010 Evaluation of methods to determine optimal cutpoints for predicting mortgage default Valentin Todorov, Assurant Specialty Property, Atlanta, GA Doug Thompson, Assurant Health, Milwaukee,
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
Redefining Measurement from Awareness to Conversion. Smart Market: Vol. 4 Data-Driven Marketing, Demystified
Smart Market: Vol. 4 Data-Driven Marketing, Demystified Redefining Measurement from Awareness to Conversion PROGRAMMATIC MARKETING & THE NEW CUSTOMER JOURNEY In today's multiscreen world, the odds that
Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
A Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
Easily Identify the Right Customers
PASW Direct Marketing 18 Specifications Easily Identify the Right Customers You want your marketing programs to be as profitable as possible, and gaining insight into the information contained in your
Examining a Fitted Logistic Model
STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
The Science and Art of Market Segmentation Using PROC FASTCLUS Mark E. Thompson, Forefront Economics Inc, Beaverton, Oregon
The Science and Art of Market Segmentation Using PROC FASTCLUS Mark E. Thompson, Forefront Economics Inc, Beaverton, Oregon ABSTRACT Effective business development strategies often begin with market segmentation,
Alex Vidras, David Tysinger. Merkle Inc.
Using PROC LOGISTIC, SAS MACROS and ODS Output to evaluate the consistency of independent variables during the development of logistic regression models. An example from the retail banking industry ABSTRACT
The Association and Affinity Marketplace: Expanding Business Opportunities By Understanding Member Preferences by Association Type
The Association and Affinity Marketplace: Expanding Business Opportunities By Understanding Member Preferences by Association Type Introduction Leveraging Sales Opportunities within the Association & Affinity
The effects of Michigan's weakened motorcycle helmet use law on insurance losses
Bulletin Vol. 30, No. 9 : April 2013 The effects of Michigan's weakened motorcycle helmet use law on insurance losses In April of 2012 the state of Michigan changed its motorcycle helmet law. The change
ABSTRACT INTRODUCTION STUDY DESCRIPTION
ABSTRACT Paper 1675-2014 Validating Self-Reported Survey Measures Using SAS Sarah A. Lyons MS, Kimberly A. Kaphingst ScD, Melody S. Goodman PhD Washington University School of Medicine Researchers often
5.2 Customers Types for Grocery Shopping Scenario
CHAPTER 5: RESULTS AND ANALYSIS
Role of Customer Response Models in Customer Solicitation Center's Direct Marketing Campaign
Role of Customer Response Models in Customer Solicitation Center's Direct Marketing Campaign Arun K Mandapaka, Amit Singh Kushwah, Dr. Goutam Chakraborty Oklahoma State University, OK, USA ABSTRACT Direct
Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)
Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared
Insurance Analytics - data analysis and predictive modeling in insurance. Pavel Kříž. Seminar on Actuarial Sciences, MFF, April 4,
Insurance Analytics - data analysis and predictive modeling in insurance Pavel Kříž Seminar on Actuarial Sciences, MFF, April 4, 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics
A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn
A Handbook of Statistical Analyses Using R Brian S. Everitt and Torsten Hothorn CHAPTER 6 Logistic Regression and Generalised Linear Models: Blood Screening, Women's Role in Society, and Colonic Polyps
Target and Acquire the Multichannel Insurance Consumer
Neustar Insights Whitepaper Target and Acquire the Multichannel Insurance Consumer Increase Conversion by Applying Real-Time Data Across Channels Contents Executive Summary 2 Are You Losing Hot Leads?
An Example of SAS Application in Public Health Research --- Predicting Smoking Behavior in Changqiao District, Shanghai, China
An Example of SAS Application in Public Health Research --- Predicting Smoking Behavior in Changqiao District, Shanghai, China Ding Ding, San Diego State University, San Diego, CA ABSTRACT Finding predictors
Gamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
Study into the Sales of Add-on General Insurance Products
Study into the Sales of Add-on General Insurance Quantitative Consumer Research Report Prepared For: Financial Conduct Authority (FCA) March, 2014 Authorised Contact Persons Frances Green Research Director
Binary Logistic Regression
Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here's a simple model including
Reevaluating Policy and Claims Analytics: a Case of Non-Fleet Customers In Automobile Insurance Industry
Paper 1808-2014 Reevaluating Policy and Claims Analytics: a Case of Non-Fleet Customers In Automobile Insurance Industry Kittipong Trongsawad and Jongsawas Chongwatpol NIDA Business School, National Institute
Combining Linear and Non-Linear Modeling Techniques: EMB America. Getting the Best of Two Worlds
Combining Linear and Non-Linear Modeling Techniques: Getting the Best of Two Worlds Outline Who is EMB? Insurance industry predictive modeling applications EMBLEM our GLM tool How we have used CART with
Logistic Regression. http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests
Logistic Regression http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests Overview Binary (or binomial) logistic regression is a form of regression which is used when the dependent is a dichotomy
Understanding Characteristics of Caravan Insurance Policy Buyer
Understanding Characteristics of Caravan Insurance Policy Buyer May 10, 2007 Group 5 Chih Hau Huang Masami Mabuchi Muthita Songchitruksa Nopakoon Visitrattakul Executive Summary This report is intended
STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table
IBM SPSS Direct Marketing
IBM Software IBM SPSS Statistics 19 IBM SPSS Direct Marketing Understand your customers and improve marketing campaigns Highlights With IBM SPSS Direct Marketing, you can: Understand your customers in
HOW CAN CABLE COMPANIES DELIGHT THEIR CUSTOMERS?
HOW CAN CABLE COMPANIES DELIGHT THEIR CUSTOMERS? Many customers do not love their cable companies. Advanced analytics and causal modeling can discover why, and help to figure out cost-effective ways to
How to set the main menu of STATA to default factory settings standards
University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be
Paper D10 2009. Ranking Predictors in Logistic Regression. Doug Thompson, Assurant Health, Milwaukee, WI
Paper D10 2009 Ranking Predictors in Logistic Regression Doug Thompson, Assurant Health, Milwaukee, WI ABSTRACT There is little consensus on how best to rank predictors in logistic regression. This paper
Chapter 29 The GENMOD Procedure. Chapter Table of Contents
Chapter 29 The GENMOD Procedure Chapter Table of Contents OVERVIEW ... 1365 What is a Generalized Linear Model? ... 1366 Examples of Generalized Linear Models ... 1367 The GENMOD Procedure ... 1368 GETTING STARTED ... 1370
Three proven methods to achieve a higher ROI from data mining
IBM SPSS Modeler Three proven methods to achieve a higher ROI from data mining Take your business results to the next level Highlights: Incorporate additional types of data in your predictive models By
IBM SPSS Direct Marketing 23
IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release
Washoe County Senior Services 2013 Survey Data: Service User
Washoe County Senior Services 2013 Survey Data: Service User Profile Prepared by Zebbedia G. Gibb & Peter Reed UNR Sanford Center for Aging (Jan 2014) Overall Summary: Income was the single best predictor
Direct Marketing Profit Model. Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware
Paper CI-04 Direct Marketing Profit Model Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware ABSTRACT A net lift model gives the expected incrementality (the incremental rate)
The Demand for Financial Planning Services 1
The Demand for Financial Planning Services 1 Sherman D. Hanna, Ohio State University Professor, Consumer Sciences Department Ohio State University 1787 Neil Avenue Columbus, OH 43210-1290 Phone: 614-292-4584
Getting Correct Results from PROC REG
Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS's implementation of linear regression, is often used to fit a line without checking
IBM SPSS Direct Marketing 22
IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release
Variable Selection in the Credit Card Industry Moez Hababou, Alec Y. Cheng, and Ray Falk, Royal Bank of Scotland, Bridgeport, CT
Variable Selection in the Credit Card Industry Moez Hababou, Alec Y. Cheng, and Ray Falk, Royal Bank of Scotland, Bridgeport, CT ABSTRACT The credit card industry is particular in its need for a wide variety
Predictive Modeling Using Transactional Data
Financial Services the way we see it Predictive Modeling Using Transactional Data Contents Introduction 3 Using Transactional Data 3 Data Quality 3. Data Profiling 3. Exploratory Data Analysis Cohort and
General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler $df = \min(n_1, n_2) - 1$.
General Method: Difference of Means 1. Calculate $\bar{x}_1$, $\bar{x}_2$, $SE_1$, $SE_2$. 2. Combined $SE = \sqrt{SE_1^2 + SE_2^2}$. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n
Categorical Data Analysis
Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods
USING LOGISTIC REGRESSION TO PREDICT CUSTOMER RETENTION. Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA
USING LOGISTIC REGRESSION TO PREDICT CUSTOMER RETENTION Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA Logistic regression is an increasingly popular statistical technique
Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups
Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln Log-Rank Test for More Than Two Groups Prepared by Harlan Sayles (SRAM) Revised by Julia Soulakova (Statistics)
Lecture 18: Logistic Regression Continued
Lecture 18: Logistic Regression Continued Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina
Multinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES IBM SPSS Statistics 20 Part 4: Chi-Square and ANOVA Summer 2013, Version 2.0 Table of Contents Introduction...2 Downloading the
Analytics: A Powerful Tool for the Life Insurance Industry
Life Insurance the way we see it Analytics: A Powerful Tool for the Life Insurance Industry Using analytics to acquire and retain customers Contents 1 Introduction 3 2 Analytics Support for Customer Acquisition
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Can Annuity Purchase Intentions Be Influenced?
Can Annuity Purchase Intentions Be Influenced? Jodi DiCenzo, CFA, CPA Behavioral Research Associates, LLC Suzanne Shu, Ph.D. UCLA Anderson School of Management Liat Hadar, Ph.D. The Arison School of Business,
Tutorial Segmentation and Classification
MARKETING ENGINEERING FOR EXCEL TUTORIAL VERSION 1.0.8 Tutorial Segmentation and Classification Marketing Engineering for Excel is a Microsoft Excel add-in. The software runs from within Microsoft Excel
Marketing Applications of Predictive Analytics. Robert J. Walling III, FCAS, MAAA San Diego, CA October 6, 2008
Marketing Applications of Predictive Analytics Robert J. Walling III, FCAS, MAAA San Diego, CA October 6, 2008 Overview Who's Buying What? Who's Selling What? A Proactive Approach Monitoring Results Who
Factors affecting online sales
Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4