Distance to Event vs. Propensity of Event: A Survival Analysis vs. Logistic Regression Approach
Abhijit Kanjilal
Fractal Analytics Ltd.

Abstract: In the analytics industry today, logistic regression is a robust and proven technique for predicting the propensity of certain events, and it has therefore been very popular in marketing analytics. Given the same marketing problems, survival analysis answers them slightly differently: it predicts a (time) distance to an event instead of the propensity of the event. To assess the predictability, robustness, and strength of the survival analysis approach as compared to logistic regression, we carried out an analysis in the context of an Activation Management Program. The results show that one survival analysis model is as good as multiple logistic regressions built for different prediction windows, in terms of predictability and strength.

Marketing activity in a customer's lifetime probably starts with activation and ends with attrition. These events constitute the boundaries of the premise of marketing. Hence effective marketing around these events is crucial for organizations to drive significant profitability. This demands the prediction of such events and the definition of strategies based on a customer's inclination towards them. The challenge becomes measuring a customer's inclination towards marketing events. This can be done in two ways: a) through a Propensity of Event approach or b) through a Distance to Event approach.

The Propensity of Event approach scores a probability of the event for each customer. It inherently assumes a prediction window in which customers are assigned a probability of the event; for example, the activation probability of newly acquired customers in the next 12 months. For this example, we would generally build a logistic model.
Now, think about the customers who are likely to activate within the first six months on book and the customers who are likely to activate within the first 12 months on book. The profiles of these two sets may differ, hence predictions from P6 (activation probability within 6 months) and P12 (activation probability within 12 months) may not be comparable. To understand the profiles of these two sets we need to build two models, one with a prediction window of six months and another with a window of 12 months. If we divide the window further, we end up building more models. Is there a technique that gives us the solution in one go? The answer is survival analysis.

Medical science has been using this technique extensively, and medical science is very sensitive to errors in prediction. If marketing is equally sensitive to errors in targeting the right customers at the right time with the right offers, then it makes sense to explore the technique. Does it add any value to the process? This motivates us to assess survival analysis as compared to logistic regression in the context of an Activation Management Program.

Theoretical Model: Logistic regression is a model used to predict the probability of occurrence of an event. In this case our target variable is binary, e.g. attrited or not, or card activated or not. It makes use of predictor variables, either numerical or categorical, that contain all available information about the objects.
$$\operatorname{logit}(p_i) = \ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_{1,i} + \dots + \beta_k x_{k,i}$$

where $p_i$ is the probability of the event for the $i$-th object, the $x$'s are the risk factors affecting the event, and the $\beta$'s are the parameters of the model. Hence the probability boils down to

$$p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{1,i} + \dots + \beta_k x_{k,i})}}$$

In logistic regression, we are interested in how risk factors are associated with the target event. Sometimes, though, we are interested in how a risk factor affects the time to the event. Survival analysis is used to analyze data in which the time until the event is of interest. The response is often referred to as a failure time, survival time, or event time. The beauty of survival analysis is that it can handle censored data. Censoring is present when we have some information about a subject's event time but do not know the exact event time. There are several survival analysis techniques; the Cox proportional hazards model is the most popular one, and in this research we focused only on this technique. Let $T_D$ denote the time of the event. Our data, based on a sample of size $n$, consist of the triple $(T_j, \delta_j, X_j)$, $j = 1, \dots, n$.

Empirical Model: Having these two techniques in place, we applied them to an Activation Management Program for a bank credit card. Our objective was to assess customers' inclination towards activation within six months after acquisition and, based on the results, to prioritize promotional strategy and speed up customers' activation beyond their natural pace. Customers' application information was used to build the models. We divided the six-month prediction window into three parts (0-2 months, 0-4 months, and 0-6 months) and built three logistic regression models to predict two-, four-, and six-month activation probability. In parallel, we built one survival analysis model and computed the two-, four-, and six-month activation probabilities from it.
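As a rough illustration of the logistic side of this setup, the sketch below derives the three binary targets (activated within 2, 4, and 6 months) from time-to-activation records and evaluates the logistic formula for a single score. The function names and the toy records are hypothetical and are not taken from the study. Labelling a censored customer as 0 is a simplifying assumption made here; it is exactly the kind of case that survival analysis handles more naturally.

```python
import math


def window_target(months_observed, activated, window):
    """Return 1 if the customer activated within `window` months, else 0.

    Assumption: a customer who has not activated by the end of observation
    (censored) is labelled 0, even if observed for less than the window.
    """
    return 1 if activated and months_observed <= window else 0


def logistic_probability(beta0, betas, x):
    """Evaluate p = 1 / (1 + exp(-(beta0 + sum_k beta_k * x_k)))."""
    score = beta0 + sum(b * v for b, v in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical toy records: (months until activation or end of observation,
# activated flag). These are illustrative values only.
customers = [(1, True), (3, True), (5, True), (6, False), (2, False)]

# One binary target vector per prediction window; in the logistic approach
# each window would need its own separately fitted model.
targets = {
    w: [window_target(t, a, w) for t, a in customers]
    for w in (2, 4, 6)
}

for w, labels in targets.items():
    print(f"window 0-{w} months:", labels)
```

Each window produces a different target vector from the same records, which is why the logistic approach ends up with one model per window.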
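To construct this solution, we first fit a proportional hazards model to the data and obtain the partial likelihood estimators. In the triple $(T_j, \delta_j, X_j)$, $T_j$ is the time on study for the $j$-th customer, $\delta_j$ is the event indicator, and $X_j$ is the vector of risk factors affecting the distance to the event. Now,

$$h(t \mid X) = h_0(t)\, c(\beta' X)$$

where $h_0(t)$ is an arbitrary baseline hazard rate, $\beta$ is the parameter vector, and $c(\beta' X)$ is a known function. Because $h(t \mid X)$ must be positive, a common model for $c(\beta' X)$ is

$$c(\beta' X) = \exp(\beta' X)$$

The model estimates are obtained by maximizing the partial likelihood function, written here in its standard form for distinct event times:

$$L(\beta) = \prod_{i=1}^{d} \frac{\exp(\beta' X_{(i)})}{\sum_{j \in R(t_i)} \exp(\beta' X_j)}$$

where $X_{(i)}$ is the risk factor vector of the customer activating at $t_i$ and $R(t_i)$ is the set of customers still dormant and under observation at $t_i$ (those with $T_j \ge t_i$).

Let $t_1 < t_2 < \dots < t_d$ be the distinct distances to activation and $d_i$ be the number of customers activating at $t_i$. Let

$$W(t_i; \hat\beta) = \sum_{j \in R(t_i)} \exp(\hat\beta' X_j)$$

The estimator of the cumulative baseline activation rate is given by

$$\hat H_0(t) = \sum_{t_i \le t} \frac{d_i}{W(t_i; \hat\beta)}$$

The estimator of the baseline dormancy function is

$$\hat S_0(t) = \exp\left[-\hat H_0(t)\right]$$

The dormancy probability of a new customer with a given set of risk factors $X_0$ will be

$$\hat S(t \mid X_0) = \hat S_0(t)^{\exp(\hat\beta' X_0)}$$

Activation Probability = 1 - Dormancy Probability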
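A minimal sketch of the baseline estimator and the resulting activation probability is given below. It assumes the coefficient vector has already been estimated, so that each customer's score (the linear predictor $\hat\beta' X_j$) is available as a number. The function names, the toy data, and the score values are hypothetical and are not taken from the study.

```python
import math


def cumulative_baseline_hazard(times, events, scores):
    """Estimate the cumulative baseline activation rate at each event time.

    times  : time on study T_j for each customer
    events : 1 if the customer activated at T_j, 0 if censored
    scores : linear predictor beta'X_j (assumed already estimated)
    Returns a list of (t_i, H0_hat(t_i)) for the distinct event times.
    """
    risk = [math.exp(s) for s in scores]
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    cumulative = 0.0
    steps = []
    for t_i in event_times:
        # d_i: number of customers activating exactly at t_i
        d_i = sum(1 for t, e in zip(times, events) if e == 1 and t == t_i)
        # W(t_i): summed risk over customers still dormant at t_i
        w_i = sum(r for t, r in zip(times, risk) if t >= t_i)
        cumulative += d_i / w_i
        steps.append((t_i, cumulative))
    return steps


def activation_probability(baseline, score_new, horizon):
    """Return 1 - S0_hat(horizon) ** exp(score_new) for a new customer."""
    h0 = 0.0
    for t_i, h in baseline:
        if t_i <= horizon:
            h0 = h
    dormancy = math.exp(-h0) ** math.exp(score_new)
    return 1.0 - dormancy


# Hypothetical toy data: months on study, activation flag, and scores.
times = [1, 2, 2, 3, 5, 6]
events = [1, 1, 0, 1, 0, 1]
scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

baseline = cumulative_baseline_hazard(times, events, scores)

# One fitted model yields the activation probability at every horizon.
p2, p4, p6 = (activation_probability(baseline, 0.0, h) for h in (2, 4, 6))
print(p2, p4, p6)
```

The same baseline serves all three horizons, so the two-, four-, and six-month probabilities come from a single model rather than three.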
Activation probability is a monotonically non-decreasing function of time. Here lies the beauty of the survival model: to get a comparable function of time in a logistic regression setup, we would need to build several models. The results of the analysis (one survival model vs. multiple logistic models) are shown below. Results are based on actual numbers.
[Figures: results of one survival model vs. multiple logistic models]
Conclusion: This case study shows that a survival analysis model does as good a job as multiple logistic regression models built for different windows, in terms of strength and predictability. Survival analysis differentiates individuals in terms of distance to event, so marketing strategies can be prioritized to drive higher profitability. The logistic regression model is unable to address the same business problem in a similar manner. It should be noted that the results are based on a limited range of geography and product data. This case study does not take time-dependent risk factors into consideration. It would be useful to generalize the results in future studies.
