The primary goal of this thesis was to understand how the spatial dependence of



Similar documents
Marketing Mix Modelling and Big Data P. M Cain

How to Get More Value from Your Survey Data

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

RUSRR048 COURSE CATALOG DETAIL REPORT Page 1 of 6 11/11/ :33:48. QMS 102 Course ID

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Constrained Clustering of Territories in the Context of Car Insurance

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

Segmentation: Foundation of Marketing Strategy

Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias

Marketing (MSc) Vrije Universiteit Amsterdam - Fac. der Economische Wet. en Bedrijfsk. - M Marketing

Local classification and local likelihoods

5. Multiple regression

Globally Optimal Crowdsourcing Quality Management

Understanding Characteristics of Caravan Insurance Policy Buyer

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Handling attrition and non-response in longitudinal data

[This document contains corrections to a few typos that were found on the version available through the journal s web page]

The Optimality of Naive Bayes

STA 4273H: Statistical Machine Learning

Data Mining Clustering (2) Sheets are based on the those provided by Tan, Steinbach, and Kumar. Introduction to Data Mining

UNIVERSITY OF WAIKATO. Hamilton New Zealand

Forecasting Geographic Data Michael Leonard and Renee Samy, SAS Institute Inc. Cary, NC, USA

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

How to report the percentage of explained common variance in exploratory factor analysis

Imputing Values to Missing Data

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza

Archetypal Analysis in Marketing Research: A New Way of Understanding Consumer Heterogeneity

Rens van de Schoot a b, Peter Lugtig a & Joop Hox a a Department of Methods and Statistics, Utrecht

A Basic Introduction to Missing Data

Chapter 6. The stacking ensemble approach

DIGITS CENTER FOR DIGITAL INNOVATION, TECHNOLOGY, AND STRATEGY THOUGHT LEADERSHIP FOR THE DIGITAL AGE

Logistic Regression (1/24/13)

Chapter 11 Introduction to Survey Sampling and Analysis Procedures

Cell Phone based Activity Detection using Markov Logic Network

Data Mining: Algorithms and Applications Matrix Math Review

Curriculum - Doctor of Philosophy

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA

Appendix B Checklist for the Empirical Cycle

The Science and Art of Market Segmentation Using PROC FASTCLUS Mark E. Thompson, Forefront Economics Inc, Beaverton, Oregon

On the Efficiency of Competitive Stock Markets Where Traders Have Diverse Information

Chapter 3 Local Marketing in Practice

OUTLIER ANALYSIS. Data Mining 1

Data Analytical Framework for Customer Centric Solutions

Component Ordering in Independent Component Analysis Based on Data Power

WHITEPAPER. How to Credit Score with Predictive Analytics

Statistics Graduate Courses

Chapter 1 INTRODUCTION. 1.1 Background

Scheduling Algorithms in MapReduce Distributed Mind

ITIL Intermediate Capability Stream:

IBM SPSS Direct Marketing 23

Quantitative methods in marketing research

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences

CLUSTER ANALYSIS FOR SEGMENTATION

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus

Multiple Linear Regression in Data Mining

Applied mathematics and mathematical statistics

IBM SPSS Direct Marketing 22

MARKET SEGMENTATION, CUSTOMER LIFETIME VALUE, AND CUSTOMER ATTRITION IN HEALTH INSURANCE: A SINGLE ENDEAVOR THROUGH DATA MINING

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

Java Modules for Time Series Analysis

The frequency of visiting a doctor: is the decision to go independent of the frequency?

How To Understand The Theory Of Probability

1 Short Introduction to Time Series

Master of Science in Marketing Analytics (MSMA)

Machine Learning Methods for Causal Effects. Susan Athey, Stanford University Guido Imbens, Stanford University

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

Introduction to Longitudinal Data Analysis

Business Analytics and Credit Scoring

PRACTICAL DATA MINING IN A LARGE UTILITY COMPANY

Numerical Algorithms Group

CRM Forum Resources

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining - Evaluation of Classifiers

Introduction to Matrix Algebra

Problem Solving and Data Analysis

Penalized regression: Introduction

16 : Demand Forecasting

Visualization of Complex Survey Data: Regression Diagnostics

Introduction to Logistic Regression

FACTOR ANALYSIS NASC

From the help desk: Bootstrapped standard errors

Principles of Data Mining by Hand&Mannila&Smyth

Behavioral Segmentation

List of Ph.D. Courses

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

GLM, insurance pricing & big data: paying attention to convergence issues.

Department/Academic Unit: Public Health Sciences Degree Program: Biostatistics Collaborative Program

Analysis of Bayesian Dynamic Linear Models

Making the Most of Missing Values: Object Clustering with Partial Data in Astronomy

The HB. How Bayesian methods have changed the face of marketing research. Summer 2004

Transcription:

5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial dependence can bring and how these benefits can be utilized by researchers and practitioners. Three studies contributed to achieve this goal. The current chapter presents key findings of these studies, theoretical and managerial implications, together with limitations and directions for further research. 5.2 Main methodological findings The results of the simulations from Study 1 (Chapter 2) show that, firstly, the probability of recovering the spatial lag as the true model is greater than that for the spatial error model, with the probability to recover the true spatial error model being particularly low for small values of the spatial parameter. Secondly, the spatial lag model produces a lower mean squared error (MSE) of the spatial parameter and higher MSE of regression parameters compared with the spatial error model. Thirdly, high connectivity of the weights matrix (fraction of nonzero elements) has a negative impact on the probability of detecting the true model specification. Fourthly, a high proportion of variance in the data, together with high correlation between neighboring observations, leads to an incorrect identification of the spatial error structure as a spatial lag one. Fifthly, spatial models estimated using the first-order contiguity weights matrix perform better on average than those using the N nearest neighbors and inverse distance weights matrices in terms of their higher probabilities of detecting the true model and the lower MSE of

Advances in Spatial Dependence Modeling of Consumer Attitudes with Bayesian Factor Models the parameters. Sixth, a selection procedure for the weights matrix based on the log-likelihood or information criteria usually indicates the correct specification. Seventh and finally, the accuracy of the regression parameter is not very sensitive to the type of weights matrix for low spatial parameter values but is much more sensitive for higher values, though in this case, the probability of detecting the true weights matrix is also very high. Hence, in overall, this chapter provides numerous important new insights on the specification of spatial models and weights matrices by means of a simulation study. In Study 2 (Chapter 3) a new model of Bayesian Spatial Factor Analysis (BSFA) was proposed, which builds on existing heterogeneous factor models, in particular Ansari et al. (2002). The developed MCMC estimation algorithm showed very good performance in recovering true parameters on synthetic data. The model facilitates the systematic investigation of spatial co-variance of psychological traits and treats spatial structure as a source of information of interest in their measurement. In the analysis of value domains across European countries it was showed that certain domains from Schwartz (1992) theory, namely Conformity, Tradition, Hedonism, Achievement and Power, have strong spatial dependence across regions in Europe, whereas the other domains do not. Study 3 (Chapter 4) is partially based on the developments of Bayesian Spatial Factor Analytic (BSFA) model from Study 2 and one of the major contributions is related to introducing the cut-points formulation to accommodate discrete measurement scale and including exogenous covariates to allow their influence on factor means. The extended BSFA proved to perform very well on synthetic data by precisely recovering all parameters. Also, it was shown analytically and numerically that ignoring spatial dependence across factor means leads to biased and inconsistent estimates in certain parameters. The application to financial attitudes in the Netherlands towards 128

Chapter 5: General Discussion financial products proved that the model could be of high interest in better understanding of unobserved consumer attitudes. The proposed BSFA model produces interesting meaningful results not only for factor loadings, but also for deeper structure of latent factors, namely their spatial correlation. It was shown that certain attitudes, like loyalty and risk averseness, do really follow spatial dependence. In addition, including region specific covariates to regression with factor means shed more light on the nature of the factor means variation. This dependence also helped to understand better the spatial distribution of unobserved constructs. Finally, the proposed optimization algorithm can be of great benefit to assign financial planners to certain areas based on certain criteria and subject to certain constraints. All in all, the spatial modeling framework proved to be very useful and effective tool for modeling consumer attitudes. We demonstrated that ignoring spatial dependence in general causes many problems with parameter estimates and further inferences. Combining spatial models, factor models and hierarchical models allowed us to develop unique and strong methodology for modeling heterogeneous consumer attitudes. Moreover, this new methodology is not only of high vale by itself, but also its output may provide an input for solving other relevant problems in practice, as was demonstrated with optimization of financial planners. 5.3 Implications Study 1 allows formulating several practical recommendations for researchers who deal with georeferenced data. First, applying a weights matrix selection procedure that is based on information criteria increases the probability of identifying its true specification and benefits the researcher in terms of the precision of the parameter estimates compared with choosing the weights matrix arbitrarily. This finding is true for all degrees of spatial dependence. Second, if theory does not suggest otherwise, using the most simple spatial weights matrix, i.e. first-order 129

Advances in Spatial Dependence Modeling of Consumer Attitudes with Bayesian Factor Models contiguity, offers the second-best option. It does not perform best in all cases, but in general, it increases the probability of detecting the true model and decreases the MSE of spatial and regression parameters. It is an encouraging finding, as in the majority research projects the firstorder contiguity matrix is used (Florax and de Graaff 2004), though the motivation usually comes from its simplicity. The Bayesian Spatial Factor Analytic (BSFA) model, developed in Study 2, has not only theoretical but also substantive implications. The application to real data showed that the model is of great interest in better understanding of spatial distribution of latent constructs. Namely, the detailed study of spatial distribution of Schwartz value domains showed that Conformity and Hedonism have a country-specific structure, Power has a chessboard spatial pattern, while Tradition and Achievement have roughly a North-South gradient that cuts across national borders. These patterns were not recovered by classical psychometrical tools and can be of great value to practitioners. For example, the assumption of a discrete partition of value structure according to country borders is a limiting one: regions belonging to different countries but being in spatial proximity may have more similarities than regions in the same country but located at some distance from each other. The managerial implications that follow from Study 3 originate from two decisions that a manager or financial planner may face, namely the targeting or selection of customers and optimization of the allocation of financial planners to markets. Firstly, the established influence of region specific covariates on factor means can help to understand better spatial dependence patterns in the factor scores. For instance, regions with larger households tend to appreciate financial advice more than regions with smaller households, while regions with larger share of high educated people appreciate them less. Regions with lower education appear to be more risk 130

Chapter 5: General Discussion averse, while regions with low income less risk averse. These consumer characteristics, like the factor scores, show spatial dependence. Secondly, the scheduling of financial planners or introducing of new financial products can be done more efficiently based on the spatial factor scores, matching the spatial structure of latent constructs to the proficiency levels of the financial planners. For instance, for offering new risky products the regions with low risk aversion should be targeted and a financial planner that has a high level of proficiency of selling financial products with higher associated risk is well positioned to do that. Also, the regions, where the importance of financial advice is higher than in other regions, should be targeted and financial advisors assigned with priority. Further, retention strategies for regions with more loyal customers would need to receive priority, while acquisition strategies are likely effective for regions with lower loyalty. Integrating all above mentioned we can state with confidence that the proposed Bayesian Spatial Factor Analytic methodology can be of high value for practitioners and can provide new insights for various phenomena. In this thesis we showed already its applicability in both psychology (Schwartz values) and marketing (attitudes to financial products). Moreover, its application to financial attitudes gave input to solving the problem from operations field, namely optimizing spatial allocations of financial planners. However, these applications do not represent an exhaustive list and one can identify other potential applications, as, for instance, predicting of customers behavior, dealing with the problem of sparse data or sales territory design. The idea to apply the elements from spatial econometrics to the analysis of attitudes is quite a new endeavor and it has already proved to be a promising direction. It has a big potential not only from methodological point of view, but also can have a significant managerial relevance. 131

Advances in Spatial Dependence Modeling of Consumer Attitudes with Bayesian Factor Models 5.4 Limitations and further research possibilities There are several possible extensions for Study 1. Firstly, it can be beneficial to include other specifications of spatial models and weights matrices, and to apply model and weights matrix selection procedures not sequentially but simultaneously. Secondly, though negative values of spatial parameters are quite rare in practice, they still may occur (as in Study 2), hence, it can be another venue for further research. Finally, as this study was concerned only with the regression-type models, one of the directions for further research could be extending it to factor analytic models as well. Especially the influence of different specifications of spatial weights matrices on factor models is of high interest. One of the limitations of Study 2 is the assumption of a homogeneous factor loadings matrix. Though this is a common assumption in factor models and in applications to Schwartz values, additional insights may be obtained by allowing the factor loadings to differ across countries, or to assume their variation to be region specific, or even spatially dependent. Pursuing this idea will further contribute to the literature on measurement invariance in crossnational research (Steenkamp and Baumgartner 1998; De Jong, Steenkamp, and Fox 2007). Finally the limitations of Study 3 are as follows. Firstly, it was assumed that cut-points are homogeneity across respondents, while introducing heterogeneous cut-points may help to capture differences in scale usage across respondents and/or regions. Another limitation is using only two constraints in the optimal allocation of financial planners, namely that the regions in clusters should be contiguous and that each cluster contains a fixed number of regions. Other constraints may be added to reflect the interests of optimizing subject, like limiting the regions to be inside a circle with predetermined radius in order to prevent the forming of stretched clusters; or limiting not the number of regions in a cluster, but 132

Chapter 5: General Discussion its population size instead, eliminating the situation when one financial planner is assigned to highly populated area and another one to low populated. Further, multiple factor scores could be used simultaneously to assign planners to regions, based on their proficiency in multiple areas, such as selling products with higher associated risk, or acquisition of new versus retention of existing customers. Addressing these issues could be one of the directions for further research. The proposed in the thesis methodology of spatial factor analysis and optimization algorithm are not limited only to the allocation of financial planners. It can be effectively applied for other sales force optimization tasks and can provide additional benefits to the field of sales territory design (Zoltners and Sinha 2005). For instance, the existing approaches require the knowledge of all current customers in order to design the sales territories. Contrary, our methodology does not require this information and can distinguish the territories based on potential customers. Another direction for further research may be including of observed behavior of customers (purchases) of financial products. However, this extension would be possible subject to availability of such data. If we can observe these purchases then the proposed methodology can be strengthened even more, as it will allow to cross-validate the model s performance. Moreover, the future purchases can be predicted based on the location of customers, their demographic characteristics and/or their attitude towards the financial products. For this setup the factor analytic model should be extended to more general Structural Equation Modeling (SEM) framework. Finally, the linking of the derived allocation of planners to the profits can be of great interest and significance. Though the matching of capabilities of financial planners to customer 133

Advances in Spatial Dependence Modeling of Consumer Attitudes with Bayesian Factor Models markets that they serve is directly related to profits, the more formal linkage may be more appealing for financial institutions. Methodologically it is straightforward to do by introducing profit maximizing optimization, but the challenge is to get an access to sensitive and confidential data on profitability from financial institutions. 134