# Keep It Simple: Easy Ways To Estimate Choice Models For Single Consumers

Save this PDF as:

Size: px
Start display at page:

Download "Keep It Simple: Easy Ways To Estimate Choice Models For Single Consumers"

## Transcription

1 Keep It Simple: Easy Ways To Estimate Choice Models For Single Consumers Christine Ebling, University of Technology Sydney, Bart Frischknecht, University of Technology Sydney, Jordan Louviere, University of Technology Sydney, Abstract We show with Monte-Carlo simulations and empirical choice data sets that we can quickly and simply refine choice model estimates for individuals based on methods such as ordinary least squares regression and weighted least squares regression to produce well-behaved insample and out-of-sample predictions of choices. We use well-known regression methods to estimate choice models, which should allow many more researchers to estimate choice models and be confident that they are unlikely to make serious mistakes. Keywords: Individual-level choice models, linear probability models

3 analysis, which require education, training and experience; but given that academics and practitioners can acquire these skills, the approaches proposed in this paper should allow many researchers to estimate choice models for single people and predict choice outcomes in many circumstances of academic and practical interest. Proposed Modelling Approaches Selection of one alternative from among several is the observed dependent variable of interest in this paper. The statistical models for the analysis of such data are known as latent dependent variable models because the dependent variable of interest is strength of preference, but one observes only a binary indicator of that unobserved, underlying measure. Typically, the responses to choice experiments are coded one (1) for the chosen option and zero (0) for the unchosen option(s). As we later note, all latent dependent variable models have a formal identification problem, namely the standard deviation of the error component is perfectly inversely correlated with the model estimates. This poses no problem in predicting choice probabilities; however, as noted by Swait and Louviere (1993), it poses issues for comparisons of model estimates across data sources. Further, as noted by Fiebig, et al (2010), if the standard deviation of the error component is not constant across people, this can result in seriously biased model estimates and misleading inferences. The latter consideration motivates one to find ways to account for choice consistency differences across people. One way to treat differences is to estimate models for single persons making choices. In particular, we investigate ordinary least squares regression models estimated directly from the 1, 0 choice indicators (known as Linear Probability Models, or LPMs), and weighted least squares regression models estimated from full or partial rankings of the choice options in each set (hereafter termed WLS models) as proposed and applied by Louviere, et al (2008). The LPM approach is motivated by the work of the Nobel Laureate James Heckman and co-author Synder (1998), who proposed that LPMs provide a better way to model discrete choices when errors are asymmetric; naturally, no one knows if errors in latent utility functions are symmetric or not, so this approach is worth considering. The WLS approach is motivated by rank order explosion of choices, originally discussed by Luce and Suppes (1965) and first used in marketing by Chapman and Staelin (1982). We use rank order explosion procedure here to construct up to N-1 choice sets from a set of N rankings, or r-1 choice sets from a subset of r rankings of N options. Rank order explosion also can be seen as a way to aggregate choices; i.e., 1, 0 choices are fully disaggregated because they are associated with the most elemental level of choices, namely a particular person, a particular choice set and a set of particular choice options (alternatives). Hensher and Louviere (1984), Louviere and Woodworth (1983) and Louviere, et al (2008) (among others) show how to aggregate choices across people or choice sets by calculating and analysing choice counts or choice proportions. Non-experimental analogues of these aggregate choice counts were considered by (among others) Berkson (1944) and Theil (1969) for cases where real market choices are aggregated by choice alternatives and choice sets or other differences in choice observations. If one has such choice counts or one can transform rankings into such counts (via rank order explosion or Louviere, et al 2008), one can apply WLS regression to estimate choice model parameters. Green (1984) discusses using WLS as the first step of the maximum likelihood estimator, which is the approach used by Louviere and Woodworth (1983). In the case of both LPMs and WLS, the model estimates are on the wrong scale to correctly predict the choice probabilities. That is, the parameter estimates are too small or too

4 large and will systematically under- or over-predict the observed choice probabilities, and must be corrected. We propose and apply a simple correction that appears to work quite well under a variety of simulated and actual conditions in- and out-of-sample. Specifically, one estimates an LPM or a WLS model for a single person and uses the model to predict the observed dependent variable for each choice option in each choice set of interest. Then, one calculates the associated residual mean squared error for each person, and multiplies all LPM or WLS model estimates for an individual by a correction factor, which is 1/(mean squared error). The corrected parameters are used to make choice predictions for each person by using the logit formula. We applied both LPM and WLS methods to predict observed choices in two empirical data sets representing choices of car insurance options (set 1) and choices of cross-country airline flights (set 2); and a third set of simulated choices (set 3). We also compared each approach with individual-level conditional logit models estimated from exploded choice data based on observing most and least preferred choices in each choice set. In the interests of space, we do not go into detail on exploding the choice data using the observed partial rankings, but merely note that we explode the data to be consistent with Luce and Suppes (1965) and Chapman and Staelin (1982). In the airline and car insurance data we estimate the models from choices of 150 out of 200 respondents in 12 choice sets with four options each constructed using Street and Burgess (2007) optimal design theory. We assess in-sample and outof-sample fits using the method of sample enumeration where we average the choice share for a particular alternative across the individual choice probabilities of the in-sample or out-ofsample respondents respectively. We use observed and predicted choice alternative shares to compare models in- and out-of-sample using R-Squares, hit and fit rates (150 respondents for in-sample; the remaining 50 respondents for out-of-sample). We use an additional four choice sets to assess holdout task fits, again using R-Square and hit rates. To avoid biases that may be due to the particular respondents selected for estimation, we average our fit measures over ten different random draws of 150 individuals. We follow the same procedure for the simulated choice experiment data; we generate simulated choice data by drawing a set of parameter values from specified distributions for each simulated person. We generated 100 simulated populations, and sampled 150 people randomly ten times from each population. We average the fit measures across populations and samples. Results Conditional logit models estimated from the exploded data consistently perform well insample. When the correction factors are applied to the LPMs and the WLS model estimate, they also perform well in-sample. All models systematically tend to over-predict low choice share alternatives and under-predict high choice share alternatives for out-of-sample choice sets and samples. This is likely due to differences in error variances between estimation and hold-out data sources, such that error variances in out-of-sample sets are systematically larger than in estimation sets. The likely source of the error variance differences is model misspecification; which in turn suggests larger mis-specification errors for the car insurance data, also implying more non-additive choice rules for insurance than flight choices. The results are summarised in Table 1, which reports model performances for four cases: 1) in-sample, same people, same choice sets; 2) out of sample, same people, different choice sets; 3) out of sample, different people, same choice sets; and 4) out of sample, different people, different choice sets. Shown are the hit rates, i.e. the average number the alterna-

5 tives assigned the highest choice probabilities in a given choice set that were the same as the alternatives chosen, the fit rates, i.e. the mean squared deviation between predicted choice probabilities and individual observed 0/1 choices, the R 2 for the line y=x (i.e., aggregate observed shares equal to aggregate predicted shares), as well as the R 2 and the slope of the regression of the observed aggregate choice shares on the predicted aggregate choice probabilities. Whereas hit and the fit rates measure a models ability to predict individual person choices, R 2 values and regression slopes reflect a models ability to produce good aggregate predictions. Table 1: Results of Model Estimation Tests Exploded Cond OLS not correcterorrecteror) OLS x 1/(msq er- WLS not cor- WLS x 1/(msq er- Model Performance Logit Set1 Set2 Set3 Set1 Set2 Set3 Set1 Set2 Set3 Set1 Set2 Set3 Set1 Set2 Set3 In Sample (Same respondents, same choice sets) Hit rate Fit index R^2, perfect R^2, best fit Slope, best Out of Sample within respondents (same respondents, different choice sets) Hit rate Fit index R^2, perfect R^2, best fit Slope, best Out of Sample across respondents (different respondents, same choice sets) Hit rate Fit index R^2, perfect R^2, best fit Slope, best Out of Sample across respondents across choice sets(different respondents, different choice sets) Hit rate Fit index R^2, perfect R^2, best fit Slope, best R^2 perfect = estimated thru origin; R^2 best = estimated linear fit; Slope best = slope of estimated linear fit Theoretically, model performance should systematically decline from cases 1 to 4. Let us now consider the various performance measures: 1) Hit rates these are uniformly high for case 1, indicating one correctly predicts the chosen options a very high proportion of the time in-sample; hit rates decline for cases 2 and 3, but surprisingly rise in case 4; 2) Fit rates for both corrected LPM and corrected WLS models are closer to 0, and thus better, than exploded conditional logit models. Considering hit-rates and fit-rates, corrected LPM and WLS models perform as well for individual persons as exploded conditional logit models. 3) R 2 values are uniformly high for conditional logits using exploded choices; they are consistently high for corrected LPM and WLS models, and are consistently lower for non-corrected models; 4) Slopes are close to one for conditional logit on exploded choices and corrected LPM and WLS models, but differ widely for non-corrected models. All models tend to over-predict low choice share alternatives and under predict high choice share alternatives out-of-sample with different choice sets.

6 Discussion and Conclusions We proposed and investigated two simple ways to model individual choices, namely linear probability models (LPMs) and Weighted Least Squares regression based on Louviere, et al (2008). We tested the performance of the two approaches in- and out-of-sample. Our results suggest that LPMs and WLS models corrected for error variance difference between individuals perform well in both real and simulated data once the parameter estimates are weighted by the inverse of the mean squared error. Because of their simplicity and ease of use, we suggest that both approaches are likely to prove attractive to researchers who want to better understand and model consumer choices, but find complex statistical models or black box commercial software challenging. In turn, this should allow both academics and practitioners to do a reasonable job of predicting choices without making large mistakes. References Berkson, J Application of the Logistic Function to Bio-Assay. Journal of the American Statistical Association 39, Chapman, R.G., Staelin, R Exploiting Rank Order Choice Set Data Within the Stochastic Utility Model. Journal of Marketing Research 19, Chapman, R.G An Approach to Estimating Logit Models of a Single Decision Maker's Choice Behavior. Advances in Consumer Research 11, Fiebig, D.G., Keane, M.P., Louviere, J., Wasi, N The Generalized Multinomial Logit Model: Accounting for Scale and Coefficient Heterogeneity. Marketing Science 29, Finn, A., Louviere, J.J Determining the Appropriate Response to Evidence of Public Concern: The Case of Food Safety. Journal of Public Policy and Marketing 11 (1) Green, P.J Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust and Resistant Alternatives. Journal of the Royal Statistical Society, Series B (Methodological) 46(2), Heckman, J.J., Snyder, J.M Linear Probability Models of the Demand for Attributes With an Empirical Application to Estimating the Preferences of Legislators. Rand Journal of Economics 28, Hensher, D.A., Louviere, J.J Identifying Individual Preferences for International Air Travel: An Application of Functional Measurement Theory. Journal of Transport Economics and Policy 17(3), Louviere, J.J., Woodworth, G Design and Analysis of Simulated Consumer Choice or Allocation Experiments: An Approach Based on Aggregate Data. Journal of Marketing Research 20,

7 Louviere, J.J., Street, D., Burgess, L., Wasi; N., Islam, T., Marley, A.A.J Modeling the Choices of Individual Decision-Makers by Combining Efficient Choice Experiment Designs with Extra Preference Information. Journal of Choice Modelling 1(1), Luce, R.D Individual Choice Behavior: A Theoretical Analysis. New York: Wiley. Luce, D., Suppes, P Preference, Utility, and Subjective Probability. In Bush, R.R., Galanter, E. (eds.), Handbook of Mathematical Psychology, John Wiley and Sons Inc., New York. Marley A.A.J., Louviere, J.J Some Probabilistic Models of Best, Worst, and Best-Worst choices. Journal of Mathematical Psychology 49, McFadden, D Conditional Logit Analysis of Qualitative Choice Behavior. In P. Zarembka (ed.), Frontiers In Econometrics, Academic Press: New York, Street, D., Burgess, L The Construction of Optimal Stated Choice Experiments: Theory and Methods. New York: Wiley. Swait, J.D., Louviere, J.J The Role of the Scale Parameter in the Estimation and Comparison of Multinomial Logit Models. Journal of Marketing Research 30, Theil, H A Multinomial Extension of the Linear Logit Model. International Economic Review 10, Thurstone, L.L A Law of Comparative Judgement. Psychological Review 34,

### Simple Ways To Estimate Choice Models For Single Consumers 1

Simple Ways To Estimate Choice Models For Single Consumers 1 Bart Frischknecht, University of Technology Sydney, bart.frischknecht@uts.edu.au Christine Eckert, University of Technology Sydney, christine.ebling@uts.edu.au

### Choice or Rank Data in Stated Preference Surveys?

74 The Open Transportation Journal, 2008, 2, 74-79 Choice or Rank Data in Stated Preference Surveys? Becky P.Y. Loo *,1, S.C. Wong 2 and Timothy D. Hau 3 Open Access 1 Department of Geography, The University

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Trade-off Analysis: A Survey of Commercially Available Techniques

Trade-off Analysis: A Survey of Commercially Available Techniques Trade-off analysis is a family of methods by which respondents' utilities for various product features are measured. This article discusses

### Econometric Analysis of Cross Section and Panel Data Second Edition. Jeffrey M. Wooldridge. The MIT Press Cambridge, Massachusetts London, England

Econometric Analysis of Cross Section and Panel Data Second Edition Jeffrey M. Wooldridge The MIT Press Cambridge, Massachusetts London, England Preface Acknowledgments xxi xxix I INTRODUCTION AND BACKGROUND

### Incorporating prior information to overcome complete separation problems in discrete choice model estimation

Incorporating prior information to overcome complete separation problems in discrete choice model estimation Bart D. Frischknecht Centre for the Study of Choice, University of Technology, Sydney, bart.frischknecht@uts.edu.au,

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Forecasting the sales of an innovative agro-industrial product with limited information: A case of feta cheese from buffalo milk in Thailand

Forecasting the sales of an innovative agro-industrial product with limited information: A case of feta cheese from buffalo milk in Thailand Orakanya Kanjanatarakul 1 and Komsan Suriya 2 1 Faculty of Economics,

### Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

### 4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4

4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression

### Power and sample size in multilevel modeling

Snijders, Tom A.B. Power and Sample Size in Multilevel Linear Models. In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science. Volume 3, 1570 1573. Chicester (etc.): Wiley,

### FIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS

FIXED EFFECTS AND RELATED ESTIMATORS FOR CORRELATED RANDOM COEFFICIENT AND TREATMENT EFFECT PANEL DATA MODELS Jeffrey M. Wooldridge Department of Economics Michigan State University East Lansing, MI 48824-1038

### Handling attrition and non-response in longitudinal data

Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

### Regression Analysis: Basic Concepts

The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

### Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors

Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston

### Regression with a Binary Dependent Variable

Regression with a Binary Dependent Variable Chapter 9 Michael Ash CPPA Lecture 22 Course Notes Endgame Take-home final Distributed Friday 19 May Due Tuesday 23 May (Paper or emailed PDF ok; no Word, Excel,

### Calculating the Probability of Returning a Loan with Binary Probability Models

Calculating the Probability of Returning a Loan with Binary Probability Models Associate Professor PhD Julian VASILEV (e-mail: vasilev@ue-varna.bg) Varna University of Economics, Bulgaria ABSTRACT The

### August 2012 EXAMINATIONS Solution Part I

August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

### Student evaluation of degree programmes: the use of Best-Worst Scaling

Student evaluation of degree programmes: the use of Best-Worst Scaling Introduction Students play a role as stakeholders in higher education quality enhancement through their involvement in educational

### Quantile Regression under misspecification, with an application to the U.S. wage structure

Quantile Regression under misspecification, with an application to the U.S. wage structure Angrist, Chernozhukov and Fernandez-Val Reading Group Econometrics November 2, 2010 Intro: initial problem The

### Pooling and Meta-analysis. Tony O Hagan

Pooling and Meta-analysis Tony O Hagan Pooling Synthesising prior information from several experts 2 Multiple experts The case of multiple experts is important When elicitation is used to provide expert

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### PS 271B: Quantitative Methods II. Lecture Notes

PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.

### PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA

PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA ABSTRACT The decision of whether to use PLS instead of a covariance

### Statistical Innovations Online course: Latent Class Discrete Choice Modeling with Scale Factors

Statistical Innovations Online course: Latent Class Discrete Choice Modeling with Scale Factors Session 3: Advanced SALC Topics Outline: A. Advanced Models for MaxDiff Data B. Modeling the Dual Response

### Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios By: Michael Banasiak & By: Daniel Tantum, Ph.D. What Are Statistical Based Behavior Scoring Models And How Are

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### MODELING CONSUMER DECISION MAKING

MODELING CONSUMER DECISION MAKING AND DISCRETE CHOICE BEHAVIOR J ULY 9-11, 2001 To register, complete the enclosed registration form and fax it to: 212.995.4502 or mail to: Stern School of Business Executive

### Comparing Alternate Designs For A Multi-Domain Cluster Sample

Comparing Alternate Designs For A Multi-Domain Cluster Sample Pedro J. Saavedra, Mareena McKinley Wright and Joseph P. Riley Mareena McKinley Wright, ORC Macro, 11785 Beltsville Dr., Calverton, MD 20705

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### 2. Linear regression with multiple regressors

2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

### Models of best-worst choice and ranking among multiattribute options (profiles)

Models of best-worst choice and ranking among multiattribute options (profiles) A. A. J. Marley,1 and D. Pihlens 2 Journal of Mathematical Psychology (In Press, September 2011) 1 A.A.J. Marley (corresponding

### An Analysis of Rank Ordered Data

An Analysis of Rank Ordered Data Krishna P Paudel, Louisiana State University Biswo N Poudel, University of California, Berkeley Michael A. Dunn, Louisiana State University Mahesh Pandit, Louisiana State

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### H. Allen Klaiber Department of Agricultural and Resource Economics North Carolina State University

Incorporating Random Coefficients and Alternative Specific Constants into Discrete Choice Models: Implications for In- Sample Fit and Welfare Estimates * H. Allen Klaiber Department of Agricultural and

### 430 Statistics and Financial Mathematics for Business

Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### The primary goal of this thesis was to understand how the spatial dependence of

5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial

### Asymmetry and the Cost of Capital

Asymmetry and the Cost of Capital Javier García Sánchez, IAE Business School Lorenzo Preve, IAE Business School Virginia Sarria Allende, IAE Business School Abstract The expected cost of capital is a crucial

### INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)

INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulation-based method for estimating the parameters of economic models. Its

### Conjoint Analysis: Marketing Engineering Technical Note 1

Conjoint Analysis: Marketing Engineering Technical Note 1 Table of Contents Introduction Conjoint Analysis for Product Design Designing a conjoint study Using conjoint data for market simulations Transforming

### Models for Count Data With Overdispersion

Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extra-poisson variation and the negative binomial model, with brief appearances

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### UNIVERSITY OF WAIKATO. Hamilton New Zealand

UNIVERSITY OF WAIKATO Hamilton New Zealand Can We Trust Cluster-Corrected Standard Errors? An Application of Spatial Autocorrelation with Exact Locations Known John Gibson University of Waikato Bonggeun

### What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

### Models for Longitudinal and Clustered Data

Models for Longitudinal and Clustered Data Germán Rodríguez December 9, 2008, revised December 6, 2012 1 Introduction The most important assumption we have made in this course is that the observations

### Analysis of Microdata

Rainer Winkelmann Stefan Boes Analysis of Microdata With 38 Figures and 41 Tables 4y Springer Contents 1 Introduction 1 1.1 What Are Microdata? 1 1.2 Types of Microdata 4 1.2.1 Qualitative Data 4 1.2.2

### BayesX - Software for Bayesian Inference in Structured Additive Regression

BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich

### Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim

### Time Series Analysis. 1) smoothing/trend assessment

Time Series Analysis This (not surprisingly) concerns the analysis of data collected over time... weekly values, monthly values, quarterly values, yearly values, etc. Usually the intent is to discern whether

### Predictability of Average Inflation over Long Time Horizons

Predictability of Average Inflation over Long Time Horizons Allan Crawford, Research Department Uncertainty about the level of future inflation adversely affects the economy because it distorts savings

### ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? 1. INTRODUCTION

ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? SAMUEL H. COX AND YIJIA LIN ABSTRACT. We devise an approach, using tobit models for modeling annuity lapse rates. The approach is based on data provided

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

### Simple Predictive Analytics Curtis Seare

Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

### Earnings in private jobs after participation to post-doctoral programs : an assessment using a treatment effect model. Isabelle Recotillet

Earnings in private obs after participation to post-doctoral programs : an assessment using a treatment effect model Isabelle Recotillet Institute of Labor Economics and Industrial Sociology, UMR 6123,

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression

Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate

### FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits

Technical Paper Series Congressional Budget Office Washington, DC FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits Albert D. Metz Microeconomic and Financial Studies

### CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES

Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical

### PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### Correcting for CBC Model Bias: A Hybrid Scanner Data - Conjoint Model. Martin Natter Markus Feurstein

Correcting for CBC Model Bias: A Hybrid Scanner Data - Conjoint Model Martin Natter Markus Feurstein Report No. 57 June 2001 June 2001 SFB Adaptive Information Systems and Modelling in Economics and Management

### 2DI36 Statistics. 2DI36 Part II (Chapter 7 of MR)

2DI36 Statistics 2DI36 Part II (Chapter 7 of MR) What Have we Done so Far? Last time we introduced the concept of a dataset and seen how we can represent it in various ways But, how did this dataset came

### is paramount in advancing any economy. For developed countries such as

Introduction The provision of appropriate incentives to attract workers to the health industry is paramount in advancing any economy. For developed countries such as Australia, the increasing demand for

### Local classification and local likelihoods

Local classification and local likelihoods November 18 k-nearest neighbors The idea of local regression can be extended to classification as well The simplest way of doing so is called nearest neighbor

### Regression analysis of probability-linked data

Regression analysis of probability-linked data Ray Chambers University of Wollongong James Chipperfield Australian Bureau of Statistics Walter Davis Statistics New Zealand 1 Overview 1. Probability linkage

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### Sawtooth Software. The CBC System for Choice-Based Conjoint Analysis. Version 8 TECHNICAL PAPER SERIES. Sawtooth Software, Inc.

Sawtooth Software TECHNICAL PAPER SERIES The CBC System for Choice-Based Conjoint Analysis Version 8 Sawtooth Software, Inc. 1 Copyright 1993-2013, Sawtooth Software, Inc. 1457 E 840 N Orem, Utah +1 801

### Lasso on Categorical Data

Lasso on Categorical Data Yunjin Choi, Rina Park, Michael Seo December 14, 2012 1 Introduction In social science studies, the variables of interest are often categorical, such as race, gender, and nationality.

### ON THE ROBUSTNESS OF FIXED EFFECTS AND RELATED ESTIMATORS IN CORRELATED RANDOM COEFFICIENT PANEL DATA MODELS

ON THE ROBUSTNESS OF FIXED EFFECTS AND RELATED ESTIMATORS IN CORRELATED RANDOM COEFFICIENT PANEL DATA MODELS Jeffrey M. Wooldridge THE INSTITUTE FOR FISCAL STUDIES DEPARTMENT OF ECONOMICS, UCL cemmap working

### The Impact of the Medicare Rural Hospital Flexibility Program on Patient Choice

The Impact of the Medicare Rural Hospital Flexibility Program on Patient Choice Gautam Gowrisankaran Claudio Lucarelli Philipp Schmidt-Dengler Robert Town January 24, 2011 Abstract This paper seeks to

### Experiment #1, Analyze Data using Excel, Calculator and Graphs.

Physics 182 - Fall 2014 - Experiment #1 1 Experiment #1, Analyze Data using Excel, Calculator and Graphs. 1 Purpose (5 Points, Including Title. Points apply to your lab report.) Before we start measuring

### UNIT 1: COLLECTING DATA

Core Probability and Statistics Probability and Statistics provides a curriculum focused on understanding key data analysis and probabilistic concepts, calculations, and relevance to real-world applications.

### PROBABILITY AND STATISTICS. Ma 527. 1. To teach a knowledge of combinatorial reasoning.

PROBABILITY AND STATISTICS Ma 527 Course Description Prefaced by a study of the foundations of probability and statistics, this course is an extension of the elements of probability and statistics introduced

### Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:

Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

### Summary of Probability

Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible

### Chapter 1 Introduction to Econometrics

Chapter 1 Introduction to Econometrics Econometrics deals with the measurement of economic relationships. It is an integration of economics, mathematical economics and statistics with an objective to provide

### Machine Learning Methods for Demand Estimation

Machine Learning Methods for Demand Estimation By Patrick Bajari, Denis Nekipelov, Stephen P. Ryan, and Miaoyu Yang Over the past decade, there has been a high level of interest in modeling consumer behavior

This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author s institution, sharing

### Multiple Imputation for Missing Data: A Cautionary Tale

Multiple Imputation for Missing Data: A Cautionary Tale Paul D. Allison University of Pennsylvania Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust

### PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050 Class Hours: 2.0 Credit Hours: 3.0 Laboratory Hours: 2.0 Date Revised: Fall 2013 Catalog Course Description: Descriptive

### Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias

Glossary of Terms Ability A defined domain of cognitive, perceptual, psychomotor, or physical functioning. Accommodation A change in the content, format, and/or administration of a selection procedure

### Part 1 : 07/27/10 21:30:31

Question 1 - CIA 593 III-64 - Forecasting Techniques What coefficient of correlation results from the following data? X Y 1 10 2 8 3 6 4 4 5 2 A. 0 B. 1 C. Cannot be determined from the data given. D.

### Market Efficiency: Definitions and Tests. Aswath Damodaran

Market Efficiency: Definitions and Tests 1 Why market efficiency matters.. Question of whether markets are efficient, and if not, where the inefficiencies lie, is central to investment valuation. If markets

### South Carolina College- and Career-Ready (SCCCR) Probability and Statistics

South Carolina College- and Career-Ready (SCCCR) Probability and Statistics South Carolina College- and Career-Ready Mathematical Process Standards The South Carolina College- and Career-Ready (SCCCR)

### Interpretation of Somers D under four simple models

Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms

### ECON 523 Applied Econometrics I /Masters Level American University, Spring 2008. Description of the course

ECON 523 Applied Econometrics I /Masters Level American University, Spring 2008 Instructor: Maria Heracleous Lectures: M 8:10-10:40 p.m. WARD 202 Office: 221 Roper Phone: 202-885-3758 Office Hours: M W

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### It is important to bear in mind that one of the first three subscripts is redundant since k = i -j +3.

IDENTIFICATION AND ESTIMATION OF AGE, PERIOD AND COHORT EFFECTS IN THE ANALYSIS OF DISCRETE ARCHIVAL DATA Stephen E. Fienberg, University of Minnesota William M. Mason, University of Michigan 1. INTRODUCTION