Traditional Conjoint Analysis with Excel



Similar documents
Regression step-by-step using Microsoft Excel

Assignment objectives:

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

Statistics and Data Analysis

Data Analysis Tools. Tools for Summarizing Data

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y X

Predictor Coef StDev T P Constant X S = R-Sq = 0.0% R-Sq(adj) = 0.

Week TSX Index

Bill Burton Albert Einstein College of Medicine April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis

Univariate Regression

Scatter Plot, Correlation, and Regression on the TI-83/84

How To Run Statistical Tests in Excel

Business Valuation Review

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.

Chapter 7: Simple linear regression Learning Objectives

SPSS Guide: Regression Analysis

Multiple Regression. Page 24

2. Simple Linear Regression

EXCEL Analysis TookPak [Statistical Analysis] 1. First of all, check to make sure that the Analysis ToolPak is installed. Here is how you do it:

A Quick Guide to Using Excel 2007 s Regression Analysis Tool

1.1. Simple Regression in Excel (Excel 2010).

Multiple Linear Regression in Data Mining

Calibration and Linear Regression Analysis: A Self-Guided Tutorial

Below is a very brief tutorial on the basic capabilities of Excel. Refer to the Excel help files for more information.

Cross Validation techniques in R: A brief overview of some methods, packages, and functions for assessing prediction models.

Final Exam Practice Problem Answers

Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

11. Analysis of Case-control Studies Logistic Regression

Lin s Concordance Correlation Coefficient

Microsoft Excel Tutorial

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

Simple Predictive Analytics Curtis Seare

Notes on Factoring. MA 206 Kurt Bryan

Market Simulators for Conjoint Analysis

3 The spreadsheet execution model and its consequences

Testing for Lack of Fit

Chapter 3 Quantitative Demand Analysis

Financial Forecasting

Recall this chart that showed how most of our course would be organized:

One-Way Analysis of Variance (ANOVA) Example Problem

When to use Excel. When NOT to use Excel 9/24/2014

Figure 1. An embedded chart on a worksheet.

Simple Methods and Procedures Used in Forecasting

NIKE Case Study Solutions

Appendix 2.1 Tabular and Graphical Methods Using Excel

Tell Me What You Want: Conjoint Analysis Made Simple Using SAS Delali Agbenyegah, Alliance Data Systems, Columbus, Ohio

Module 6: Introduction to Time Series Forecasting

4. Multiple Regression in Practice

Violent crime total. Problem Set 1

Chapter 25 Specifying Forecasting Models

Statistical Models in R

Introduction to Error Analysis

Hedge Effectiveness Testing

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

5. Multiple regression

Activity 3.7 Statistical Analysis with Excel

Tutorial #7A: LC Segmentation with Ratings-based Conjoint Data

How To Analyze Data In Excel 2003 With A Powerpoint 3.5

Introduction to proc glm

Multiple Choice: 2 points each

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Excel Modeling Practice. The Svelte Glove Problem Step-by-Step With Instructions

Penalized regression: Introduction

STATISTICA Formula Guide: Logistic Regression. Table of Contents

International Statistical Institute, 56th Session, 2007: Phil Everson

APPENDIX A Using Microsoft Excel for Error Analysis

Experimental Designs (revisited)

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

Forecasting in STATA: Tools and Tricks

Follow links Class Use and other Permissions. For more information, send to:

SAS Software to Fit the Generalized Linear Model

Seasonal Adjustment for Short Time Series in Excel Catherine C.H. Hood Catherine Hood Consulting

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

Interaction effects and group comparisons Richard Williams, University of Notre Dame, Last revised February 20, 2015

Experiment #1, Analyze Data using Excel, Calculator and Graphs.

Using Excel s Analysis ToolPak Add-In

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Regression Analysis: A Complete Example

Normality Testing in Excel

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Nominal and Real U.S. GDP

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005

5. Linear Regression

The importance of graphing the data: Anscombe s regression examples

South Carolina College- and Career-Ready (SCCCR) Probability and Statistics

Interactions involving Categorical Predictors

Title: Modeling for Prediction Linear Regression with Excel, Minitab, Fathom and the TI-83

Using Excel for Statistical Analysis

Multiple Linear Regression

Didacticiel - Études de cas

Analyzing Research Data Using Excel

2013 MBA Jump Start Program. Statistics Module Part 3

Transcription:

hapter 8 Traditional onjoint nalysis with Excel traditional conjoint analysis may be thought of as a multiple regression problem. The respondent s ratings for the product concepts are observations on the dependent variable. The characteristics of the product or attribute levels are observations on the independent or predictor variables. The estimated regression coefficients associated with the independent variables are the part-worth utilities or preference scores for the levels. The R 2 for the regression characterizes the internal consistency of the respondent. onsider a conjoint analysis problem with three attributes, each with levels as follows: rand olor Price $ $ $1 For simplicity, let us consider a full-factorial experimental design. full-factorial design includes all possible combinations of the attributes. There are 18 possible product concepts or cards that can be created from these three attributes: 3 brands 2 colors 3 prices = 18 cards Further assume that respondents rate each of the 18 product concepts on a scale from 0 to 10, where 10 represents the highest degree of preference. Exhibit 8.1 shows the experimental design. We can use Microsoft Excel to analyze data from traditional conjoint questionnaires. This chapter shows how to code, organize, and analyze data from one hypothetical respondent, working with spreadsheets and spreadsheet functions. Multiple regression functions come from the Excel nalysis ToolPak add-in. 67 Reprinted from Orme,. (2010) Getting Started with onjoint nalysis: Strategies for Product Design and Pricing Research. Second Edition, Madison, Wis.: Research Publishers LL. c 2010 by Research Publishers LL. No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, electronic, photocopying, recording, or otherwise, without the prior written permission of the publisher.

68 Traditional onjoint nalysis with Excel ard rand olor Price ($) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 1 1 1 1 1 1 Exhibit 8.1. Full-factorial experimental design

8.1 Data Organization and oding 69 8.1 Data Organization and oding ssume the data for one respondent have been entered into an Excel spreadsheet, illustrated in exhibit 8.2. The first card is made up of the first level on each of the attributes: (rand,, $). The respondent rated that card a 5 on the preference scale. The second card has the first level on brand and color and the second level on price: (rand,, $). This card gets a 5 on the preference scale. nd so on. fter collecting the respondent data, the next step is to code the data in an appropriate manner for estimating utilities using multiple regression. We use a procedure called dummy coding for the independent variables or product characteristics. In its simplest form, dummy coding uses a 1 to reflect the presence of a feature, and a 0 to represent its absence. The brand attribute would be coded as three separate columns, color as two columns, and price as three columns. pplying dummy coding results in an array of columns as illustrated in exhibit 8.3. gain, we see that card 1 is defined as (rand,, $), but we have expanded the layout to reflect dummy coding. To this point, the coding has been straightforward. ut there is one complication that must be resolved. In multiple regression analysis, no independent variable may be perfectly predictable based on the state of any other independent variable or combination of independent variables. If so, the regression procedure cannot separate the effects of the confounded variables. We have that problem with the data as coded in exhibit 8.3, since, for example, we can perfectly predict the state of rand based on the states of rand and rand. This situation is called linear dependency. To resolve this linear dependency, we omit one column from each attribute. It really doesn t matter which column (level) we drop, and for this example we have excluded the first level for each attribute, to produce a modified data table, as illustrated by exhibit 8.4. Even though it appears that one level from each attribute is missing from the data, they are really implicitly included as reference levels for each attribute. The explicitly coded levels are estimated as contrasts with respect to the omitted levels, which are constrained to have a weight of 0.

70 Traditional onjoint nalysis with Excel 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 D E ard rand olor Price Preference 1 1 1 5 2 1 1 5 3 1 1 1 0 4 1 2 8 5 1 2 5 6 1 2 1 2 7 2 1 7 8 2 1 5 9 2 1 1 3 10 2 2 9 11 2 2 6 12 2 2 1 5 13 3 1 10 14 3 1 7 15 3 1 1 5 16 3 2 9 17 3 2 7 18 3 2 1 6 Exhibit 8.2. Excel spreadsheet with conjoint data

8.1 Data Organization and oding 71 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 H I J K L M N O P Q ard $ $ $1 Preference 1 1 0 0 1 0 1 0 0 5 2 1 0 0 1 0 0 1 0 5 3 1 0 0 1 0 0 0 1 0 4 1 0 0 0 1 1 0 0 8 5 1 0 0 0 1 0 1 0 5 6 1 0 0 0 1 0 0 1 2 7 0 1 0 1 0 1 0 0 7 8 0 1 0 1 0 0 1 0 5 9 0 1 0 1 0 0 0 1 3 10 0 1 0 0 1 1 0 0 9 11 0 1 0 0 1 0 1 0 6 12 0 1 0 0 1 0 0 1 5 13 0 0 1 1 0 1 0 0 10 14 0 0 1 1 0 0 1 0 7 15 0 0 1 1 0 0 0 1 5 16 0 0 1 0 1 1 0 0 9 17 0 0 1 0 1 0 1 0 7 18 0 0 1 0 1 0 0 1 6 Exhibit 8.3. Excel spreadsheet with coded data

72 Traditional onjoint nalysis with Excel 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 S T U V W X Y ard $ $1 Preference 1 0 0 0 0 0 5 2 0 0 0 1 0 5 3 0 0 0 0 1 0 4 0 0 1 0 0 8 5 0 0 1 1 0 5 6 0 0 1 0 1 2 7 1 0 0 0 0 7 8 1 0 0 1 0 5 9 1 0 0 0 1 3 10 1 0 1 0 0 9 11 1 0 1 1 0 6 12 1 0 1 0 1 5 13 0 1 0 0 0 10 14 0 1 0 1 0 7 15 0 1 0 0 1 5 16 0 1 1 0 0 9 17 0 1 1 1 0 7 18 0 1 1 0 1 6 Exhibit 8.4. Modified data table for analysis with Excel

8.2 Multiple Regression nalysis 73 8.2 Multiple Regression nalysis Microsoft Excel offers a simple multiple regression tool (under Tools + Data nalysis + Regression with the nalysis Toolpak add-in installed). Using the tool, we can specify the preference score (column Y) as the dependent variable (Input Y Range) and the five dummy-coded attribute columns (columns T through X) as independent variables (Input X range). You should also make sure a constant is estimated; this usually happens by default (by not checking the box labeled onstant is zero ). The mathematical expression of the model is as follows: Y = b 0 + b 1 (rand ) + b 2 (rand ) + b 3 () + b 4 ($) + b 5 ($1) + e where Y is the respondent s preference for the product concept, b 0 is the constant or intercept term, b 1 through b 5 are beta weights (part-worth utilities) for the features, and e is an error term. In this formulation of the model, coefficients for the reference levels are equal to 0. The solution minimizes the sum of squares of the errors in prediction over all observations. portion of the output from Excel is illustrated in exhibit 8.5. Using that output (after rounding to two decimal places of precision), the utilities (coefficients) are the following: rand olor Price = 0.00 = 0.00 $ = 0.00 = 1.67 = 1.11 $ = -2.17 = 3.17 $1 = -4. The constant or intercept term is 5.83, and the fit for this respondent R 2 = 0.90. Depending on the consistency of the respondent, the fit values range from a low of 0 to a high of 1.0. The standard errors of the regression coefficients (betas) reflect how precisely we are able to estimate those coefficients with this design. Lower standard errors are better. The remaining statistics presented in Excel s output are beyond the scope of this chapter and are generally not of much use when considering individual-level conjoint analysis problems. Most traditional conjoint analysis problems solve a separate regression equation for each respondent. Therefore, to estimate utilities, the respondent must have evaluated at least as many cards as parameters to be estimated. When the respondent answers the minimum number of conjoint cards to enable estimation, this is called a saturated design. While such a design is easiest on the respondent, it leaves no room for respondent error. It also always yields an R 2 of 1, and therefore no ability to assess respondent consistency.

74 Traditional onjoint nalysis with Excel SUMMRY OUTPUT Regression Statistics Multiple R 0.94890196 R Square 0.90041494 djusted R Square 0.85892116 Standard Error 0.94280904 Observations 18 NOV df SS MS F Significance F Regression 5 96.4444444 19.2888889 21.7 1.2511E-05 Residual 12 10.6666667 0.88888889 Total 17 107.111111 oefficients Standard Error t Stat P-value Intercept 5.83333333 0.54433105 10.7165176 1.6872E-07 X Variable 1 1.66666667 0.54433105 3.06186218 0.00986485 X Variable 2 3.16666667 0.54433105 5.81753814 8.2445E-05 X Variable 3 1.11111111 0.44444444 2.5 0.0279154 X Variable 4-2.16666667 0.54433105-3.98042083 0.0018249 X Variable 5-4.5 0.54433105-8.26702788 2.6823E-06 Exhibit 8.5. onjoint analysis with multiple regression in Excel

8.2 Multiple Regression nalysis 75 One can easily determine the number of parameters to be estimated in a traditional conjoint analysis: # parameters to be estimated = (# levels) (# attributes) + 1 Most good conjoint designs in practice include more observations than parameters to be estimated (usually 1.5 to 3 times more). The design we have featured in this chapter has three times as many cards (observations) as parameters to be estimated. These designs usually lead to more stable estimates of respondent utilities than saturated designs. Only in the smallest of problems (such as our example with three attributes and eight total levels) would we ask people to respond to all possible combinations of attribute levels. Large full-factorial designs are not practical. Fortunately, design catalogs and computer programs are available to find efficient fractionalfactorial designs. Fractional-factorial designs show an efficient subset of the possible combinations and provide enough information to estimate utilities. In our worked example, the standard errors for the color attribute are lower than for brand and price (recall that lower standard errors imply greater precision of the beta estimate). ecause color only has two levels (as compared to three each for brand and price), each color level has more representation within the design. Therefore, more information is provided for each color level than is provided for the three-level attributes.