Qualitative Choice Analysis Workshop 76 LECTURE / DISCUSSION. Hypothesis Testing



Similar documents
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Generalized Linear Models

11. Analysis of Case-control Studies Logistic Regression

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Multiple Choice Models II

Introduction to Regression and Data Analysis

Simple Linear Regression Inference

How To Check For Differences In The One Way Anova

Factors affecting online sales

A Primer on Forecasting Business Performance

Statistical Models in R

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

SAS Software to Fit the Generalized Linear Model

Tests for Two Proportions

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

Part 2: Analysis of Relationship Between Two Variables

Chapter 3 Local Marketing in Practice

Use of deviance statistics for comparing models

Likelihood: Frequentist vs Bayesian Reasoning

Discrete Choice Analysis II

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Unit 26 Estimation with Confidence Intervals

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Online Appendix to Are Risk Preferences Stable Across Contexts? Evidence from Insurance Data

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Introduction to Hypothesis Testing OPRE 6301

Penalized regression: Introduction

Causal Forecasting Models

Elementary Statistics

Association Between Variables

LOGIT AND PROBIT ANALYSIS

STAT 350 Practice Final Exam Solution (Spring 2015)

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Introduction to Hypothesis Testing

Example: Boats and Manatees

How to Discredit Most Real Estate Appraisals in One Minute By Eugene Pasymowski, MAI 2007 RealStat, Inc.

Recall this chart that showed how most of our course would be organized:

Independent samples t-test. Dr. Tom Pierce Radford University

STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Sample Size and Power in Clinical Trials

Chapter 7: Simple linear regression Learning Objectives

Fair Bets and Profitability in College Football Gambling

Introduction to Quantitative Methods

The NCAA Basketball Betting Market: Tests of the Balanced Book and Levitt Hypotheses

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

One-Way Analysis of Variance (ANOVA) Example Problem

Statistical tests for SPSS

Solución del Examen Tipo: 1

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Week TSX Index

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Regression Analysis: A Complete Example

Elasticity. I. What is Elasticity?

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Statistics 2014 Scoring Guidelines

Binary Diagnostic Tests Two Independent Samples

Using Excel for inferential statistics

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Multinomial and Ordinal Logistic Regression

Descriptive Statistics

Module 5: Multiple Regression Analysis

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Evaluation & Validation: Credibility: Evaluating what has been learned

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Regression and Correlation

Network quality control

Statistics in Retail Finance. Chapter 2: Statistical models of default

Study Guide for the Final Exam

Motor and Household Insurance: Pricing to Maximise Profit in a Competitive Market

STATISTICA Formula Guide: Logistic Regression. Table of Contents

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Two-sample inference: Continuous data

Estimating Industry Multiples

Lecture Notes Module 1

SAMPLING & INFERENTIAL STATISTICS. Sampling is necessary to make inferences about a population.

Power and sample size in multilevel modeling

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

August 2012 EXAMINATIONS Solution Part I

Statistics Review PSY379

Annex 6 BEST PRACTICE EXAMPLES FOCUSING ON SAMPLE SIZE AND RELIABILITY CALCULATIONS AND SAMPLING FOR VALIDATION/VERIFICATION. (Version 01.

Hypothesis testing - Steps

5.1 Identifying the Target Parameter

3. Mathematical Induction

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Section 13, Part 1 ANOVA. Analysis Of Variance

Simple linear regression

Principle of Data Reduction

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

6.3 Conditional Probability and Independence

Transcription:

Qualitative Choice Analysis Workshop 76 LECTURE / DISCUSSION Hypothesis Testing

Qualitative Choice Analysis Workshop 77 T-test Use to test value of one parameter. I. Most common application: to test whether a variable enters significantly. Model: V TOU = -"BTOU + $INC + c V ST = -"BST where BTOU = time-fo-use billing; BST = standard billing; and INC = income. Hypothesis: " = 0 t ' Estimated Standard error of estimated If t exceeds critical value (usually 1.96), then REJECT hypothesis. Y " is not zero. BTOU has a significant effect on person s choice. If t is less than critical value, then CANNOT REJECT hypothesis. Y True " might be zero. BTOU does not enter significantly.

Qualitative Choice Analysis Workshop 78 II. More general application: to test whether the coefficient of a variable equals some particular value. More generally, hypothesis: " = a t ' Estimated & a Standard error of estimated If t exceeds critical value, then hypothesis that " = a is rejected. If t is less than critical value, then hypothesis that " = a cannot be rejected.

Qualitative Choice Analysis Workshop 79 Likelihood Ratio Test 1. Hypothesis is a restriction. Specify the unrestricted and the restricted models. 2. Estimate the unrestricted model and note the value of the log likelihood function, say LLU. 3. Estimate the restricted model, constraining parameters to their hypothesized values, and note the value of the log likelihood function, say LLR. 4. Compute the test statistic LRT = 2(LLU - LLR). This must be positive because LLU $ LLR. 5. Under the null hypothesis that the restrictions are true, the LRT statistic is distributed as a chi-squared random variable with degrees of freedom equal to the number of restrictions in the hypothesis. 6. Compare the statistic with the critical value of the chi-squared distribution.

Qualitative Choice Analysis Workshop 80 Critical Values of Chi-Squared Degrees of Freedom Significance Level 10% 5% 1 2.71 3.84 2 4.61 5.99 3 6.25 7.81 4 7.78 9.49 5 9.24 11.07 6 10.64 12.59 7 12.02 14.07 8 13.36 15.51 9 14.68 16.92 10 15.99 18.31 15 22.31 25.00 20 28.41 31.41 40 51.81 55.76

Qualitative Choice Analysis Workshop 81 Example V TOU = -"BTOU + $INC + c V ST = -"BST 1. Test whether a parameter equals zero. Hypothesis: " = 0. C Run model with all explanatory variables: ivalt [bill: btou bst \ inctou: inc zero \ toudum: one zero] This is an unrestricted model. LLU = -376.4. C Run model with BILL omitted; that is, with: ivalt [inctou: inc zero \ toudum: one zero] This is the restricted model. LLR = -384.2. C Test statistic 2(384.2-376.4) = 2(7.8) = 15.6. C C Critical value of chi-squared with one degree of freedom is 3.84 (at 95% confidence). Because test statistic exceeds critical value, hypothesis is REJECTED.

Qualitative Choice Analysis Workshop 82 2. Test whether two parameters both equal zero. Hypothesis: " = $ = 0 C C Run model with BILL, INCTOU, and TOUDUM as explanatory variables. This is the unrestricted model. LLU = -376.4. Run model with BILL and INCTOU omitted; that is, with only TOUDUM. This is the restricted model. LLR = -397.2. C Test statistic: 2(397.2-376.4) = 20.8. C C Critical value of chi-squared with two degrees of freedom is 5.99. Because the test statistic exceeds the critical value, the hypothesis is REJECTED.

Qualitative Choice Analysis Workshop 83 3. Test whether two parameters are equal to each other. Unrestricted model: V TOU = " TOU BTOU + $INC + c V ST = " ST BST Hypothesis: " TOU = " ST C Run model with BTOU, BST, INCTOU, and TOUDUM as explanatory variables: ivalt [btou: btou zero \ bst: zero bst \ inctou: inc zero \ toudum: one zero] Get: LLU = -372.1. C Run model with BILL, INCTOU, TOUDUM: ivalt [ bill: btou bst \ inctou: inc zero \ toudum: one zero] Get: LLR = -376.4. C Test statistic: 2(376.4-372.1) = 8.6. C Critical value: 3.84. C REJECT hypothesis.

Qualitative Choice Analysis Workshop 84 4. Test whether parameters are the same for two subpopulations. Run model on subsample A; get LLA = -162.1. Run model on subsample B; get LLB = -206.1. Calculate the log likelihood for the unrestricted model: LLU = LLA + LLB = -162.1-206.1 = -368.2 Run model on full sample. This is restricted model. Get LLR = -372.1. There are four parameters in the restricted model and eight parameters in the unrestricted model. The number of restrictions is therefore four. Test statistic is 2(372.1-368.2) = 7.8. The critical value of chi-squared with four degrees of freedom is 9.49. ACCEPT hypothesis.

Qualitative Choice Analysis Workshop 85 WORKSHOP ON HYPOTHESIS TESTING

Qualitative Choice Analysis Workshop 86 Heating and Cooling System Choice Alternatives: 1. Gas central heat with cooling. 2. Electric central resistance heat with cooling. 3. Electric room resistance heat with cooling. 4. Electric heat pump (heating and cooling). 5. Gas central heat without cooling. 6. Electric central resistance without cooling. 7. Electric room resistance without cooling. Dependent variable: depvar Data set: nested1.sav Explanatory variables: ich1,...,ich7 icca och1,...,och7 occa income installed cost for heating portion of system installed cost for cooling portion of system (Note: ic1 = ich1 + icca = installed cost for alt. 1, etc. for alts. 2-4. ic5 = ich5 = installed cost for alt. 5, etc. for alts. 6-7, since these systems have no cooling.) annual operating cost for heating portion of system annual operating cost for cooling portion of system (Note: operating cost for heating and cooling combined is calculated analogously to installation cost.) income of household

Qualitative Choice Analysis Workshop 87 1. Estimate a basic model that includes: C full installation cost of alternative (including heating, and, if available, cooling) C full operating cost of alternatives C income entering the room systems (alts. 3 and 7) C income entering the systems with cooling (alts. 1-4) C a constant for the systems with cooling (alts. 1-4) The file hypo.cmd will estimate this model. How do you interpret the two income variables and their estimated coefficients? That is, what do these variables seem to capture or indicate?

Qualitative Choice Analysis Workshop 88 2. Test the hypothesis that the discount rate is 0.20. Recall that LF = IC + (1/r)OC where r is the discount rate. With r =.20, we have LF = IC + 5 * OC. So: a test of the hypothesis that r =.20 is equivalent to a test of the hypothesis that the coefficient of operating cost is five times greater than that of installation cost.

Qualitative Choice Analysis Workshop 89 3. Test the hypothesis that people consider the costs for heating and cooling equally in their choice of HVAC systems. That is, consider a model V i = $ICH i + 8ICCA i + 2OCH i + NOCCA i + other terms The hypothesis to be tested is that $ = 8 and 2 = N.

Qualitative Choice Analysis Workshop 90 DISCUSSION OF WORKSHOP RESULTS

Qualitative Choice Analysis Workshop 91 LECTURE / DISCUSSION Independence from Irrelevant Alternatives

Qualitative Choice Analysis Workshop 92 Independence from Irrelevant Alternatives e V in P in P kn ' j e V jn e V kn ' e V in e V kn ' e V in &V kn j e V jn This ratio does not depend on existence or attributes of alternatives other than i or k, as long as each V in depends only on the attributes of alternative i and/or the characteristics of the subject, and not on the attributes of other alternatives.

Qualitative Choice Analysis Workshop 93 Practical Advantages of IIA 1. Allows sampling of alternatives for estimation. When the choice set is large, data collection and computation costs can be reduced by sampling a subset of alternatives. For many sampling procedures, the parameters of an MNL model for the sample of alternatives are the same as the parameters of an MNL model for the full choice set. 2. Allows forecasting demand for a new alternative. The demand for a new alternative can be forecast in the MNL model simply by adding an e v term for the new alternative to the denominator of the probability. (It must be the case that the estimated parameters are sufficient to calculate the v for the new alternative.) Further reading: The following two pages give further details.

Qualitative Choice Analysis Workshop 94 Sampling of Alternatives for Estimation Suppose C n is the choice set for subject n, P in = e V in j e V jn j0c n is the MNL choice probability for this subject, and i n is the observed choice. Suppose the analyst selects a subset of alternatives A n for this subject which will be used as a basis for estimation; i.e., if i n is contained in A n, then the subject is treated as if the choice were actually being made from A n, and if i n is not contained in A n, the observation is discarded. In selecting A n, the analyst may use the information on which alternative i n was chosen, and may also use information on the variables that enter the determination of the V in. The rule used by the analyst for selecting A n is summarized by a probability B(A*i n,v s) that subset A of C n is selected, given the observed choice i n and the observed V s (or, more precisely, the observed variables behind the V s ). The selection rule is a uniform conditioning rule if for the selected A n, it is the case that B(A n *j,v s) is the same for all j in A n. Examples of uniform conditioning rules are (1) select (randomly or purposively) a subset A n of C n without taking into account what the observed choice or the V s are, and keep the observation if and only if i n is contained in the selected A n ; and (2) given observed choice i n, select m-1 of the remaining alternatives at random from C n, without taking into account the V s. An implication of uniform conditioning is that in the sample containing the pairs (i n,a n ) for which i n is contained in A n, the probability of observed response i n conditioned on A n is P in An ' e V in j e V jn j0a n. This is just a MNL model that treats choice as if it were being made from A n rather than C n, so that maximum likelihood estimation of this model for the sample of (i n,a n ) with i n 0 A n estimates the same parameters as does maximum likelihood estimation on data from the full choice set. Then, this sampling of alternatives cuts down data collection time (for alternatives not in A n ) and computation size and time, but still gives consistent estimates of parameters for the original problem.

Qualitative Choice Analysis Workshop 95 Forecasting Demand for a New Alternative Suppose choice among alternatives in a set C n is observed for a sample, and that choice in this sample can be described by a MNL model. Suppose this sample is used to estimate the parameters of the logit model, so that one has an estimated model P in = e V in j e V jn j0c n that describes the historical choice behavior. Now suppose a new alternative "0" were to be added to C n, so that the subject s choice set B n would be the union of C n and the new alternative "0", B n = {0}cC n. Suppose that the parameters estimated on the historical data are sufficient so that you can calculate the representative utility V 0 of the new alternative. If IIA holds, then the choice probabilities from B n would be P in Bn ' e V 0n e V in % jj0cn e V jn / e V 0n e V in % e I n / P in C n @ P(C n B n ) if i ' C n 1 & P(C n B n ) if i ' 0 where I n ' log j e V jn j0cn (inclusive value for set C n ) P(C n B n ) ' e V 0n e I n % e I n (probability of choosing C n from B n ) Note that if V 0n cannot be calculated completely from the historically estimated parameters, then additional information or assumptions must be introduced to make the forecast. This is a problem if there are alternative-specific intercepts or slope coefficients in the model. Hence, for forecasting, one should try first to describe historical behavior using only generic (i.e., non-alternative-specific) variables, and introduce alternative-specific variables only if they are essential to explain observed behavior and the attributes of the new alternative.

Qualitative Choice Analysis Workshop 96 Red Bus / Blue Bus Problem Two alternatives orginially: car, blue bus Suppose: V c = V bb Then: P c ' P bb ' 1 2 P c P bb ' 1 Now add a third alternative, red bus, which is exactly like the blue bus. Logit predicts: P c = P bb = P rb = 1/3 In reality: P c = 1/2, P bb = P rb = 1/4 P bb P rb ' 1, because bb and rb are the same P c P bb ' 1, by IIA The validity of IIA in an application is an empirical question that can be answered by a specification of the following hypothesis: The IIA property is a sufficiently good approximation to the true data generation process so that patterns of choice are consistent with the MNL model.

Qualitative Choice Analysis Workshop 97 Substitution Patterns IIA implies proportional substitution. Example: Impact of rebates on heat pumps. Alternative Original Probability Logit Forecasts 1.600.54 (-10%) 2.055.0495 (-10%) 3.002.0018 (-10%) 4.033.133 5.080.072 (-10%) 6.010.009 (-10%) 7.220.198 (-10%) 8 P 5 /P 6 8 8 In reality, substitution is rarely proportional.