Interaction between quantitative predictors




In a first-order model like the ones we have discussed, the association between E(y) and a predictor xj does not depend on the values of the other predictors in the model. See Fig. 4.1: the relation between E(y) and x1 is the same regardless of the value of x2, so all the prediction lines are parallel. If, however, the association between the response and one of the predictors depends on the value of the other predictors, then a first-order model is no longer appropriate. We say that there is an interaction among the predictors.

Stat 328 - Fall 2004

Interaction (cont'd)

Example: a company wishes to estimate the association between sales of a beauty product (y) and two potential predictors of sales in each of n markets: $ spent on daytime TV ads in the ith market (x1) and the average number of years of education of women in the ith market. Intuitively, this is what we would expect: advertising expenses will tend to increase sales (up to a point), but in cities where women are highly educated (on average), fewer of them will be watching TV during the day. The effect of ad dollars on sales may therefore also depend on the education of potential consumers.

Interaction (cont'd)

A figure representing the association between ads and sales at different levels of education will be drawn in class. How do we include an interaction term in the model? With k = 2 predictors:

y_i = β0 + β1 x1i + β2 x2i + β3 x1i x2i + ε_i,

where the assumptions about the model are the same as before. An interaction between two predictors is a second-order term in the model.
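The defining feature of this model can be seen in a few lines of code: the slope in x1 is b1 + b3*x2, so prediction lines at different values of x2 are no longer parallel. This is only a sketch; the coefficient values below are hypothetical, chosen for illustration.

```python
# Mean response under the two-predictor interaction model:
# E(y) = b0 + b1*x1 + b2*x2 + b3*x1*x2  (coefficients hypothetical).

def mean_response(x1, x2, b0, b1, b2, b3):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def slope_in_x1(x2, b1, b3):
    # Change in E(y) per unit increase in x1, holding x2 fixed: b1 + b3*x2.
    return b1 + b3 * x2

# With b3 != 0 the slope in x1 shifts with x2, so the lines are not parallel:
print(slope_in_x1(0, 3.0, -0.2))   # 3.0 at x2 = 0
print(slope_in_x1(10, 3.0, -0.2))  # 1.0 at x2 = 10
```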

Interaction (cont d) In sales example, we would expect that β 3 < 0: as education increases (and more women are out working), the strength of the association between daytime TV ads on sales decreases. In other words, daytime ads are expected to be more effective in markets where more women are at home watching TV during the day than in markets where most women are not watching TV. In general, with k predictors, we can include pairwise interactions between any two, as appropriate. Higher order interactions (e.g. x j x l x t denoting the three-way interaction between the jth, lth and tth predictors) can also be included in the model, but are much harder to interpret from a subject matter point of view. Stat 328 - Fall 2004 4

Interaction (cont'd)

When predictors interact, the interpretation of all the β's changes. If the model is

y_i = β0 + β1 x1i + β2 x2i + β3 x1i x2i + ε_i,

then β0 is still interpreted as before; (β1 + β3 x2) is the change in E(y) when x1 increases by one unit and x2 is held fixed; and (β2 + β3 x1) is the change in E(y) when x2 increases by one unit and x1 is held fixed. The association between E(y) and x1 depends on the level of x2 unless β3 = 0, in which case there is no interaction.

Interaction (cont'd)

In the sales example, suppose we find that b0 = 5, b1 = 3, b2 = 0.5, b3 = -0.2. Interpretation? The number of units sold can be expected to change by 3 - 0.2 x2 when ad expenses increase by $1, given education. The number of units sold can be expected to change by 0.5 - 0.2 x1 when the education of potential customers increases by one year, given ad expenditures. In a market with 12 years of average education, we expect sales to increase by 3 - 0.2(12) = 0.6 units if ad expenditures increase by $1. In a market with average education equal to 8 years, an additional $1 spent on daytime ads would be associated with an increase of about 3 - 0.2(8) = 1.4 units in expected sales.
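The worked numbers above can be checked directly. A minimal sketch using the example's fitted values b1 = 3, b2 = 0.5, b3 = -0.2:

```python
# Marginal effects in the fitted interaction model for the sales example.
b1, b2, b3 = 3.0, 0.5, -0.2

def ad_effect(x2):
    # Expected change in sales per extra $1 of ads, at average education x2.
    return b1 + b3 * x2

def education_effect(x1):
    # Expected change in sales per extra year of education, at ad spending x1.
    return b2 + b3 * x1

print(round(ad_effect(12), 1))  # 0.6 units at 12 years of education
print(round(ad_effect(8), 1))   # 1.4 units at 8 years of education
```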

Interaction (cont'd)

How do we draw inferences in models with interaction terms? The steps are the same as in any multiple regression model:

1. Do a global F test of the utility of the model. The null hypothesis is H0: β1 = β2 = ... = βk = 0, tested against the alternative that at least one of the β's is different from 0.

2. If the F test leads to rejection of H0, do a t test on each of the β's associated with interaction terms.

3. If the interaction between xj and xk is significant, do not test hypotheses for βj and βk individually; if the interaction is important, the individual x's must be important too (some statisticians would argue differently here).

Second-order model with quadratic predictors

Sometimes the association between E(y) and xj is not linear but quadratic. A second-order model with one predictor is:

y_i = β0 + β1 x1i + β2 x1i^2 + ε_i.

If β2 > 0, the association is concave upwards (bowl shape); if β2 < 0, it is concave downwards (mound shape). β2 is known as the rate-of-curvature parameter.
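The sign rule for β2 can be verified numerically: the second difference of a quadratic is constant and equal to 2β2. A small sketch with hypothetical coefficients:

```python
# Second difference of E(y) = b0 + b1*x + b2*x^2 equals 2*b2 everywhere,
# so b2 > 0 means concave up (bowl) and b2 < 0 concave down (mound).

def quad(x, b0, b1, b2):
    return b0 + b1 * x + b2 * x**2

def second_difference(b0, b1, b2, x=0.0, h=1.0):
    return quad(x + h, b0, b1, b2) - 2 * quad(x, b0, b1, b2) + quad(x - h, b0, b1, b2)

print(second_difference(1.0, 2.0, 0.5))   # 1.0 = 2*b2 > 0: bowl shape
print(second_difference(1.0, 2.0, -0.5))  # -1.0 = 2*b2 < 0: mound shape
```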

Quadratic predictors - Example

Example 4.6, page 198. Data: y is immunoglobulin in blood (an indicator of immunity, measured in milligrams) and x is maximal oxygen uptake (an indicator of fitness, in ml/kg), measured on 30 individuals. Range: x in (32, 70). See the scatter plot of the data. Model:

y_i = β0 + β1 x_i + β2 x_i^2 + ε_i,

with the usual assumptions.

Quadratic predictors - Example

Results: b0 = -1,464, b1 = 88.3 and b2 = -0.54, so that the prediction equation is

ŷ = -1,464 + 88.3x - 0.54x^2.

R^2_a = 0.93, so about 93% of the variability observed in immunoglobulin can be associated with fitness. Interpretation of coefficients: The intercept is meaningless (we cannot have negative immunoglobulin). b1 no longer has a simple interpretation; it is NOT the expected change in y when x increases by one unit. The quadratic coefficient b2 is negative: the response curves downwards as x increases.

Quadratic predictors - Example

Be cautious with extrapolations! See Fig. 4.16. The concavity of the response implies that for large enough x, E(y) will begin to decrease, which makes no sense from a physiological point of view. Nonsensical predictions may occur if the model is used outside the range of the data!

Quadratic predictors - Example

The first test of hypotheses is the F test for the entire model. We test H0: β1 = β2 = 0 against Ha: at least one of the two is different from 0. In this example, F = 203.16, which we know will be larger than the critical value even without looking at the table. We reject H0: maximal oxygen uptake contributes information about immunoglobulin levels in the blood. The next step is to decide whether the curvature is important or not.

Quadratic predictors - Example

We now test for the significance of the quadratic effect: H0: β2 = 0 against Ha: β2 ≠ 0 (a one-tailed test is also possible). The t statistic is t = b2 / se(b2) = -0.536/0.158 = -3.39, which we compare to a table value with α/2 = 0.025 and n - 3 degrees of freedom. We reject H0. Interpretation: there is strong evidence that immunoglobulin levels increase more slowly per unit increase in maximal oxygen uptake in individuals with high aerobic fitness than in those with low aerobic fitness. If we had failed to reject H0: β2 = 0, we would conclude that the association between y and x is linear.
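The t statistic is just the estimate divided by its standard error; a one-line check using the values reported above:

```python
# t statistic for H0: beta2 = 0 in the quadratic model,
# using the reported estimate and standard error.
b2, se_b2 = -0.536, 0.158
t_stat = b2 / se_b2
print(round(t_stat, 2))  # -3.39
```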

Estimation and prediction

Same concepts as before. With the model we might wish to:

1. Estimate the expected mean value of the response at a certain value of the predictor(s).

2. Predict a single response for some value of the predictor.

In both cases, the point estimator (predictor) is ŷ = b0 + b1 x_p + b2 x_p^2 for x = x_p. The standard error of ŷ depends on whether we predict a mean or a single value; as before, σ̂(y - ŷ) > σ̂_ŷ. The calculations are complex, so we use the computer to get these standard errors and CIs.

Estimation and prediction

In the example, suppose we wish to obtain:

1. The expected mean immunoglobulin level for people with oxygen uptake x_p = 40 ml/kg.

2. The predicted immunoglobulin level for a single person with x_p = 40 ml/kg.

In both cases, the point estimate is ŷ = -1,464.4 + 88.3(40) - 0.536(40)^2 ≈ 1,209.9. JMP and SAS will give the (1 - α)% CI for the mean or for a single prediction.
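The point estimate can be reproduced by plugging x_p = 40 into the fitted quadratic; the reported 1,209.9 differs slightly from the value below only because of coefficient rounding.

```python
# Fitted quadratic from Example 4.6, evaluated at x = 40 ml/kg.
def yhat(x):
    return -1464.4 + 88.3 * x - 0.536 * x**2

print(round(yhat(40), 1))  # 1210.0 with these rounded coefficients (reported as 1,209.9)
```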

Estimation and prediction

From the CI we can derive σ̂_ŷ or σ̂(y - ŷ) by recalling that for a (1 - α)% CI,

Lower bound = ŷ - t(α/2, n-k-1) × (std error),

so that

Std error = (ŷ - Lower bound) / t(α/2, n-k-1).

We can also derive the std error from the upper bound of the CI:

Std error = (Upper bound - ŷ) / t(α/2, n-k-1).
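This back-solving is easy to sketch as a small helper. The numbers below are the example's 95% CI for the mean response at x = 40, with t(.025, 27) = 2.052 (n = 30 observations, k = 2 predictor terms):

```python
# Back out the standard error of yhat from a (1 - alpha) confidence interval:
# lower = yhat - t * se  and  upper = yhat + t * se.
def se_from_ci(yhat, lower, t):
    return (yhat - lower) / t

# 95% CI for the mean immunoglobulin level at x = 40 ml/kg:
print(round(se_from_ci(1209.9, 1156.2, 2.052), 2))  # 26.17
```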

Estimation and prediction

In the example, the 95% CI for the mean immunoglobulin level at x = 40 ml/kg is (1,156.2, 1,263.6). Then:

σ̂_ŷ = (1,209.9 - 1,156.2)/2.052 = 26.17.

Also, since the 95% CI for a single response is (985.0, 1,434.8):

σ̂(y - ŷ) = (1,209.9 - 985.0)/2.052 = 109.6.

More complex models: interaction + curvature

Consider the following complete second-order model with two predictors (see Fig. 4.19):

y = β0 + β1 x1 + β2 x2 + β3 x1 x2 + β4 x1^2 + β5 x2^2 + ε.

A complete second-order model with three predictors includes 3 first-order terms, 3 squared terms, 3 two-way interactions, and 1 three-way interaction. The number of terms in complete models gets out of hand quickly, and samples are often not large enough to fit all possible terms. Use subject-matter knowledge to decide which terms to include.
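To see how quickly these models grow, here is a small sketch that enumerates the first-order, squared, and pairwise-interaction terms for k predictors (higher-order interactions excluded):

```python
# Enumerate the terms of a complete second-order model in k predictors
# (first-order, squared, and two-way interaction terms only).
from itertools import combinations

def second_order_terms(k):
    xs = [f"x{j}" for j in range(1, k + 1)]
    squares = [f"{x}^2" for x in xs]
    pairs = [f"{a}*{b}" for a, b in combinations(xs, 2)]
    return xs + squares + pairs

print(second_order_terms(2))       # ['x1', 'x2', 'x1^2', 'x2^2', 'x1*x2']
print(len(second_order_terms(5)))  # 20 terms, before any three-way interactions
```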

More complex models: Example

Example 4.7, page 213: a study to determine whether the weight of a package (x1) and the distance delivered (x2) are associated with shipping cost (y) in a small regional express delivery service. See the scatter plots. A complete second-order model was fitted with JMP; the data set Express is on the class web site. Results: see output.

More complex models: Example

Interpretation of results: Since RMSE = 0.44, about 95% of shipping costs will fall within $0.89 (about two RMSEs) of their predicted values. R^2_a = 0.99: almost all of the variability in shipping costs can be explained by the model. The F statistic is 458.39 on 5 and 14 df: highly significant, so the model is useful. Weight is associated with cost both linearly and quadratically; distance only linearly. The interaction between weight and distance is positive: the effect of weight on cost is not independent of distance.