Supplement 13A: Partial F Test

Purpose of the Partial F Test

For a given regression model, could some of the predictors be eliminated without sacrificing too much in the way of fit? Conversely, would it be worthwhile to add a certain set of new predictors to a given regression model? The partial F test is designed to answer questions such as these by comparing two linear models for the same response variable. The extra sum of squares measures the marginal increase in the error sum of squares when one or more predictors are deleted from a model. Conversely, it measures the marginal reduction in the error sum of squares when one or more predictors are added to a model.

Eliminating Some Predictors

We will start by showing how to assess the effect of eliminating some predictors from a model that contains k predictors. The model containing all the predictors is called the full model:

(13A.1)    Y = β0 + β1 X1 + β2 X2 + ... + βk Xk

A model with fewer predictors is a reduced model. We estimate the linear regression for each of the two models and then look at the error sum of squares (SSE) from the ANOVA table for each model. We use the following notation, assuming that m predictors were eliminated in the reduced model:

Full model SSE:     SSE_Full                   df_Full = n − k − 1
Reduced model SSE:  SSE_Reduced                df_Reduced = n − k − 1 + m
Extra SSE:          SSE_Reduced − SSE_Full     df = (n − k − 1 + m) − (n − k − 1) = m

The partial F test statistic is the ratio of two variances. The numerator is the difference in error sums of squares (the "extra sum of squares") between the two models, divided by the number of predictors eliminated. The denominator is the mean squared error of the full model, that is, SSE_Full divided by its degrees of freedom:

(13A.2)    F = [(SSE_Reduced − SSE_Full) / m] / [SSE_Full / (n − k − 1)]    if m predictors are eliminated

Degrees of freedom for this test will then be (m, n − k − 1). If only one predictor has been eliminated, then m = 1. We can calculate the p-value for the partial F test in Excel using =F.DIST.RT(F, m, n − k − 1).

Illustration: Predicting Used Car Prices (data file: CarPrice)

Table 13A.1 shows a data set consisting of 40 observations on prices of used cars of a particular brand and model (hence controlling for an obviously important factor that would affect prices).

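For readers who prefer to script the test, here is a minimal Python sketch of formula 13A.2 (an illustration added here, not part of the original supplement); the function name partial_f_test is simply an illustrative choice, and scipy's stats.f.sf plays the role of Excel's F.DIST.RT.

```python
from scipy import stats

def partial_f_test(sse_reduced, sse_full, n, k, m):
    """Partial F statistic and p-value for dropping m predictors.

    sse_reduced -- SSE of the reduced model
    sse_full    -- SSE of the full model (k predictors, n observations)
    m           -- number of predictors eliminated
    """
    df_full = n - k - 1                       # error df of the full model
    extra_ss = sse_reduced - sse_full         # the "extra sum of squares"
    f_stat = (extra_ss / m) / (sse_full / df_full)
    p_value = stats.f.sf(f_stat, m, df_full)  # right-tail area, like F.DIST.RT
    return f_stat, p_value
```

The SSE values plugged into this function can be taken straight from the ANOVA tables shown in the examples that follow.
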
The response variable is Y (SellPrice) = sale price of the vehicle (in thousands of dollars). We have observations on three potential predictors:

X1 (Age) = age of car in years
X2 (Mileage) = miles on odometer (in thousands of miles)
X3 (ManTran) = 1 if manual transmission, 0 otherwise

The three predictors are viewed as non-stochastic, independent variables (we can later investigate the latter assumption by looking at VIFs, if we wish).

TABLE 13A.1  Selling Price and Characteristics of 40 Used Cars (data file: CarPrice)

Age (X1)   Mileage (X2)   ManTran (X3)   SellPrice (Y)
   13         148.599          0              0.370
    2          17.367          0             29.810
   13         174.904          0              0.390
  ...             ...        ...                ...
   10         145.886          0             11.210
    8          93.220          0             12.270
    5          75.907          0             19.260

Note: Only the first and last three observations are shown here. The units for SellPrice and Mileage have been adjusted to thousands to improve data conditioning.

Eliminating a Single Predictor

Let us first test whether the single predictor ManTran could be eliminated to achieve a more parsimonious model than using all three predictors. We are comparing two potential linear regression models:

Full model:     SellPrice = β0 + β1 Age + β2 Mileage + β3 ManTran
Reduced model:  SellPrice = β0 + β1 Age + β2 Mileage

Here are the ANOVA tables from these two regressions:

Full Model ANOVA Table
Source        SS            df    MS
Regression    2,334.5984     3    778.1995
Error           199.1586    36      5.5322

Reduced Model ANOVA Table
Source        SS            df    MS
Regression    2,314.3730     2    1,157.1865
Error           219.3840    37        5.9293

The elimination of ManTran increases the sum of squared errors, as you would expect (you have already learned that extra predictors can never decrease R², even if they are not significant). Although the predictor ManTran is contributing something to the model's overall explanatory power (a lower SSE), the question remains whether ManTran is making a statistically significant extra contribution. The calculations are:

Full model:     SSE_Full = 199.1586                 df_Full = n − k − 1 = 40 − 3 − 1 = 36
Reduced model:  SSE_Reduced = 219.3840              df_Reduced = n − k − 1 + m = 40 − 3 − 1 + 1 = 37
Extra SSE:      SSE_Reduced − SSE_Full = 20.2254    df = (n − k − 1 + m) − (n − k − 1) = 1

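The SSE and df values above can be read off the ANOVA output of any regression package. As a hypothetical illustration, the sketch below fits both models with statsmodels, assuming the CarPrice data have been exported to a CSV file named CarPrice.csv with columns named as in Table 13A.1 (the file name and format are assumptions).

```python
import pandas as pd
import statsmodels.formula.api as smf

cars = pd.read_csv("CarPrice.csv")   # assumed file name and format

full = smf.ols("SellPrice ~ Age + Mileage + ManTran", data=cars).fit()
reduced = smf.ols("SellPrice ~ Age + Mileage", data=cars).fit()

# ssr is the residual (error) sum of squares; df_resid is its degrees of freedom
print(full.ssr, full.df_resid)        # should match SSE_Full = 199.1586, df = 36
print(reduced.ssr, reduced.df_resid)  # should match SSE_Reduced = 219.3840, df = 37
```
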
The partial F statistic is

F = [(SSE_Reduced − SSE_Full) / m] / [SSE_Full / (n − k − 1)]
  = [(219.3840 − 199.1586) / 1] / [199.1586 / 36]
  = 20.2254 / 5.5322
  = 3.6559

From Excel, we obtain the p-value =F.DIST.RT(3.6559,1,36) = .0639. Therefore, if we are using α = .05, we would say that the extra sum of squares is not significant (i.e., ManTran does not make a significant marginal contribution). Instead of using the p-value, we could compare F = 3.6559 with F.05(1,36) = F.INV.RT(0.05,1,36) = 4.114 to reach the same conclusion. In effect, the hypotheses we are testing are:

H0: β3 = 0
H1: β3 ≠ 0

The test statistic is not large enough to reject the hypothesis H0: β3 = 0. You may already have realized that if we are only considering the effect of one single predictor, we could reach the same conclusion from its t statistic in the fitted regression of the full model:

Regression output
Variables    Coefficients   Std. Error   t (df=36)   p-value
Intercept       33.7261       0.9994      33.747     7.60E-29
Age             -1.6630       0.2938      -5.660     1.98E-06
Mileage         -0.0584       0.0224      -2.610     .0131
ManTran         -1.6538       0.8650      -1.912     .0639

In the single-predictor case, the partial F test statistic is equal to the square of the corresponding t test statistic in the full model. The t test uses the same degrees of freedom as the denominator of the partial F test, so the p-values will be the same as long as we use a two-tailed t test (which eliminates the sign, so that rejection could occur in either tail):

Predictor ManTran:   t² = (−1.912)² = 3.656
Excel's p-value:     =T.DIST.2T(1.912,36) = .0639

In the case of a single predictor, we could get by without using the partial F test. It is shown here because it illustrates the test in a simple way and reveals the connection between the F and t distributions. An advantage of the t test is that it can also be used to test a one-sided hypothesis (e.g., H1: β3 < 0), which might be relevant in this example (all our predictors seem to have an inverse relationship with a car's selling price).

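As a quick check on these hand calculations, the short sketch below (an added illustration, using only the numbers quoted above) reproduces the partial F statistic, its p-value, and the F = t² relationship with scipy.

```python
from scipy import stats

sse_full, sse_reduced, df_full, m = 199.1586, 219.3840, 36, 1

f_stat = ((sse_reduced - sse_full) / m) / (sse_full / df_full)
p_value = stats.f.sf(f_stat, m, df_full)      # right-tail area, like F.DIST.RT
print(f_stat, p_value)                        # approx. 3.656 and 0.0639

# The square of the ManTran t statistic gives the same test
t = -1.912
print(t**2)                                   # approx. 3.656
print(2 * stats.t.sf(abs(t), df_full))        # approx. 0.0639, like T.DIST.2T
```
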
Eliminating More Than One Predictor

We now turn to the more general case of using the partial F test to assess the effect of eliminating m predictors simultaneously (where m > 1). This can be especially useful when we have a large model with many predictors that we are thinking of eliminating because their effects seem weak in the full model. To test the effect of discarding m predictors at once, the hypotheses are:

H0: All the βj = 0 for a subset of m predictors in the full model
H1: Not all the βj = 0 (at least some of the m coefficients are non-zero)

For example, suppose we want to know whether we can eliminate both Mileage and ManTran at once. The hypotheses are:

H0: β2 = 0 and β3 = 0
H1: One or both coefficients are non-zero

The models to be compared are:

Full model:     SellPrice = β0 + β1 Age + β2 Mileage + β3 ManTran
Reduced model:  SellPrice = β0 + β1 Age

Here are the ANOVA tables from these two regressions:

Full Model ANOVA Table
Source        SS            df    MS
Regression    2,334.5984     3    778.1995
Error           199.1586    36      5.5322

Reduced Model ANOVA Table
Source        SS            df    MS
Regression    2,269.8421     1    2,269.8421
Error           263.9148    38        6.9451

The elimination of both Mileage and ManTran increases the sum of squared errors, as you would expect. The question is whether these two predictors are making a statistically significant extra contribution to reducing the sum of squared errors. The calculations are:

Full model:     SSE_Full = 199.1586                 df_Full = n − k − 1 = 40 − 3 − 1 = 36
Reduced model:  SSE_Reduced = 263.9148              df_Reduced = n − k − 1 + m = 40 − 3 − 1 + 2 = 38
Extra SSE:      SSE_Reduced − SSE_Full = 64.7562    df = (n − k − 1 + m) − (n − k − 1) = m = 2

The partial F statistic is

F = [(263.9148 − 199.1586) / 2] / [199.1586 / 36]
  = 32.3781 / 5.5322
  = 5.8527

From Excel, we obtain the p-value =F.DIST.RT(5.8527,2,36) = .0063. If we are using α = .05, we would say that the extra sum of squares is highly significant (i.e., these two predictors do make a significant marginal contribution). Alternatively, we can compare F = 5.8527 with F.05(2,36) = F.INV.RT(0.05,2,36) = 3.259 to draw the same conclusion.

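The same scipy check works for the two-predictor case (again an added illustration, using the SSE values quoted above).

```python
from scipy import stats

sse_full, sse_reduced, df_full, m = 199.1586, 263.9148, 36, 2

f_stat = ((sse_reduced - sse_full) / m) / (sse_full / df_full)
p_value = stats.f.sf(f_stat, m, df_full)       # right-tail area, like F.DIST.RT
critical = stats.f.isf(0.05, m, df_full)       # 5% critical value, like F.INV.RT
print(f_stat, p_value, critical)               # approx. 5.85, 0.006, 3.26
```
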
Adding Predictors

We have been discussing eliminating predictors. The calculations for adding predictors to a linear model are similar if we define the full model as the big model (more predictors) and the reduced model as the small model (fewer predictors). The extra sum of squares is still the difference between the two error sums of squares:

(13A.3)    F = [(SSE for small model − SSE for big model) / (number of extra predictors)] / [SSE for big model / (n − k − 1)]

where k is the number of predictors in the big model.

More Complex Models

We can use variations on these partial F tests based on error sums of squares for other purposes. For example, we can test whether two coefficients in a model are the same (e.g., β2 = β3) or calculate the effect of any given predictor given the presence of other sets of predictors in the model (using the coefficient of partial determination). Such tests are ordinarily reserved for more advanced classes in statistics, and may entail using more specialized software.

Full Results for Car Data (data file: CarPrice)

To allow you to explore the car data on your own, full results are shown below for the full model based on the used car data. SellPrice is negatively affected by Age and Mileage (both highly significant) and marginally by ManTran (p-value significant at α = .10 but not at α = .05). You can also look at the data file and do your own regressions.

Regression Analysis
R²             0.921
Adjusted R²    0.915
R              0.960
Std. Error     2.352
n              40
k              3
Dep. Var.      SellPrice

ANOVA table
Source        SS            df    MS         F        p-value
Regression    2,334.5984     3    778.1995   140.67   6.17E-20
Residual        199.1586    36      5.5322

Regression output
Variables    Coefficients   Std. Error   t (df=36)   p-value     95% lower   95% upper   VIF
Intercept       33.7261       0.9994      33.747     7.60E-29     31.6993     35.7530
Age             -1.6630       0.2938      -5.660     1.98E-06     -2.2589     -1.0671    6.384
Mileage         -0.0584       0.0224      -2.610     .0131        -0.1038     -0.0130    6.371
ManTran         -1.6538       0.8650      -1.912     .0639        -3.4081      0.1004    1.014

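If you do those regressions in Python, the nested-model comparison can also be obtained in a single call. The sketch below (which again assumes a hypothetical CarPrice.csv export with the Table 13A.1 column names) uses statsmodels' anova_lm, which reports the extra sum of squares, the partial F statistic, and its p-value directly.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

cars = pd.read_csv("CarPrice.csv")   # assumed file name and format, as before

full = smf.ols("SellPrice ~ Age + Mileage + ManTran", data=cars).fit()
reduced = smf.ols("SellPrice ~ Age", data=cars).fit()

# Rows: reduced model, then full model; columns include ss_diff (extra SS), F, Pr(>F)
print(sm.stats.anova_lm(reduced, full))
```
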
It appears that as a car ages, it loses about $1,663 in value per year (ceteris paribus). Similarly, for each extra thousand miles driven, a car loses on average about $58. Cars with manual transmission seem to sell for about $1,654 less than those with automatic transmission (remember, the brand and model are already controlled). There is evidence of multicollinearity between Age and Mileage, which would be expected (as cars get older, they accumulate more miles). This would require further consideration by the analyst.

Section Exercises

13A.1  Instructions: Use α = .05 in all tests.
(a) Perform a full linear regression to predict ColGrad% using all eight predictors in DATA SET E. State the SSE and df for the full model.
(b) Fit a reduced linear regression model by eliminating the predictor Age. State the SSE and df for the reduced model.
(c) Calculate the partial F test statistic to see whether the predictor Age was significant.
(d) Calculate the p-value for the partial F test. What is your conclusion?
(e) Does your conclusion from the partial F test agree with the test using the t statistic in the full model regression?
(f) Fit a reduced regression model by eliminating the two predictors Age and Seast simultaneously. State the SSE and df for the reduced model.
(g) Calculate the partial F test statistic to see whether the predictors Age and Seast can both be eliminated. State your conclusion.

References

Kutner, Michael H.; Christopher J. Nachtsheim; and John Neter. Applied Linear Regression Models. 4th ed. McGraw-Hill/Irwin, 2004, pp. 256–271.