# Supplement 13A: Partial F Test


## Purpose of the Partial F Test

For a given regression model, could some of the predictors be eliminated without sacrificing too much in the way of fit? Conversely, would it be worthwhile to add a certain set of new predictors to a given regression model? The partial F test is designed to answer questions such as these by comparing two linear models for the same response variable. The *extra sum of squares* measures the marginal increase in the error sum of squares when one or more predictors are deleted from a model; conversely, it measures the marginal reduction in the error sum of squares when one or more predictors are added to a model.

## Eliminating Some Predictors

We will start by showing how to assess the effect of eliminating some predictors from a model that contains k predictors. The model containing all the predictors is called the *full model*:

(13A.1)  Y = β0 + β1X1 + β2X2 + ⋯ + βkXk + ε

A model with fewer predictors is a *reduced model*. We estimate the linear regression for each of the two models, and then look at the error sum of squares (SSE) from the ANOVA table for each model. Assuming that m predictors were eliminated in the reduced model, we use the following notation:

- Full model: SSE_Full with df_Full = n − k − 1
- Reduced model: SSE_Reduced with df_Reduced = n − k − 1 + m
- Extra SSE: SSE_Reduced − SSE_Full with df = (n − k − 1 + m) − (n − k − 1) = m

The partial F test statistic is the ratio of two variances. The numerator is the difference in error sums of squares between the two models (the extra sum of squares), divided by the number of predictors eliminated. The denominator is the mean squared error of the full model, that is, SSE_Full divided by its degrees of freedom:

(13A.2)  F = [(SSE_Reduced − SSE_Full) / m] / [SSE_Full / (n − k − 1)]   if m predictors are eliminated

Degrees of freedom for this test will then be (m, n − k − 1). If only one predictor has been eliminated, then m = 1. We can calculate the p-value for the partial F test in Excel using =F.DIST.RT(F, m, n−k−1).
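The arithmetic of the partial F statistic is easy to script. Here is a minimal sketch in Python, assuming only that the two SSE values have been read off the ANOVA tables; `scipy.stats.f.sf` gives the right-tail F probability, the counterpart of Excel's =F.DIST.RT. The SSE values in the example call are made up purely for illustration.

```python
from scipy import stats

def partial_f_test(sse_full, sse_reduced, n, k, m):
    """Partial F test: m of the full model's k predictors are dropped.

    Returns (F, p), where p matches Excel's =F.DIST.RT(F, m, n-k-1).
    """
    extra_ss = sse_reduced - sse_full               # the extra sum of squares
    f_stat = (extra_ss / m) / (sse_full / (n - k - 1))
    p_value = stats.f.sf(f_stat, m, n - k - 1)      # right-tail area
    return f_stat, p_value

# Hypothetical SSE values for illustration (not from the CarPrice data):
f_stat, p_value = partial_f_test(sse_full=100.0, sse_reduced=110.0, n=40, k=3, m=1)
```

With these made-up inputs the statistic is (10/1)/(100/36) = 3.6 on (1, 36) degrees of freedom.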
## Illustration: Predicting Used Car Prices (data file CarPrice)

Table 13A.1 shows a data set consisting of 40 observations on prices of used cars of a particular brand and model (hence controlling for an obviously important factor that would affect prices).

The response variable is Y (SellPrice) = sale price of the vehicle (in thousands of dollars). We have observations on three potential predictors: X1 (Age) = age of the car in years, X2 (Mileage) = miles on the odometer (in thousands of miles), and X3 (ManTran) = 1 if manual transmission, 0 otherwise. The three predictors are viewed as non-stochastic, independent variables (we can later investigate the latter assumption by looking at VIFs, if we wish).

TABLE 13A.1 Selling Price and Characteristics of 40 Used Cars (columns: Y = SellPrice, X1 = Age, X2 = Mileage, X3 = ManTran). Note: Only the first and last three observations are shown here. The units for SellPrice and Mileage have been adjusted to thousands to improve data conditioning.

## Eliminating a Single Predictor

Let us first test whether the single predictor ManTran could be eliminated to achieve a more parsimonious model than one using all three predictors. We are comparing two potential linear regression models:

Full model: SellPrice = β0 + β1 Age + β2 Mileage + β3 ManTran
Reduced model: SellPrice = β0 + β1 Age + β2 Mileage

The elimination of ManTran increases the sum of squared errors, as you would expect (you have already learned that extra predictors can never decrease R², even if they are not significant). Although the predictor ManTran is contributing something to the model's overall explanatory power (a smaller SSE), the question remains whether ManTran is making a statistically significant extra contribution. Using the SSE values from the two ANOVA tables, the calculations are:

Full model: SSE_Full with df_Full = n − k − 1 = 40 − 3 − 1 = 36
Reduced model: SSE_Reduced with df_Reduced = n − k − 1 + m = 36 + 1 = 37
Extra SSE: SSE_Reduced − SSE_Full with df = (n − k − 1 + m) − (n − k − 1) = 1

F = [(SSE_Reduced − SSE_Full) / 1] / [SSE_Full / 36] = 3.656

From Excel, we obtain the p-value =F.DIST.RT(3.656, 1, 36) = .0639. Therefore, if we are using α = .05, we would say that the extra sum of squares is not significant (i.e., ManTran does not make a significant marginal contribution). Instead of using the p-value, we could compare F = 3.656 with the critical value F.05(1, 36) = F.INV.RT(0.05, 1, 36) ≈ 4.11 to draw the same conclusion. In effect, the hypotheses we are testing are:

H0: β3 = 0
H1: β3 ≠ 0

The test statistic is not far enough from zero to reject H0: β3 = 0.

You may already have realized that if we are only considering the effect of one single predictor, we could reach the same conclusion from its t statistic in the fitted regression of the full model. In the full model's regression output, the coefficients of Intercept, Age, and Mileage all have very small p-values, while ManTran has t (df = 36) = −1.912 with a two-tailed p-value of .0639. In the single-predictor case, the partial F test statistic is equal to the square of the corresponding t test statistic in the full model. The t-test uses the same degrees of freedom as the denominator of the partial F test, so the p-values will be the same as long as we use a two-tailed t-test (which eliminates the sign so that rejection in either tail could occur):

Predictor ManTran: t² = (−1.912)² = 3.656. Excel's p-value: =T.DIST.2T(1.912, 36) = .0639

In the case of a single predictor, then, we could get by without using the partial F test. It is shown here because it illustrates the test in a simple way and reveals the connection between the F and t distributions. An advantage of the t-test is that it can also be used to test a one-sided hypothesis (e.g., H1: β3 < 0), which might be relevant in the case of this example (all our predictors seem to have an inverse relationship with a car's selling price).
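The F–t connection can be checked numerically. In the short sketch below, scipy's `stats.f.sf` and `stats.t.sf` are the right-tail probabilities corresponding to Excel's =F.DIST.RT and half of =T.DIST.2T; the only input is the t statistic t = −1.912 with df = 36 quoted above.

```python
from scipy import stats

t_stat, df = -1.912, 36                     # ManTran's t in the full model

f_stat = t_stat ** 2                        # partial F equals t squared when m = 1
p_from_f = stats.f.sf(f_stat, 1, df)        # like =F.DIST.RT(t^2, 1, 36)
p_from_t = 2 * stats.t.sf(abs(t_stat), df)  # like =T.DIST.2T(1.912, 36)
# The two p-values coincide: F(1, df) is the distribution of a squared t(df).
```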

## Eliminating More Than One Predictor

We now turn to the more general case of using the partial F test to assess the effect of eliminating m predictors simultaneously (where m > 1). This can be especially useful when we have a large model with many predictors that we are thinking of eliminating because their effects seem to be weak in the full model. To test the effect of discarding m predictors at once, the hypotheses are:

H0: βj = 0 for all of a subset of m predictors in the full model
H1: Not all of these βj = 0 (at least some of the m coefficients are non-zero)

For example, suppose we want to know whether we can eliminate both Mileage and ManTran at once. The hypotheses are:

H0: β2 = 0 and β3 = 0
H1: One or both coefficients are non-zero

The models to be compared are:

Full model: SellPrice = β0 + β1 Age + β2 Mileage + β3 ManTran
Reduced model: SellPrice = β0 + β1 Age

The elimination of both Mileage and ManTran increases the sum of squared errors, as you would expect. The question is whether these two predictors are making a statistically significant extra contribution to reducing the sum of squared errors. Using the SSE values from the two ANOVA tables, the calculations are:

Full model: SSE_Full with df_Full = n − k − 1 = 40 − 3 − 1 = 36
Reduced model: SSE_Reduced with df_Reduced = n − k − 1 + m = 36 + 2 = 38
Extra SSE: SSE_Reduced − SSE_Full with df = (n − k − 1 + m) − (n − k − 1) = m = 2

F = [(SSE_Reduced − SSE_Full) / 2] / [SSE_Full / 36]

From Excel, we obtain the p-value =F.DIST.RT(F, 2, 36). If we are using α = .05, we would say that the extra sum of squares is highly significant (i.e., these two predictors do make a

significant marginal contribution). Alternatively, we can compare F with the critical value F.05(2, 36) = F.INV.RT(0.05, 2, 36) ≈ 3.26 to draw the same conclusion.

## Adding Predictors

We have been discussing eliminating predictors. The calculations for adding predictors to a linear model are similar if we define the full model as the big model (more predictors) and the reduced model as the small model (fewer predictors). The extra sum of squares is still the difference between the two error sums of squares:

(13A.3)  F = [(SSE_small − SSE_big) / (number of extra predictors)] / [SSE_big / (n − k − 1)]

where k is the number of predictors in the big model.

## More Complex Models

We can use variations on these partial F tests, based on error sums of squares, for other purposes. For example, we can test whether two coefficients in a model are the same (e.g., β2 = β3), or calculate the effect of any given predictor given the presence of other sets of predictors in the model (using the coefficient of partial determination). Such tests are ordinarily reserved for more advanced classes in statistics, and may entail using more specialized software.

## Full Results for Car Data (data file CarPrice)

To allow you to explore the car data on your own, full results are shown below for the full model based on the used car data. SellPrice is negatively affected by Age and Mileage (both highly significant) and marginally so by ManTran (p-value significant at α = .10 but not at α = .05). In the full-model output (n = 40, k = 3, df = 36), the overall F test is highly significant (p-value on the order of 10⁻²⁰), the Age and Mileage coefficients have very small p-values, and ManTran has t = −1.912 with p = .0639; the output also reports 95% confidence intervals and VIFs for each coefficient. You can also look at the data file and do your own regressions.
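Since the CarPrice data file itself is not reproduced in this supplement, the whole nested-model comparison can be rehearsed on synthetic data of the same shape (n = 40; three predictors). All variable values and coefficients below are invented for illustration; they are not the actual CarPrice estimates.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 40
age = rng.uniform(1.0, 10.0, n)                   # years
mileage = 12.0 * age + rng.normal(0.0, 20.0, n)   # thousands of miles, correlated with age
mantran = (rng.random(n) < 0.5).astype(float)     # 1 = manual transmission
price = 25.0 - 1.6 * age - 0.05 * mileage - 1.5 * mantran + rng.normal(0.0, 1.5, n)

def sse(x, y):
    """Error sum of squares from an OLS fit of y on the columns of x."""
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    return float(resid @ resid)

ones = np.ones(n)
sse_full = sse(np.column_stack([ones, age, mileage, mantran]), price)   # k = 3 predictors
sse_reduced = sse(np.column_stack([ones, age]), price)                  # Mileage, ManTran dropped

k, m = 3, 2
f_stat = ((sse_reduced - sse_full) / m) / (sse_full / (n - k - 1))
p_value = stats.f.sf(f_stat, m, n - k - 1)        # like =F.DIST.RT(F, 2, 36)
```

The reduced model's SSE can never be smaller than the full model's, so F ≥ 0; a large F (small p) says the two dropped predictors jointly matter.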

It appears that as a car ages, it loses about \$1,663 in value per year (ceteris paribus). Similarly, for each extra 1,000 miles driven, a car loses on average about \$58. Cars with manual transmission seem to sell for about \$1,654 less than those with automatic transmission (remember, the brand and model are controlled already). There is evidence of multicollinearity between Age and Mileage, which would be expected (as cars get older, they accumulate more miles). This would require further consideration by the analyst.

## Section Exercises

13A.1 Instructions: Use α = .05 in all tests.

(a) Perform a full linear regression to predict ColGrad% using all eight predictors in DATA SET E shown here. State the SSE and df for the full model.
(b) Fit a reduced linear regression model by eliminating the predictor Age. State the SSE and df for the reduced model.
(c) Calculate the partial F test statistic to see whether the predictor Age was significant.
(d) Calculate the p-value for the partial F test. What is your conclusion?
(e) Does your conclusion from the partial F test agree with the test using the t-statistic in the full-model regression?
(f) Fit a reduced regression model by eliminating the two predictors Age and Seast simultaneously. State the SSE and df for the reduced model.
(g) Calculate the partial F test statistic to see whether the predictors Age and Seast can both be eliminated. State your conclusion.

## References

Kutner, Michael H.; Christopher J. Nachtsheim; and John Neter. *Applied Linear Regression Models*. 4th ed. McGraw-Hill/Irwin, 2004, pp.


### Testing for Lack of Fit

Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then ˆσ 2 should be an unbiased estimate of σ 2. If we have a model which is not complex enough to fit

### Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!

Statistics 100 Sample Final Questions (Note: These are mostly multiple choice, for extra practice. Your Final Exam will NOT have any multiple choice!) Part A - Multiple Choice Indicate the best choice

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### CHAPTER 11 SECTION 2: INTRODUCTION TO HYPOTHESIS TESTING

CHAPTER 11 SECTION 2: INTRODUCTION TO HYPOTHESIS TESTING MULTIPLE CHOICE 56. In testing the hypotheses H 0 : µ = 50 vs. H 1 : µ 50, the following information is known: n = 64, = 53.5, and σ = 10. The standardized

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION

EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY 5-10 hours of input weekly is enough to pick up a new language (Schiff & Myers, 1988). Dutch children spend 5.5 hours/day

### Causal Forecasting Models

CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental

### Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

### General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test

Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics

### Guide to Microsoft Excel for calculations, statistics, and plotting data

Page 1/47 Guide to Microsoft Excel for calculations, statistics, and plotting data Topic Page A. Writing equations and text 2 1. Writing equations with mathematical operations 2 2. Writing equations with

### Chapter 10. Analysis of Covariance. 10.1 Multiple regression

Chapter 10 Analysis of Covariance An analysis procedure for looking at group effects on a continuous outcome when some other continuous explanatory variable also has an effect on the outcome. This chapter

### Inferences About Differences Between Means Edpsy 580

Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Inferences About Differences Between Means Slide

### 1 Simple Linear Regression I Least Squares Estimation

Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged