# Comparing Nested Models

Size: px
Start display at page:

Transcription

1 Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller is the reduced (or restricted) model. Example: with two independent variables x 1 and x 2, possible terms are x 1, x 2, x 1 x 2, x 2 1, and so on. 1 / 20 Multiple Linear Regression Comparing Nested Models

2 Consider three models: First order: E(Y ) = β 0 + β 1 x 1 + β 2 x 2 ; Interaction: Full second order: E(Y ) = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 1 x 2 ; E(Y ) = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 1 x 2 + β 4 x β 5 x / 20 Multiple Linear Regression Comparing Nested Models

3 The first order model is nested within both the Interaction model and the Full second order model. The Interaction model is nested within the Full second order model. We usually want to use the simplest (most parsimonious) model that adequately fits the observed data. One way to decide between a full model and a reduced model is by testing H 0 : reduced model is adequate; H a : full model is better. 3 / 20 Multiple Linear Regression Comparing Nested Models

4 When the full model has exactly one more term than the reduced model, we can use a t-test. E.g., testing the Interaction model E(Y ) = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 1 x 2 ; against the First order model E(Y ) = β 0 + β 1 x 1 + β 2 x 2. H 0 : reduced model is adequate is the same as H 0 : β 3 = 0. So the usual t-statistic is the relevant test statistic. 4 / 20 Multiple Linear Regression Comparing Nested Models

5 When the full model has more than one additional term, we use an F -test, which generalizes the t-test. Basic idea: fit both models, and test whether the full model fits significantly better than the reduced model: ) F = ( Drop in SSE Number of extra terms s 2 for full model where SSE is the sum of squared residuals. When H 0 is true, F follows the F -distribution, which we use to find the P-value. 5 / 20 Multiple Linear Regression Comparing Nested Models

6 E.g., testing the (full) second order model E(Y ) = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 1 x 2 + β 4 x1 2 + β 5 x2 2 ; against the (reduced) interaction model E(Y ) = β 0 + β 1 x 1 + β 2 x 2 + β 3 x 1 x 2. Here H 0 is β 4 = β 5 = 0, and H a is the opposite. In R, the lm() method is not convenient for carrying out this test; aov() is better. 6 / 20 Multiple Linear Regression Comparing Nested Models

7 summary(aov(cost ~ Weight + Distance + I(Weight * Distance) + I(Weight^2) + I(Distance^2), express)) Df Sum Sq Mean Sq F value Pr(>F) Weight e-15 *** Distance e-13 *** I(Weight * Distance) e-09 *** I(Weight^2) *** I(Distance^2) Residuals Signif. codes: 0 *** ** 0.01 * summary(aov(cost ~ Weight + Distance + I(Weight * Distance), express)) Df Sum Sq Mean Sq F value Pr(>F) Weight e-14 *** Distance e-12 *** I(Weight * Distance) e-07 *** Residuals Signif. codes: 0 *** ** 0.01 * / 20 Multiple Linear Regression Comparing Nested Models

8 These are sequential sums of squares, adding each term to the model in order. See Residuals line in each set of results: SSE(Full second order) = 2.74, SSE(Interaction) = 6.63, so F = ( )/ = 9.75, P <.01 8 / 20 Multiple Linear Regression Comparing Nested Models

9 You can, in fact, calculate F from the output for the full model: Note that, because the terms are added sequentially, the sums of squares for the common terms (Weight, Distance, and Weight * Distance) are the same in both models. In the reduced model, the extra terms (Weight^2 and Distance^2) have gone away. Their combined sum of squares, = 3.89, is exactly the increase in SSE, = / 20 Multiple Linear Regression Comparing Nested Models

10 So we can also calculate F = (Sum Sq for Weight^ 2 + Sum Sq for Distance^ 2)/2 Mean Square for Residuals using only the output for the full model. Note that F was calculated imprecisely, because of rounding. We can get more digits using print(summary(...), digits = 8) for example, or calculate F to full precision: s <- summary(aov(cost ~ Weight + Distance + I(Weight * Distance) + I(Weight^2) + I(Distance^2), express))[[1]] sum(s[c("i(weight^2)", "I(Distance^2)"), "Sum Sq"]) / 2 / s["residuals", "Mean Sq"] 10 / 20 Multiple Linear Regression Comparing Nested Models

11 We could use the same F -test when there is only one additional term in the full model, based on just one line in the ANOVA table, provided it is the last term in the formula. It appears very different from the t-test described earlier. Some matrix algebra shows that it is, in fact, exactly the same test: The F -statistic is exactly the square of the t-statistic. The F critical values are exactly the squares of the (two-sided) t critical values. So the P-value is exactly the same. 11 / 20 Multiple Linear Regression Comparing Nested Models

12 Complete Example: Road Construction Cost Data from the Florida Attorney General s office y = successful bid; x 1 = DOT engineer s estimate of cost x 2 = indicator of fixed bidding: { 1 if fixed x 2 = 0 if competitive 12 / 20 Multiple Linear Regression A Complete Example

13 Get the data and plot them: flag <- read.table("text/exercises&examples/flag.txt", header = TRUE) pairs(flag[, -1]) Section 4.14 suggests beginning with the full second order model, and simplifying it as far as possible (but no further!). We ll take the opposite approach: begin with the first order model, and complicate it as far as necessary. Because x 2 is an indicator variable, the first order model is a pair of parallel straight lines: summary(lm(cost ~ DOTEST + STATUS, flag)) 13 / 20 Multiple Linear Regression A Complete Example

14 First order model Call: lm(formula = COST ~ DOTEST + STATUS, data = flag) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) DOTEST < 2e-16 *** STATUS *** --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 232 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: 4610 on 2 and 232 DF, p-value: < 2.2e / 20 Multiple Linear Regression A Complete Example

15 Both variables look important. The DOTEST coefficient is close to 1, so the winning bids roughly track the estimated cost. The positive STATUS coefficient means the line for STATUS = 1 is higher than the line for STATUS = 0. Are the slopes different? Try the interaction model: summary(lm(cost ~ DOTEST * STATUS, flag)) 15 / 20 Multiple Linear Regression A Complete Example

16 Interaction model Call: lm(formula = COST ~ DOTEST * STATUS, data = flag) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) DOTEST < 2e-16 *** STATUS DOTEST:STATUS e-05 *** --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 231 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: 3281 on 3 and 231 DF, p-value: < 2.2e / 20 Multiple Linear Regression A Complete Example

17 The interaction term is highly significant: reject the first order model in favor of the interaction model. The slopes are for STATUS = 0, and = for STATUS = 1. So the competitive auctions are won with bids that fall slightly below the estimated cost, while the fixed winning bids fall slightly above the estimated cost. 17 / 20 Multiple Linear Regression A Complete Example

18 To validate the interaction model, we compare it with (finally!) the full second order model. Note When some variables are qualitative, the full second order model consists of the full second order (i.e., quadratic) model in the quantitative variables, plus the interactions of those terms with the qualititative variables: summary(lm(cost ~ DOTEST + STATUS + I(DOTEST * STATUS) + I(DOTEST^2) + I(DOTEST^2 * STATUS), flag)) 18 / 20 Multiple Linear Regression A Complete Example

19 Full second order model ST 430/514 Call: lm(formula = COST ~ DOTEST + STATUS + I(DOTEST * STATUS) + I(DOTEST^2) + I(DOTEST^2 * STATUS), data = flag) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e e DOTEST 9.155e e < 2e-16 *** STATUS e e I(DOTEST * STATUS) 3.242e e ** I(DOTEST^2) 7.191e e I(DOTEST^2 * STATUS) e e Signif. codes: 0 *** ** 0.01 * Residual standard error: on 229 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: 1970 on 5 and 229 DF, p-value: < 2.2e / 20 Multiple Linear Regression A Complete Example

20 Full second order model, ANOVA summary(aov(cost ~ DOTEST + STATUS + I(DOTEST * STATUS) + I(DOTEST^2) + I(DOTEST^2 * STATUS), flag)) Df Sum Sq Mean Sq F value Pr(>F) DOTEST < 2e-16 *** STATUS *** I(DOTEST * STATUS) e-05 *** I(DOTEST^2) I(DOTEST^2 * STATUS) Residuals Signif. codes: 0 *** ** 0.01 * The significance of the two terms that were added is tested using an F statistic with 2 degrees of freedom in the numerator; the value is only slightly greater than 1, and is completely consistent with the null hypothesis that the interaction model is adequate. 20 / 20 Multiple Linear Regression A Complete Example

### We extended the additive model in two variables to the interaction model by adding a third term to the equation.

Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

### Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

### Correlation and Simple Linear Regression

Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

### SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance

Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance 14 November 2007 1 Confidence intervals and hypothesis testing for linear regression Just as there was

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### Lets suppose we rolled a six-sided die 150 times and recorded the number of times each outcome (1-6) occured. The data is

In this lab we will look at how R can eliminate most of the annoying calculations involved in (a) using Chi-Squared tests to check for homogeneity in two-way tables of catagorical data and (b) computing

### Using R for Linear Regression

Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

### ANOVA. February 12, 2015

ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

### Stat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015

Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,

### EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION

EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY 5-10 hours of input weekly is enough to pick up a new language (Schiff & Myers, 1988). Dutch children spend 5.5 hours/day

### Stat 5303 (Oehlert): Tukey One Degree of Freedom 1

Stat 5303 (Oehlert): Tukey One Degree of Freedom 1 > catch

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### Week 5: Multiple Linear Regression

BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

### Testing for Lack of Fit

Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then ˆσ 2 should be an unbiased estimate of σ 2. If we have a model which is not complex enough to fit

### n + n log(2π) + n log(rss/n)

There is a discrepancy in R output from the functions step, AIC, and BIC over how to compute the AIC. The discrepancy is not very important, because it involves a difference of a constant factor that cancels

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### Part II. Multiple Linear Regression

Part II Multiple Linear Regression 86 Chapter 7 Multiple Regression A multiple linear regression model is a linear model that describes how a y-variable relates to two or more xvariables (or transformations

### Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready

### Lucky vs. Unlucky Teams in Sports

Lucky vs. Unlucky Teams in Sports Introduction Assuming gambling odds give true probabilities, one can classify a team as having been lucky or unlucky so far. Do results of matches between lucky and unlucky

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Comparing Means in Two Populations

Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### 2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.\$ and Sales \$: 1. Prepare a scatter plot of these data. The scatter plots for Adv.\$ versus Sales, and Month versus

### N-Way Analysis of Variance

N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

### 2. Regression and Correlation. Simple Linear Regression Software: R

2. Regression and Correlation Simple Linear Regression Software: R Create txt file from SAS data set data _null_; file 'C:\Documents and Settings\sphlab\Desktop\slr1.txt'; set temp; put input day:date7.

### Interaction between quantitative predictors

Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

### MIXED MODEL ANALYSIS USING R

Research Methods Group MIXED MODEL ANALYSIS USING R Using Case Study 4 from the BIOMETRICS & RESEARCH METHODS TEACHING RESOURCE BY Stephen Mbunzi & Sonal Nagda www.ilri.org/rmg www.worldagroforestrycentre.org/rmg

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Least Squares Regression. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University arnholt@math.appstate.

Least Squares Regression Alan T. Arnholt Department of Mathematical Sciences Appalachian State University arnholt@math.appstate.edu Spring 2006 R Notes 1 Copyright c 2006 Alan T. Arnholt 2 Least Squares

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Simple Linear Regression

Chapter Nine Simple Linear Regression Consider the following three scenarios: 1. The CEO of the local Tourism Authority would like to know whether a family s annual expenditure on recreation is related

### Exercises on using R for Statistics and Hypothesis Testing Dr. Wenjia Wang

Exercises on using R for Statistics and Hypothesis Testing Dr. Wenjia Wang School of Computing Sciences, UEA University of East Anglia Brief Introduction to R R is a free open source statistics and mathematical

### 2 Sample t-test (unequal sample sizes and unequal variances)

Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Developing a Stock Price Model Using Investment Valuation Ratios for the Financial Industry Of the Philippine Stock Market

Developing a Stock Price Model Using Investment Valuation Ratios for the Financial Industry Of the Philippine Stock Market Tyrone Robin 1, Carlo Canquin 1, Donald Uy 1, Al Rey Villagracia 1 1 Physics Department,

### Exchange Rate Regime Analysis for the Chinese Yuan

Exchange Rate Regime Analysis for the Chinese Yuan Achim Zeileis Ajay Shah Ila Patnaik Abstract We investigate the Chinese exchange rate regime after China gave up on a fixed exchange rate to the US dollar

### Exchange Rate Regime Analysis for the Indian Rupee

Exchange Rate Regime Analysis for the Indian Rupee Achim Zeileis Ajay Shah Ila Patnaik Abstract We investigate the Indian exchange rate regime starting from 1993 when trading in the Indian rupee began

### Example: Boats and Manatees

Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

### Doing Multiple Regression with SPSS. In this case, we are interested in the Analyze options so we choose that menu. If gives us a number of choices:

Doing Multiple Regression with SPSS Multiple Regression for Data Already in Data Editor Next we want to specify a multiple regression analysis for these data. The menu bar for SPSS offers several options:

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### Causal Forecasting Models

CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental

### International Statistical Institute, 56th Session, 2007: Phil Everson

Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### General Regression Formulae ) (N-2) (1 - r 2 YX

General Regression Formulae Single Predictor Standardized Parameter Model: Z Yi = β Z Xi + ε i Single Predictor Standardized Statistical Model: Z Yi = β Z Xi Estimate of Beta (Beta-hat: β = r YX (1 Standard

### Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

### Discussion Section 4 ECON 139/239 2010 Summer Term II

Discussion Section 4 ECON 139/239 2010 Summer Term II 1. Let s use the CollegeDistance.csv data again. (a) An education advocacy group argues that, on average, a person s educational attainment would increase

### 1 Simple Linear Regression I Least Squares Estimation

Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

### Rockefeller College University at Albany

Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.

### Chapter Four. Data Analyses and Presentation of the Findings

Chapter Four Data Analyses and Presentation of the Findings The fourth chapter represents the focal point of the research report. Previous chapters of the report have laid the groundwork for the project.

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

### Linear functions Increasing Linear Functions. Decreasing Linear Functions

3.5 Increasing, Decreasing, Max, and Min So far we have been describing graphs using quantitative information. That s just a fancy way to say that we ve been using numbers. Specifically, we have described

### MSwM examples. Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech.

MSwM examples Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech February 24, 2014 Abstract Two examples are described to illustrate the use of

### MULTIPLE REGRESSION WITH CATEGORICAL DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

### individualdifferences

1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,

### STAT 350 Practice Final Exam Solution (Spring 2015)

PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables

Wooldridge, Introductory Econometrics, 4th ed. Chapter 7: Multiple regression analysis with qualitative information: Binary (or dummy) variables We often consider relationships between observed outcomes

### Module 5: Statistical Analysis

Module 5: Statistical Analysis To answer more complex questions using your data, or in statistical terms, to test your hypothesis, you need to use more advanced statistical tests. This module reviews the

### Research Methods & Experimental Design

Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Time Series Analysis

Time Series 1 April 9, 2013 Time Series Analysis This chapter presents an introduction to the branch of statistics known as time series analysis. Often the data we collect in environmental studies is collected

### Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015

Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Stata Example (See appendices for full example).. use http://www.nd.edu/~rwilliam/stats2/statafiles/multicoll.dta,

### Introduction to Hierarchical Linear Modeling with R

Introduction to Hierarchical Linear Modeling with R 5 10 15 20 25 5 10 15 20 25 13 14 15 16 40 30 20 10 0 40 30 20 10 9 10 11 12-10 SCIENCE 0-10 5 6 7 8 40 30 20 10 0-10 40 1 2 3 4 30 20 10 0-10 5 10 15

### 1 Basic ANOVA concepts

Math 143 ANOVA 1 Analysis of Variance (ANOVA) Recall, when we wanted to compare two population means, we used the 2-sample t procedures. Now let s expand this to compare k 3 population means. As with the

### Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

### Hedge Effectiveness Testing

Hedge Effectiveness Testing Using Regression Analysis Ira G. Kawaller, Ph.D. Kawaller & Company, LLC Reva B. Steinberg BDO Seidman LLP When companies use derivative instruments to hedge economic exposures,

### Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the

### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

### 2. Linear regression with multiple regressors

2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

### Multiple Regression: What Is It?

Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

### Multivariate Logistic Regression

1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

### Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction