Multiple Linear Regression

A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is now modeled as a function of several explanatory variables. The function lm can be used to perform multiple linear regression in R, and much of the syntax is the same as that used for fitting simple linear regression models. To perform multiple linear regression with p explanatory variables, use the command:

lm(response ~ explanatory_1 + explanatory_2 + ... + explanatory_p)

Here the terms response and explanatory_i should be replaced by the names of the response and explanatory variables, respectively, used in the analysis.

Ex. Data were collected on 100 houses recently sold in a city. The data consist of the sales price (in $), house size (in square feet), the number of bedrooms, the number of bathrooms, the lot size (in square feet) and the annual real estate tax (in $). The following program reads in the data.

> Housing = read.table("c:/users/martin/documents/w2024/housing.txt", header=TRUE)
> Housing
  Taxes Bedrooms Baths Price Size Lot

Suppose we are only interested in working with a subset of the variables (e.g., Price, Size and Lot). It is possible (but not necessary) to construct a new data frame consisting solely of these variables using the commands below. Note that column names are case-sensitive.

> myvars = c("Price", "Size", "Lot")
> Housing2 = Housing[myvars]
> Housing2
  Price Size Lot
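Because housing.txt is not reproduced here, the subsetting step can be tried on a stand-in data set. The following is a minimal sketch: the data frame is simulated, and the ranges, coefficients and noise level used to generate it are assumptions made purely for illustration, not values from the original file.

```r
# Simulated stand-in for the housing data (all values are made up).
set.seed(1)
n <- 100
Housing <- data.frame(
  Taxes    = round(runif(n, 500, 5000)),
  Bedrooms = sample(2:5, n, replace = TRUE),
  Baths    = sample(1:3, n, replace = TRUE),
  Size     = round(runif(n, 800, 3000)),
  Lot      = round(runif(n, 5000, 40000))
)
# Assumed relationship: Price is linear in Size and Lot plus random noise.
Housing$Price <- 10000 + 50 * Housing$Size + 3 * Housing$Lot +
  rnorm(n, sd = 25000)

# Select a subset of columns by name; the names must match exactly,
# including capitalization.
myvars <- c("Price", "Size", "Lot")
Housing2 <- Housing[myvars]
head(Housing2)
```

Indexing a data frame with a character vector returns the named columns in the order given, so Housing2 has Price first even though it was the last column created.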

Before fitting our regression model we want to investigate how the variables are related to one another. We can do this graphically by constructing scatter plots of all pairwise combinations of variables in the data frame. This can be done by typing:

> plot(Housing2)

To fit a multiple linear regression model with Price as the response variable and Size and Lot as the explanatory variables, use the command:

> results = lm(Price ~ Size + Lot, data=Housing)
> results

Call:
lm(formula = Price ~ Size + Lot, data = Housing)

Coefficients:
(Intercept)  Size  Lot

This output indicates that the fitted value is given by $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$, where $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$ are the estimates printed under (Intercept), Size and Lot, and $x_1$ and $x_2$ denote Size and Lot.
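The link between the printed coefficients and the fitted value can be checked numerically. The sketch below uses simulated data (the generating coefficients are assumptions, not the handout's estimates) and compares a fitted value computed by hand from the equation with the one returned by fitted().

```r
# Simulated stand-in data; the numbers used to generate it are illustrative.
set.seed(1)
n <- 100
Housing <- data.frame(Size = runif(n, 800, 3000),
                      Lot  = runif(n, 5000, 40000))
Housing$Price <- 10000 + 50 * Housing$Size + 3 * Housing$Lot +
  rnorm(n, sd = 25000)

results <- lm(Price ~ Size + Lot, data = Housing)
b <- coef(results)  # named vector: (Intercept), Size, Lot

# Fitted value for the first house, computed directly from the equation
# y-hat = b0 + b1 * Size + b2 * Lot.
y_hat_manual <- b["(Intercept)"] + b["Size"] * Housing$Size[1] +
  b["Lot"] * Housing$Lot[1]

# The same quantity as stored in the fitted model object.
y_hat_lm <- fitted(results)[1]

all.equal(unname(y_hat_manual), unname(y_hat_lm))
```

Using coef() with names rather than positions keeps the calculation correct even if the order of the terms in the formula changes.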

Inference in the multiple regression setting is typically performed in a number of steps. We begin by testing whether the explanatory variables collectively have an effect on the response variable, i.e.

$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$

If we can reject this hypothesis, we continue by testing whether the individual regression coefficients are significant while controlling for the other variables in the model. We can access the results of each test by typing:

> summary(results)

Call:
lm(formula = Price ~ Size + Lot, data = Housing)

Residuals:
Min  1Q  Median  3Q  Max

Coefficients:
             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  ...        ...         ...      ...
Size         5.378e...  ...         ...      8.39e-13 ***
Lot          2.840e...  ...         ...      1.68e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: ... on 97 degrees of freedom
Multiple R-squared: ..., Adjusted R-squared: ...
F-statistic: ... on 2 and 97 DF, p-value: < 2.2e-16

The output shows that the overall F-test is highly significant (p < 2.2e-16), indicating that we should clearly reject the null hypothesis that the variables Size and Lot collectively have no effect on Price. The results also show that the variable Size is significant controlling for the variable Lot (p = 8.39e-13), as is Lot controlling for the variable Size (p = 1.68e-09). In addition, the output reports the values of $R^2$ and adjusted $R^2$.

B. Testing a subset of variables using a partial F-test

Sometimes we are interested in simultaneously testing whether a certain subset of the coefficients are equal to 0 (e.g. $\beta_3 = \beta_4 = 0$). We can do this using a partial F-test. This test involves comparing the SSE from a reduced model (excluding the parameters we hypothesize are equal to zero) with the SSE from the full model (including all of the parameters).
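The comparison of the two SSEs can be written out by hand. With $df$ denoting residual degrees of freedom, the partial F statistic is $F = \frac{(SSE_{reduced} - SSE_{full}) / (df_{reduced} - df_{full})}{SSE_{full} / df_{full}}$. The sketch below computes it on simulated data (the generating values are assumptions for illustration only) and checks the result against R's anova function.

```r
# Simulated stand-in data; Bedrooms and Baths have no true effect here.
set.seed(1)
n <- 100
Housing <- data.frame(
  Size     = runif(n, 800, 3000),
  Lot      = runif(n, 5000, 40000),
  Bedrooms = sample(2:5, n, replace = TRUE),
  Baths    = sample(1:3, n, replace = TRUE)
)
Housing$Price <- 10000 + 50 * Housing$Size + 3 * Housing$Lot +
  rnorm(n, sd = 25000)

reduced <- lm(Price ~ Size + Lot, data = Housing)
full    <- lm(Price ~ Size + Lot + Bedrooms + Baths, data = Housing)

# Error sums of squares and residual degrees of freedom of each model.
sse_reduced <- sum(resid(reduced)^2)
sse_full    <- sum(resid(full)^2)
df_reduced  <- df.residual(reduced)  # n - 3
df_full     <- df.residual(full)     # n - 5

# Partial F statistic and its upper-tail p-value.
F_partial <- ((sse_reduced - sse_full) / (df_reduced - df_full)) /
  (sse_full / df_full)
p_value <- pf(F_partial, df_reduced - df_full, df_full, lower.tail = FALSE)

# anova() reports the same statistic in the second row of its table.
tab <- anova(reduced, full)
c(by_hand = F_partial, anova = tab$F[2])
```

The statistic is named F_partial rather than F because F is an alias for FALSE in R.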

In R we can perform partial F-tests by fitting both the reduced and full models separately and thereafter comparing them using the anova function.

Ex. Suppose we include the variables Bedrooms, Baths, Size and Lot in our model and are interested in testing whether the number of bedrooms and bathrooms are significant after taking size and lot into consideration. The following code performs the partial F-test:

> reduced = lm(Price ~ Size + Lot, data=Housing) # Reduced model
> full = lm(Price ~ Size + Lot + Bedrooms + Baths, data=Housing) # Full model
> anova(reduced, full) # Compare the models

Analysis of Variance Table

Model 1: Price ~ Size + Lot
Model 2: Price ~ Size + Lot + Bedrooms + Baths
  Res.Df  RSS  Df  Sum of Sq  F     Pr(>F)
1 97      ...
2 95      ...  2   ...        2.82  0.0647 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The output shows the results of the partial F-test. Since F = 2.82 (p-value = 0.0647), we cannot reject the null hypothesis ($\beta_3 = \beta_4 = 0$) at the 5% level of significance. It appears that the variables Bedrooms and Baths do not contribute significant information to the sales price once the variables Size and Lot have been taken into consideration.

C. Confidence and Prediction Intervals

We often use our regression models to estimate the mean response or predict future values of the response variable for certain values of the explanatory variables. The function predict() can be used to make both confidence intervals for the mean response and prediction intervals. To make confidence intervals for the mean response use the option interval="confidence". To make a prediction interval use the option interval="prediction". By default this makes 95% confidence and prediction intervals. If you instead want to make a 99% confidence or prediction interval use the option level=0.99.

Ex. Obtain a 95% confidence interval for the mean sales price of houses whose size is 1,000 square feet and lot size is 20,000 square feet.

> results = lm(Price ~ Size + Lot, data=Housing)

> predict(results, data.frame(Size=1000, Lot=20000), interval="confidence")
     fit  lwr    upr
[1,] ...  90711  ...

A 95% confidence interval is given by (90711, ...).

Ex. Obtain a 95% prediction interval for the sales price of a particular house whose size is 1,000 square feet and lot size is 20,000 square feet.

> predict(results, data.frame(Size=1000, Lot=20000), interval="prediction")
     fit  lwr    upr
[1,] ...  38627  ...

A 95% prediction interval is given by (38627, ...). Note that this is quite a bit wider than the confidence interval, indicating that the variation about the mean is fairly large. The column names in the new data frame must match the variable names used in the model formula exactly (Size, not size).
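The relationship between the two kinds of interval can be seen on a stand-in data set. The sketch below uses simulated data (the generating values are assumptions, so the numbers will not match the handout's output) and compares the widths of the confidence and prediction intervals at the same point, as well as the effect of level=0.99.

```r
# Simulated stand-in data; the generating values are illustrative only.
set.seed(1)
n <- 100
Housing <- data.frame(Size = runif(n, 800, 3000),
                      Lot  = runif(n, 5000, 40000))
Housing$Price <- 10000 + 50 * Housing$Size + 3 * Housing$Lot +
  rnorm(n, sd = 25000)

results <- lm(Price ~ Size + Lot, data = Housing)

# Column names must match the names in the model formula exactly.
new_house <- data.frame(Size = 1000, Lot = 20000)

ci95 <- predict(results, new_house, interval = "confidence")
pi95 <- predict(results, new_house, interval = "prediction")
pi99 <- predict(results, new_house, interval = "prediction", level = 0.99)

# Interval widths: upper limit minus lower limit.
widths <- c(
  confidence_95 = ci95[, "upr"] - ci95[, "lwr"],
  prediction_95 = pi95[, "upr"] - pi95[, "lwr"],
  prediction_99 = pi99[, "upr"] - pi99[, "lwr"]
)
widths
```

Both intervals are centered on the same fitted value. The prediction interval is wider because it also accounts for the variation of an individual house around the mean response.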

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

More information

Regression in ANOVA. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Regression in ANOVA. James H. Steiger. Department of Psychology and Human Development Vanderbilt University Regression in ANOVA James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Regression in ANOVA 1 Introduction 2 Basic Linear

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

More information

Stat 411/511 ANOVA & REGRESSION. Charlotte Wickham. stat511.cwick.co.nz. Nov 31st 2015

Stat 411/511 ANOVA & REGRESSION. Charlotte Wickham. stat511.cwick.co.nz. Nov 31st 2015 Stat 411/511 ANOVA & REGRESSION Nov 31st 2015 Charlotte Wickham stat511.cwick.co.nz This week Today: Lack of fit F-test Weds: Review email me topics, otherwise I ll go over some of last year s final exam

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

Comparing Nested Models

Comparing Nested Models Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller

More information

Testing for Lack of Fit

Testing for Lack of Fit Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then ˆσ 2 should be an unbiased estimate of σ 2. If we have a model which is not complex enough to fit

More information

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

We extended the additive model in two variables to the interaction model by adding a third term to the equation. Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

ANOVA. February 12, 2015

ANOVA. February 12, 2015 ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

More information

EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION

EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY 5-10 hours of input weekly is enough to pick up a new language (Schiff & Myers, 1988). Dutch children spend 5.5 hours/day

More information

12-1 Multiple Linear Regression Models

12-1 Multiple Linear Regression Models 12-1.1 Introduction Many applications of regression analysis involve situations in which there are more than one regressor variable. A regression model that contains more than one regressor variable is

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Classical Hypothesis Testing in R. R can do all of the common analyses that are available in SPSS, including:

Classical Hypothesis Testing in R. R can do all of the common analyses that are available in SPSS, including: Classical Hypothesis Testing in R R can do all of the common analyses that are available in SPSS, including: Classical Hypothesis Testing in R R can do all of the common analyses that are available in

More information

Stat 5303 (Oehlert): Tukey One Degree of Freedom 1

Stat 5303 (Oehlert): Tukey One Degree of Freedom 1 Stat 5303 (Oehlert): Tukey One Degree of Freedom 1 > catch

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

Lets suppose we rolled a six-sided die 150 times and recorded the number of times each outcome (1-6) occured. The data is

Lets suppose we rolled a six-sided die 150 times and recorded the number of times each outcome (1-6) occured. The data is In this lab we will look at how R can eliminate most of the annoying calculations involved in (a) using Chi-Squared tests to check for homogeneity in two-way tables of catagorical data and (b) computing

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,

More information

And sample sizes > tapply(count, spray, length) A B C D E F And a boxplot: > boxplot(count ~ spray) How does the data look?

And sample sizes > tapply(count, spray, length) A B C D E F And a boxplot: > boxplot(count ~ spray) How does the data look? ANOVA in R 1-Way ANOVA We re going to use a data set called InsectSprays. 6 different insect sprays (1 Independent Variable with 6 levels) were tested to see if there was a difference in the number of

More information

Exercise Page 1 of 32

Exercise Page 1 of 32 Exercise 10.1 (a) Plot wages versus LOS. Describe the relationship. There is one woman with relatively high wages for her length of service. Circle this point and do not use it in the rest of this exercise.

More information

Chapter 11: Two Variable Regression Analysis

Chapter 11: Two Variable Regression Analysis Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

More information

Model Diagnostics for Regression

Model Diagnostics for Regression Model Diagnostics for Regression After fitting a regression model it is important to determine whether all the necessary model assumptions are valid before performing inference. If there are any violations,

More information

The F distribution

The F distribution 10-5.1 The F distribution 11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis is a statistical technique

More information

Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready

More information

N-Way Analysis of Variance

N-Way Analysis of Variance N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data

More information

Factors affecting online sales

Factors affecting online sales Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

More information

Example. Regression. Lion Data Scatter Plot. The Data

Example. Regression. Lion Data Scatter Plot. The Data Example Regression Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison December 8 15, 2011 Case Study The proportion of blackness in a male lion s nose increases as the

More information

Math 141. Lecture 24: Model Comparisons and The F-test. Albyn Jones 1. 1 Library jones/courses/141

Math 141. Lecture 24: Model Comparisons and The F-test. Albyn Jones 1. 1 Library jones/courses/141 Math 141 Lecture 24: Model Comparisons and The F-test Albyn Jones 1 1 Library 304 jones@reed.edu www.people.reed.edu/ jones/courses/141 Nested Models Two linear models are Nested if one (the restricted

More information

Chapter 13 In this chapter, you learn: Multiple Regression Model with k Independent Variables:

Chapter 13 In this chapter, you learn: Multiple Regression Model with k Independent Variables: Chapter 4 4- Business Statistics: A First Course Fifth Edition Chapter 3 Multiple Regression Business Statistics: A First Course, 5e 9 Prentice-Hall, Inc. Chap 3- Learning Objectives In this chapter, you

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance

Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance Lecture 11: Confidence intervals and model comparison for linear regression; analysis of variance 14 November 2007 1 Confidence intervals and hypothesis testing for linear regression Just as there was

More information

Lecture 5 Hypothesis Testing in Multiple Linear Regression

Lecture 5 Hypothesis Testing in Multiple Linear Regression Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall

More information

Multiple Linear Regression. Chapter 12

Multiple Linear Regression. Chapter 12 13 Multiple Linear Regression Chapter 12 Multiple Regression Analysis Definition The multiple regression model equation is Y = b 0 + b 1 x 1 + b 2 x 2 +... + b p x p + ε where E(ε) = 0 and Var(ε) = s 2.

More information

Statistics 100 Simple and Multiple Regression

Statistics 100 Simple and Multiple Regression Statistics 100 Simple and Multiple Regression Simple linear regression: Idea: The probability distribution of a random variable Y may depend on the value x of some predictor variable. Ingredients: Y is

More information

Correlation and Regression 07/10/09

Correlation and Regression 07/10/09 Correlation and Regression Eleisa Heron 07/10/09 Introduction Correlation and regression for quantitative variables - Correlation: assessing the association between quantitative variables - Simple linear

More information

The scatterplot indicates a positive linear relationship between waist size and body fat percentage:

The scatterplot indicates a positive linear relationship between waist size and body fat percentage: STAT E-150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the

More information

Supplement 13A: Partial F Test

Supplement 13A: Partial F Test Supplement 13A: Partial F Test Purpose of the Partial F Test For a given regression model, could some of the predictors be eliminated without sacrificing too much in the way of fit? Conversely, would it

More information

August 2013 EXAMINATIONS ECO220Y1Y. Solutions. PART 1: 20 multiple choice questions with point values from 1 to 3 points each for a total of 47 points

August 2013 EXAMINATIONS ECO220Y1Y. Solutions. PART 1: 20 multiple choice questions with point values from 1 to 3 points each for a total of 47 points Page 1 of 7 August 2013 EXAMINATIONS ECO220Y1Y Solutions PART 1: 20 multiple choice questions with point values from 1 to 3 points each for a total of 47 points (1) Determine whether the following statement

More information

Transformations and Polynomial Regression

Transformations and Polynomial Regression Transformations and Polynomial Regression One of the first steps in the construction of a regression model is to hypothesize the form of the regression function. We can dramatically expand the scope of

More information

Week 5: Multiple Linear Regression

Week 5: Multiple Linear Regression BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

More information

Introduction to Linear Regression Part 2

Introduction to Linear Regression Part 2 Introduction to Linear Regression Part 2 James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Linear Regression 2 1 / 44 Introduction

More information

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

More information

Chapter 12 Relationships Between Quantitative Variables: Regression and Correlation

Chapter 12 Relationships Between Quantitative Variables: Regression and Correlation Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 12 Relationships Between Quantitative Variables: Regression and Correlation

More information

Lucky vs. Unlucky Teams in Sports

Lucky vs. Unlucky Teams in Sports Lucky vs. Unlucky Teams in Sports Introduction Assuming gambling odds give true probabilities, one can classify a team as having been lucky or unlucky so far. Do results of matches between lucky and unlucky

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

More information

where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis. Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

More information

The statistical procedures used depend upon the kind of variables (categorical or quantitative):

The statistical procedures used depend upon the kind of variables (categorical or quantitative): Math 143 Correlation and Regression 1 Review: We are looking at methods to investigate two or more variables at once. bivariate: multivariate: The statistical procedures used depend upon the kind of variables

More information

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION In this lab you will learn how to use Excel to display the relationship between two quantitative variables, measure the strength and direction of the

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

Mind on Statistics. Chapter Which expression is a regression equation for a simple linear relationship in a population?

Mind on Statistics. Chapter Which expression is a regression equation for a simple linear relationship in a population? Mind on Statistics Chapter 14 Sections 14.1-14.3 1. Which expression is a regression equation for a simple linear relationship in a population? A. ŷ = b 0 + b 1 x B. ŷ = 44 + 0.60 x C. ( Y) x D. E 0 1

More information

Regression, least squares

Regression, least squares Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen

More information

Statistics 512: Homework 1 Solutions

Statistics 512: Homework 1 Solutions Statistics 512: Homework 1 Solutions 1. A regression analysis relating test scores (Y ) to training hours (X) produced the following fitted question: ŷ = 25 0.5x. (a) What is the fitted value of the response

More information

Paired Differences and Regression

Paired Differences and Regression Paired Differences and Regression Students sometimes have difficulty distinguishing between paired data and independent samples when comparing two means. One can return to this topic after covering simple

More information

Data and Regression Analysis. Lecturer: Prof. Duane S. Boning. Rev 10

Data and Regression Analysis. Lecturer: Prof. Duane S. Boning. Rev 10 Data and Regression Analysis Lecturer: Prof. Duane S. Boning Rev 10 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance (ANOVA) 2. Multivariate Analysis of Variance Model forms 3.

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

STAT 350 Review Set (#3) Make sure to study Lab 8 software output and questions.

STAT 350 Review Set (#3) Make sure to study Lab 8 software output and questions. Make sure to study Lab 8 software output and questions. 1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version.

More information

Statistiek II. John Nerbonne. March 24, 2010. Information Science, Groningen Slides improved a lot by Harmut Fitz, Groningen!

Statistiek II. John Nerbonne. March 24, 2010. Information Science, Groningen Slides improved a lot by Harmut Fitz, Groningen! Information Science, Groningen j.nerbonne@rug.nl Slides improved a lot by Harmut Fitz, Groningen! March 24, 2010 Correlation and regression We often wish to compare two different variables Examples: compare

More information

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

More information

Predicting a New Response

Predicting a New Response Predicting a New Response ST 516 Recall the regression model y = β 0 + β 1 x 1 + β 2 x 2 + + β k x k + ɛ = x β + ɛ, and the estimated mean response at x 0 : ŷ (x 0 ) = x ˆβ. 0 To predict a single new response

More information

Correlation and Regression

Correlation and Regression Correlation and Regression Fathers and daughters heights Fathers heights mean = 67.7 SD = 2.8 55 60 65 70 75 height (inches) Daughters heights mean = 63.8 SD = 2.7 55 60 65 70 75 height (inches) Reference:

More information

BINF 702 Chapter 11 Regression and Correlation Methods. Chapter 11 Regression and Correlation Methods (SPRING 2014) 1

BINF 702 Chapter 11 Regression and Correlation Methods. Chapter 11 Regression and Correlation Methods (SPRING 2014) 1 BINF 702 Chapter 11 Regression and Correlation Methods (SPRING 2014) 1 Section 11.1 Introduction Example 11.1 Obstetrics Obstetricians sometimes order tests for estriol levels from 24-hour urine specimens

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

Spring 2014 Math 263 Deb Hughes Hallett. Class 23: Regression and Hypothesis Testing (Text: Sections 10.1)

Spring 2014 Math 263 Deb Hughes Hallett. Class 23: Regression and Hypothesis Testing (Text: Sections 10.1) Class 23: Regression and Hypothesis Testing (Text: Sections 10.1) Review of Regression (from Chapter 2) We fit a line to data to make projections. The Tower of Pisa is leaning more each year. The measurements

More information

Tukey s HSD (Honestly Significant Difference).

Tukey s HSD (Honestly Significant Difference). Agenda for Week 4 (Tuesday, Jan 26) Week 4 Hour 1 AnOVa review. Week 4 Hour 2 Multiple Testing Tukey s HSD (Honestly Significant Difference). Week 4 Hour 3 (Thursday) Two-way AnOVa. AnOVa Review AnOVa

More information

Inference for Regression

Inference for Regression Inference for Regression IPS Chapter 10 10.1: Simple Linear Regression 10.: More Detail about Simple Linear Regression 01 W.H. Freeman and Company Inference for Regression 10.1 Simple Linear Regression

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

DEPARTMENT OF STATISTICS Course STATS 330: Advanced Statistical Modelling

DEPARTMENT OF STATISTICS Course STATS 330: Advanced Statistical Modelling VERSION STATS DEPARTMENT OF STATISTICS Course STATS : Advanced Statistical Modelling Term Test: 9.am - :am, Tuesday Sept 5, 5 INSTRUCTIONS Answer ALL 5 questions on the answer sheet provided. All questions

More information

Lab 11: Simple Linear Regression

Lab 11: Simple Linear Regression Lab 11: Simple Linear Regression Objective: In this lab, you will examine relationships between two quantitative variables using a graphical tool called a scatterplot. You will interpret scatterplots in

More information

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus

More information

MIXED MODEL ANALYSIS USING R

MIXED MODEL ANALYSIS USING R Research Methods Group MIXED MODEL ANALYSIS USING R Using Case Study 4 from the BIOMETRICS & RESEARCH METHODS TEACHING RESOURCE BY Stephen Mbunzi & Sonal Nagda www.ilri.org/rmg www.worldagroforestrycentre.org/rmg

More information

Regression Add a line to the plot that fits the data well. Don't do any calculations, just add the line.

9 Regression 9.1 Simple Linear Regression 9.1.1 The Least Squares Method Example. Consider the following small data set. somedata

Exchange Rate Regime Analysis for the Chinese Yuan

Achim Zeileis Ajay Shah Ila Patnaik Abstract We investigate the Chinese exchange rate regime after China gave up on a fixed exchange rate to the US dollar

Working with orthogonal contrasts in R

Once you've done an Analysis of Variance (ANOVA), you may reach a point where you want to know: What levels of the factor of interest were significantly different

STATISTICS 110/201 PRACTICE FINAL EXAM KEY (REGRESSION ONLY)

Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built

STA-3123: Statistics for Behavioral and Social Sciences II. Text Book: McClave and Sincich, 12th edition. Contents and Objectives

Initial Review and Chapters 8-14 (Revised: Aug. 2014) Initial Review on

Unit 9 Multiple Regression: Chapter 11 in IPS

The "mathematics" of multiple regression is a direct extension of simple linear regression; predictions use the regression equation; t-statistics have the same interpretation

Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

John P. Bentley Department of Pharmacy Administration University of

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

STATA Tutorial Professor Erdinç 1. Wald Test Wald Test is used

Solutions to Homework 11 Statistics 302 Professor Larget

Textbook Exercises 8.47 Body Mass Gain (Graded for Completeness) Computer output showing body mass gain (in grams) for the mice after four weeks in each

Introduction to Linear Regression and Correlation Analysis

Goals After this, you should be able to: Calculate and interpret the simple correlation between two variables Determine whether the correlation

Univariate Regression

Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Student Lecture Notes Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaying

Practice 3 SPSS. Partially based on Notes from the University of Reading:

http://www.reading.ac.uk Simple Linear Regression A simple linear regression model is fitted when you want to investigate whether

Part 2: Analysis of Relationship Between Two Variables

Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

Lab5: CIs, PIs, and Hypothesis Testing for: Proportions, Means and Linear Regression

M. George Akritas Each CI is a Bernoulli trial: It either contains the true parameter value or not. After the CI is

SCHOOL OF MATHEMATICS AND STATISTICS

RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

After the ANOVA. Categorical Predictors: Gene Expression and Mental Disorders. The Data. Fit the Data with a Linear Model

Categorical Predictors: Gene Expression and Mental Disorders After the ANOVA The Data Fit the Data with a Linear Model 0.0 mean.expression 0.1 bg.sub.lm

Stat479 Assignment #6 Solution Key Fall 2013

Problem 1 (a) Source d.f. SS MS F p-value Regression 1 2059.78145 2059.78145 29.59

SELF-TEST: SIMPLE REGRESSION

ECO 22000 McRAE Note: Those questions indicated with an (N) are unlikely to appear in this form on an in-class examination, but you should be able to describe the procedures

1) In regression, an independent variable is sometimes called a response variable. 1)

Exam Name TRUE/FALSE. Write 'T' if the statement is true and 'F' if the statement is false. 1) In regression, an independent variable is sometimes called a response variable. 1) 2) One purpose of regression

ABSORBENCY OF PAPER TOWELS

9. General Factorial Model In this section General Factorial Analysis will be used to investigate the effects of all possible combinations of towel brand and immersion time on

RNR / ENTO Assumptions for Simple Linear Regression

Statistical statements (hypothesis tests and CI estimation) with least squares estimates depend on 4 assumptions: Linearity of the mean responses

Simple Linear Regression

Chapter Nine Simple Linear Regression Consider the following three scenarios: 1. The CEO of the local Tourism Authority would like to know whether a family's annual expenditure on recreation is related

ECON3150/4150 Spring 2016

Lecture 8 Hypothesis testing Siv-Elisabeth Skjelbred University of Oslo February 12th Last updated: February 11, 2016 Predictions Suppose you start with the equation: Y

Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture I: Linear Correlation and Regression

Correlation Linear correlation and linear regression are often confused, mostly

Math 130 Final Exam Spring 2014 NAME: . You must show all work, calculations, formulas used to receive any credit. NO WORK = NO CREDIT.

Round the final answers to 3 decimal places. Good luck! Question

HH Chapter 9. Nov 3, 2005

Topics of Output Variable Selection Data Scatter Plot Added Variable Plots hh/datasets/usair.dat Response SO2 measurements in 41 metropolitan areas Predictors temp firms popn wind precip rain