Chapter 14. Inference for Regression

1 Chapter 14 Inference for Regression

2 Lesson 14-1, Part 1 Inference for Regression

3 Review Least-Squares Regression A family doctor is interested in examining the relationship between a patient's age and total cholesterol. He randomly selects 14 of his female patients and obtains the data presented in Table 1. The data are based upon results obtained from the National Center for Health Statistics. Table 1 (columns: Age, Total Cholesterol)

4 Review Least-Squares Regression 1. What is the least-squares regression line for predicting total cholesterol from age for women? The least-squares regression equation has the form ŷ = a + bx, where ŷ represents the predicted total cholesterol for a female whose age is x.

5 Review Least-Squares Regression 2. What is the correlation coefficient between age and cholesterol? Interpret the correlation coefficient in the context of the problem. The linear correlation coefficient r indicates a moderate, positive linear relationship between female age and total cholesterol.

6 Review Least-Squares Regression 3. What is the predicted cholesterol level of a 67-year-old female? Predicted cholesterol: ŷ = a + b(age) = a + b(67) ≈ 245

7 Review Least-Squares Regression 4. Interpret the slope of the regression line in the context of the problem. For each increase in age of one year, total cholesterol is predicted to increase by b, the slope of the regression line.

8 Statistics and Parameters When doing inference for regression, we use ŷ = a + bx to estimate the population regression line. The statistics a and b are estimators of the population parameters α and β, the intercept and slope of the population regression line.
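As a rough illustration of how a and b are obtained from a sample, here is a minimal sketch of the usual least-squares formulas. The function name and the five data points are hypothetical, chosen so the points lie exactly on y = 1 + 2x; they are not data from these slides.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (a, b) for the fitted line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx          # b estimates the population slope beta
    a = y_bar - b * x_bar  # a estimates the population intercept alpha
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = least_squares(xs, ys)
print(a, b)
```

With real data the points scatter about the line, and a and b vary from sample to sample, which is why inference about α and β is needed.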

9 Conditions The conditions necessary for doing inference for regression are: 1. For each given value of x, the values of the response variable y are independent and normally distributed. 2. For each value of x, the standard deviation σ of the y-values is the same. 3. The mean responses μ_y for the fixed values of x are linearly related by the equation μ_y = α + βx.

10 Standard Error of the Regression Line Gives the variability of the vertical distances of the y-values from the regression line. Remember that a residual is the error involved when making a prediction from the regression equation. The spread around the line is measured with the standard deviation of the residuals, s: $s = \sqrt{\dfrac{\sum \text{residuals}^2}{n-2}} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}}$
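The calculation of s can be sketched in a few lines. This is only an illustration: the function name and the four data points are hypothetical, and the points were chosen so that their least-squares line is ŷ = 1 + 2x.

```python
from math import sqrt

def residual_standard_error(xs, ys, a, b):
    """Standard deviation of the residuals about the line y-hat = a + b*x."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    n = len(xs)
    # Divide by n - 2 because two parameters (a and b) were estimated.
    return sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Hypothetical data whose least-squares line is y-hat = 1 + 2x
xs = [1, 2, 3, 4]
ys = [3.5, 4.5, 6.5, 9.5]
s = residual_standard_error(xs, ys, a=1.0, b=2.0)
print(s)
```

Here the residuals are 0.5, −0.5, −0.5, 0.5, so the sum of squared residuals is 1 and s is the square root of 1/2.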

11 Standard Error of the Slope of the Regression Line Gives the variability of the estimates of the slope of the regression line: $SE_b = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}} = \dfrac{\sqrt{\sum (y_i - \hat{y}_i)^2 / (n-2)}}{\sqrt{\sum (x_i - \bar{x})^2}}$
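A minimal sketch of the standard error of the slope, assuming s has already been computed. The function name, the x-values, and the value of s are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean

def slope_standard_error(xs, s):
    """SE_b = s / sqrt(sum of squared deviations of x from its mean)."""
    x_bar = mean(xs)
    return s / sqrt(sum((x - x_bar) ** 2 for x in xs))

# Hypothetical x-values and residual standard deviation
xs = [1, 2, 3, 4]
se_b = slope_standard_error(xs, s=sqrt(0.5))
print(se_b)
```

The denominator grows as the x-values spread out, so a wider range of x gives a more precise slope estimate for the same s.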

12 Summary Inference for regression depends upon estimating μ_y = α + βx with ŷ = a + bx. For each x, the response values of y are independent and follow a normal distribution, each distribution having the same standard deviation. Inference for regression depends on the following statistics: a, the estimate of the y-intercept α of μ_y; b, the estimate of the slope β of μ_y; s, the standard error of the residuals; and SE_b, the standard error of the slope of the regression line.

13 Computing Standard Error of the Residual Table columns: Age, x; Total Cholesterol, y; ŷ; Residuals (y − ŷ); Residuals² (y − ŷ)²; the final row gives Σ residuals².

14 Computing Standard Error $s = \sqrt{\dfrac{\sum \text{residuals}^2}{n-2}}$, where n = 14 patients, so n − 2 = 12.

15 Example Page 787, #14.2 Body weights and backpack weights were collected for eight students. (Data columns: Weight (lbs), Backpack weight (lbs).) These data were entered into a statistical package and a least-squares regression of backpack weight on body weight was requested. Here are the results.

16 Example Page 787, #14.2 Predictor Coef Stdev t-ratio p Constant BodyWT S = R-sq = 63.2% R-sq(adj) = 57.0% A) What is the equation of the least-squares line? Backpack weight = a + b(body weight), with a and b read from the Coef column.

17 Example Page 787, #14.2 Predictor Coef Stdev t-ratio p Constant BodyWT S = R-sq = 63.2% R-sq(adj) = 57.0% B) The model for regression inference has three parameters, which we call α, β and σ. Can you determine the estimates for α and β from the computer printout? Yes: the Coef entry for Constant is a, which estimates the true intercept α, and the Coef entry for BodyWT is b, which estimates the true slope β.

18 Example Page 787, #14.2 Predictor Coef Stdev t-ratio p Constant BodyWT S = R-sq = 63.2% R-sq(adj) = 57.0% C) The computer output reports the value of s (the entry S =). This is an estimate of the parameter σ. Use the formula for s to verify the computer's value of s. Use your TI to verify this.

19 Example Page 788, #14.4 Exercise 3.71 on page 187 provided data on the speed of competitive runners and the number of steps they took per second. Good runners take more steps per second as they speed up. Here are the data again. (Data rows: speed, steps.) A) Enter the data into your calculator, perform least-squares regression, and plot the scatterplot with the least-squares line. What is the strength of the association between speed and steps per second?

20 Example Page 788, #14.4 Steps = a + b(speed). There is a very strong positive linear relationship between speed and steps: r = √0.998 ≈ 0.999. Nearly all the variation in steps per second (r² = 0.998, or 99.8% of it) is explained by the linear relationship.

21 Example Page 788, #14.4 [Scatterplot: steps per second versus speed (feet per second)]

22 Example Page 788, #14.4 C) The model for regression inference has three parameters, α, β and σ. Estimate these parameters from the data. The intercept a is the estimate of α, the slope b is the estimate of β, and s is the estimate of σ.

23 Lesson 14-1, Part 2 Inference for Regression

24 Significance Test for the Slope of a Regression Line We want to test whether the slope of the regression line is zero or not. If the slope of the line is zero, then there is no linear relationship between the x and y variables. Remember that b = r(s_y/s_x), so if r = 0, then b = 0. Hypotheses: Two-tailed: H_0: β = 0 and H_a: β ≠ 0. Left-tailed: H_0: β = 0 and H_a: β < 0. Right-tailed: H_0: β = 0 and H_a: β > 0.

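25 Test Statistics and Confidence Interval Test statistic: $t = \dfrac{b - \beta}{SE_b} = \dfrac{b}{SE_b}$ under $H_0: \beta = 0$. Confidence interval: $b \pm t^* SE_b$. Both use the t distribution with n − 2 degrees of freedom. SE_b = standard error of the slope: $SE_b = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$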
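The two calculations on this slide can be sketched as follows. The function names and the numbers (b = 2.0, SE_b = 0.5, n = 12) are hypothetical; the critical value t* = 2.228 is the standard 95% table value for 10 degrees of freedom and is passed in by hand rather than computed.

```python
def slope_t_statistic(b, se_b, beta_0=0.0):
    """t statistic for testing H0: beta = beta_0 (df = n - 2)."""
    return (b - beta_0) / se_b

def slope_confidence_interval(b, se_b, t_star):
    """Interval b +/- t* SE_b; t_star comes from a t table with n - 2 df."""
    margin = t_star * se_b
    return b - margin, b + margin

# Hypothetical summary values: n = 12 observations, so df = 10
t = slope_t_statistic(b=2.0, se_b=0.5)
low, high = slope_confidence_interval(b=2.0, se_b=0.5, t_star=2.228)
print(t, low, high)
```

Because this interval excludes 0, the two-sided test at the 5% level would also reject H_0: β = 0 for these made-up values.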

26 Reading Computer Printouts

27 Example Page 794, #14.6 Exercise 14.1 (page 786) presents data on the lengths of two bones in five fossil specimens of the extinct beast Archaeopteryx. Here is part of the output from the S-PLUS statistical software when we regress the length y of the humerus on the length x of the femur. Coefficients Value Std. Error t value Pr(>|t|) (Intercept) Femur

28 Example Page 794, #14.6 Coefficients Value Std. Error t value Pr(>|t|) (Intercept) Femur A) What is the equation of the least-squares regression line? humerus = a + b(femur), with a and b taken from the Value column.

29 Example Page 794, #14.6 Coefficients Value Std. Error t value Pr(>|t|) (Intercept) Femur B) We left out the t statistic for testing H_0: β = 0 and its P-value. Use the output to find t. $t = \dfrac{b}{SE_b}$

30 Example Page 794, #14.6 C) How many degrees of freedom does t have? Use Table C to approximate the P-value of t against the one-sided alternative H_a: β > 0. df = n − 2 = 5 − 2 = 3; since t > 12.92, Table C gives P-value < 0.0005. On the TI, tcdf(t, 1E99, 3) returns the P-value.
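For readers without Table C or a TI calculator, the upper-tail area can be approximated numerically. The sketch below is an assumption-laden stand-in for tcdf, not the calculator's algorithm: it integrates the t density with Simpson's rule using only the standard library, and the function names are hypothetical.

```python
from math import gamma, pi, sqrt

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, t] (steps must be even)."""
    if t < 0:
        return 1 - t_upper_tail(-t, df, steps)
    if t == 0:
        return 0.5
    h = t / steps
    total = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    # Area from 0 to t is total * h / 3; the upper tail is what remains of 0.5.
    return 0.5 - total * h / 3

p = t_upper_tail(12.92, 3)
print(p)
```

Evaluated at the table entry t = 12.92 with 3 degrees of freedom, the result is roughly 0.0005, so any larger t gives a smaller one-sided P-value.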

31 Example Page 794, #14.6 D)Write a sentence to describe your conclusion about the slope of the true regression line. There is very strong evidence that β > 0, that is, that the line is useful for predicting the length of the humerus given the length of the femur

32 Example Page 794, #14.6 E) Determine a 99% confidence interval for the true slope of the regression line.

33 Example Page 794, #14.6 $b \pm t^* SE_b = b \pm 5.841(0.0751) = (0.758, 1.636)$, where t* = 5.841 is the 99% critical value for 3 degrees of freedom.

34 Example Page 794, #14.8 There is some evidence that drinking moderate amounts of wine helps prevent heart attacks. Exercise 3.63 (Page 183) gives data on yearly wine consumption (liters of alcohol from drinking wine, per person) and yearly deaths from heart disease (deaths per 100,000 people) in 19 developed nations. A) Is there statistically significant evidence of a negative association between wine consumption and heart disease deaths? Carry out the appropriate test of significance and write a summary statement about your conclusions.

35 Example Page 794, #14.8

36 Example Page 794, #14.8 Let β be the slope of the true regression line of heart disease deaths on wine consumption; a negative association means β < 0. H_0: β = 0; H_a: β < 0.

37 Example Page 794, #14.8 Linear Regression t-Test Conditions: 1. The observations are independent. 2. The true relationship is linear (check the scatterplot for an overall linear pattern, or plot the residuals against the predicted values). 3. The standard deviation of the response about the true line is the same everywhere (make sure the spread around the line is nearly constant). 4. The response varies normally about the true regression line (the normal probability plot of the residuals is quite straight).

38 Example Page 794, #14.8 $t = \dfrac{b}{SE_b}$, where $SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$. Reject H_0, since the P-value is less than α = 0.05, and conclude that there is a negative linear relationship between wine consumption and heart disease deaths.

39 Example Page 795, #14.10 Exercise 14.4 (page 788) presents data on the relationship between the speed of runners (x, in feet per second) and the number of steps y that they take in a second. Here is part of the Data Desk Regression output for these data: R squared = 99.8% s = with 7 − 2 = 5 degrees of freedom Variable Coefficient s.e. of Coeff t-ratio prob Constant < speed <0.0001

40 Example Page 795, #14.10 R squared = 99.8% s = with 7 − 2 = 5 degrees of freedom Variable Coefficient s.e. of Coeff t-ratio prob Constant < speed <0.0001 A) How can you tell from this output, even without the scatterplot, that there is a very strong straight-line relationship between running speed and steps per second?

41 Example Page 795, #14.10 R squared = 99.8% s = with 7 − 2 = 5 degrees of freedom Variable Coefficient s.e. of Coeff t-ratio prob Constant < speed <0.0001 r² is very close to 1, which means that nearly all the variation in steps per second is accounted for by running speed. Also, the P-value for the test of β = 0 is small.

42 Example Page 795, #14.10 R squared = 99.8% s = with 7 − 2 = 5 degrees of freedom Variable Coefficient s.e. of Coeff t-ratio prob Constant < speed <0.0001 B) What parameter in the regression model gives the rate at which steps per second increase as running speed increases? Give a 99% confidence interval for this rate.

43 Example Page 795, #14.10 R squared = 99.8% s = with 7 − 2 = 5 degrees of freedom Variable Coefficient s.e. of Coeff t-ratio prob Constant < speed <0.0001 β (the slope) is this rate; the estimate b is listed as the coefficient of speed. $b \pm t^* SE_b = b \pm 4.032(0.0016) = (0.074, 0.087)$, where t* = 4.032 is the 99% critical value for 5 degrees of freedom.

44 Lesson 14-2, Part 1 Predictions and Conditions

45 Confidence Intervals Write the given value of the explanatory variable x as x*. The distinction between predicting a single outcome and predicting the mean of all outcomes when x = x* determines what margin of error is correct. To estimate the mean response μ_y = α + βx*, we use a confidence interval. To estimate an individual response y, we use a prediction interval.

46 Confidence Intervals for Regression Response A level C confidence interval for the mean response μ_y when x takes the value x* is $\hat{y} \pm t^* SE_{\hat{\mu}}$. The standard error is $SE_{\hat{\mu}} = s\sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$

47 Prediction Intervals for Regression Response A level C prediction interval for a single observation on y when x takes the value x* is $\hat{y} \pm t^* SE_{\hat{y}}$. The standard error is $SE_{\hat{y}} = s\sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$
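To see why a prediction interval is wider than a confidence interval at the same x*, the two standard errors can be computed side by side. This is a sketch with hypothetical function names and made-up values (five x-values, s = 2.0, x* = 3), not data from the exercises.

```python
from math import sqrt
from statistics import mean

def mean_response_se(xs, s, x_star):
    """Standard error for estimating the mean response at x = x_star."""
    n, x_bar = len(xs), mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return s * sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)

def prediction_se(xs, s, x_star):
    """Standard error for predicting a single observation at x = x_star."""
    n, x_bar = len(xs), mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # The extra 1 accounts for the variation of one individual about the mean.
    return s * sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)

# Hypothetical x-values, residual standard deviation, and target x*
xs = [1, 2, 3, 4, 5]
se_mu = mean_response_se(xs, s=2.0, x_star=3)
se_y = prediction_se(xs, s=2.0, x_star=3)
print(se_mu, se_y)
```

The prediction standard error always exceeds the mean-response standard error, and both grow as x* moves away from the mean of the x-values.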

48 Conditions for Regression Inference The observations are independent The true relationship is linear The standard deviation of the response about the true line is the same everywhere. The response varies normally about the true regression line. Check conditions using the residuals.

49 Examine the residual plot to check that the relationship is roughly linear and that the scatter about the line is the same from end to end.

50 Violations of the regression conditions: The variation of the residuals is not constant.

51 Violations of the regression conditions: There is a curved relationship between the response variable and the explanatory variable.

52 Example Page 802, #14.12 A) The residuals for the crying and IQ data appear in Example 14.3 (page 785). Make a stemplot to display the distribution of the residuals. Are there outliers or signs of strong departures from normality?

53 Example Page 802, #14.12 One residual (51.32) may be a high outlier, but the stemplot does not show any other deviations from normality.

54 Example Page 802, #14.12 B) What other assumptions or conditions are required for using inference for regression on these data? Check that those conditions are satisfied and then describe your findings.

55 Example Page 802, #14.12

56 Example Page 802, #14.12 The scatter of the data points about the regression line varies to some extent as we move along the line, but the variation is not serious, as a residual plot shows. The other conditions can be assumed to be satisfied.

57 Example Page 802, #14.12 C) Would a 95% prediction interval for x = 25 be narrower, the same size, or wider than a 95% confidence interval? Explain your reasoning. A prediction interval would be wider. For a fixed confidence level, the margin of error is always larger when we are predicting a single observation than when we are estimating the mean response.

58 Example Page 802, #14.12 D) A computer package reports that the 95% prediction interval for x = 25 is (91.85, ). Explain what this interval means in simple language. We are 95% confident that when x (crying intensity) = 25, the corresponding value of y (IQ) will lie between the endpoints of this interval.

59 Example Page 802, #14.14 In an earlier exercise (page 795) we regressed the lean of the Leaning Tower of Pisa on year to estimate the rate at which the tower is tilting. Here are the residuals from that regression, in order by years across the rows. Use the residuals to check the regression conditions, and describe your findings. Is the regression in that exercise trustworthy?

61 Example Page 802, #14.14 [Plots: residuals versus year; normal probability plot of the residuals] The scatterplot of the residuals versus year does not suggest any problems. The regression in that exercise should be fairly reliable.

62 Example Page 809, #14.24 Here are data on the time (in minutes) Professor Moore takes to swim 2000 yards and his pulse rate (beats per minute) after swimming. (Data rows: Time, Pulse.)

63 Example Page 809, #14.24 A scatterplot shows a negative linear relationship: a faster time (fewer minutes) is associated with a higher heart rate. Here is part of the output from the regression function in the Excel spreadsheet. Coefficients Standard Error t Stat P-value Intercepts E 07 X variable E 05 Give a 90% confidence interval for the slope of the true regression line. Explain what your result tells us about the relationship between the professor's swimming time and heart rate.

64 Example Page 809, #14.24 Coefficients Standard Error t Stat P-value Intercepts E 07 X variable E 05 $b \pm t^* SE_b = b \pm 1.721(1.8887)$, with t* for 21 degrees of freedom, which gives roughly −13 to −6 bpm per minute. With 90% confidence, we can say that for each 1-minute increase in swimming time, pulse rate drops by 6 to 13 bpm.

65 Example Page 809, #14.24 Using the TI

66 Example Page 809, #14.25 Exercise 14.24 gives data on a swimmer's time and heart rate. One day the swimmer completes his laps in 34.3 minutes but forgets to take his pulse. Minitab gives this prediction for heart rate when x* = 34.3: Fit StDev Fit 90.0% CI 90.0% PI (144.02, ) (135.79, ) A) Verify that Fit is the predicted heart rate from the least-squares line found in Exercise 14.24. Then choose one of the intervals from the output to estimate the swimmer's heart rate that day and explain why you chose this interval.

67 Example Page 809, #14.25 Fit StDev Fit 90.0% CI 90.0% PI (144.02, ) (135.79, ) ŷ (pulse) = a + b·x (time); when x = 34.3 minutes, ŷ (pulse) = a + b(34.3), which agrees with the Fit value in the output.

68 Example Page 809, #14.25 Fit StDev Fit 90.0% CI 90.0% PI (144.02, ) (135.79, ) The prediction interval is appropriate for estimating one value (as opposed to the mean of many values): use the 90.0% PI, whose lower limit is 135.79 bpm.

69 Example Page 809, #14.25 Fit StDev Fit 90.0% CI 90.0% PI (144.02, ) (135.79, ) B) Minitab gives only one of the two standard errors used in prediction. It is SE_μ̂, the standard error for estimating the mean response. Use this fact and a critical value from Table C to verify Minitab's 90% confidence interval for the mean heart rate on days when the swimming time is 34.3 minutes.

70 Example Page 809, #14.25 Fit StDev Fit 90.0% CI 90.0% PI (144.02, ) (135.79, ) $\hat{y} \pm t^* SE_{\hat{\mu}} = \hat{y} \pm 1.721(1.97) = \hat{y} \pm 3.39$, with t* for 21 degrees of freedom, which agrees with the computer output.


More information

17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II 17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

Chapter 9 Descriptive Statistics for Bivariate Data

Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to eplore the distribution of the values of a single variable)

More information

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles. Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible

More information

4. Multiple Regression in Practice

4. Multiple Regression in Practice 30 Multiple Regression in Practice 4. Multiple Regression in Practice The preceding chapters have helped define the broad principles on which regression analysis is based. What features one should look

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares Linear Regression Chapter 5 Regression Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). We can then predict the average response for all subjects

More information

y = a + bx Chapter 10: Horngren 13e The Dependent Variable: The cost that is being predicted The Independent Variable: The cost driver

y = a + bx Chapter 10: Horngren 13e The Dependent Variable: The cost that is being predicted The Independent Variable: The cost driver Chapter 10: Dt Determining ii How Costs Behave Bh Horngren 13e 1 The Linear Cost Function y = a + bx The Dependent Variable: The cost that is being predicted The Independent Variable: The cost driver The

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

More information

Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0. Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

More information

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Regression and Correlation

Regression and Correlation Regression and Correlation Topics Covered: Dependent and independent variables. Scatter diagram. Correlation coefficient. Linear Regression line. by Dr.I.Namestnikova 1 Introduction Regression analysis

More information

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Do

More information

Lean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY

Lean Six Sigma Analyze Phase Introduction. TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY TECH 50800 QUALITY and PRODUCTIVITY in INDUSTRY and TECHNOLOGY Before we begin: Turn on the sound on your computer. There is audio to accompany this presentation. Audio will accompany most of the online

More information

Interaction between quantitative predictors

Interaction between quantitative predictors Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

More information

Correlation and Regression Analysis: SPSS

Correlation and Regression Analysis: SPSS Correlation and Regression Analysis: SPSS Bivariate Analysis: Cyberloafing Predicted from Personality and Age These days many employees, during work hours, spend time on the Internet doing personal things,

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

More information

Logs Transformation in a Regression Equation

Logs Transformation in a Regression Equation Fall, 2001 1 Logs as the Predictor Logs Transformation in a Regression Equation The interpretation of the slope and intercept in a regression change when the predictor (X) is put on a log scale. In this

More information

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS. SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

More information

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences. 1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance 2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

More information

Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables

Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables Scatterplot; Roles of Variables 3 Features of Relationship Correlation Regression Definition Scatterplot displays relationship

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

More information

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

More information

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there is a relationship between variables, To find out the

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

Chapter 4 and 5 solutions

Chapter 4 and 5 solutions Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,

More information

Correlation key concepts:

Correlation key concepts: CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

More information

An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression

An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression Chapter 9 Simple Linear Regression An analysis appropriate for a quantitative outcome and a single quantitative explanatory variable. 9.1 The model behind linear regression When we are examining the relationship

More information

Statistics 151 Practice Midterm 1 Mike Kowalski

Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Ordinal Regression. Chapter

Ordinal Regression. Chapter Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

More information

Example G Cost of construction of nuclear power plants

Example G Cost of construction of nuclear power plants 1 Example G Cost of construction of nuclear power plants Description of data Table G.1 gives data, reproduced by permission of the Rand Corporation, from a report (Mooz, 1978) on 32 light water reactor

More information