# Chapters 2 and 10: Least Squares Regression


In Chapters 2 and 10 we will be looking at the relationship between two quantitative variables measured on the same individual.

General procedure:

1. Make a scatterplot and describe the form, direction, and strength of the relationship. Note: fitting a line only makes sense if the overall pattern of the scatterplot is roughly linear.
2. Look for outliers and influential observations on the scatterplot. Note: inference is not safe if there are influential points, since the results depend strongly on these few points. It is often helpful to rework the data without the influential observations and compare the results.
3. Find the correlation r to get a numerical measure of the direction and strength of the linear relationship.
4. Find r², the fraction of variation in the values of y that is explained by the least squares regression of y on x.
5. If the data is reasonably linear, find the least squares regression line for the data. Note: the line can be used to predict y for a given x.
6. Make a residual plot and a normal probability plot to check the regression assumptions.
7. If and only if your data was collected using random sampling techniques, you can look at the hypothesis tests and confidence intervals for the correlation, slope, and intercept.
8. If and only if your data was collected using random sampling techniques, you can look at the hypothesis tests and confidence intervals for the mean response, and at prediction intervals.

Association Between Variables: Two variables measured on the same individuals are associated if some values of one variable tend to occur more often with some values of the second variable than with other values of that variable. Just because two variables are associated doesn't mean that a change in one variable causes a change in the other (causation, Section 2.6). Also, the relationship between two variables might not tell the whole story. Other variables may affect the relationship; these other variables are called lurking variables.

Positive association: when above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.

Negative association: when above-average values of one variable tend to accompany below-average values of the other, and vice versa.

No association: it is hard to find a pattern showing a relationship between the variables.

Response variable: measures an outcome of a study. (Dependent variable, Y.)

Explanatory variable: explains or causes changes in the response variable. (Independent variable, X.)

Example 1: A forester has become adept at estimating the volume (in cubic feet) of trees on a particular site prior to a timber sale. Since his operation has now expanded, he would like to train another person to assist in estimating the cubic foot volume of trees. He decided to create a model that will allow him to obtain the actual tree volume based on his assistant's estimation. The forester selects a random sample of trees to be felled. For each tree, the assistant is to guess the cubic foot volume of the tree. The forester also obtains the actual cubic foot volume after the tree has been chopped down. Below is his data:

[Data table: Tree, Estimated Volume, Actual Volume — the numeric values were lost in this copy.]

STEP 1: Make a scatterplot; describe the form, direction, and strength of the relationship.

Before doing the scatterplot you need to decide which variable is the explanatory variable and which is the response variable. For Example 1, identify the explanatory and the response variables.

Explanatory variable: Response variable:

A scatterplot shows the relationship between two quantitative variables measured on the same individual. The explanatory variable is plotted on the x axis; the response variable is plotted on the y axis. Look at the overall pattern. The overall pattern can be described by form, direction, and strength.

Form: is the scatterplot linear, quadratic, etc.? Direction: is the association positive or negative? Strength: how strong is the relationship?

Describe the scatterplot in Example 1. Form: Strength: Direction:

Scatterplot Using SPSS: >Graphs >Scatter/Dot. Select Simple and click Define. Pull Estimate into the X Axis box and Actual into the Y Axis box, then click OK. Note: to get the fitted line you need to double click on your graph to bring up the chart editor. You will then need to click on the button that looks like a scatterplot with a fitted line through it. Select Linear and then Close.

[Scatterplot: Estimated Volume versus Actual Volume of Trees, with fitted line and R Sq Linear value.]

STEP 2: Look for outliers and influential observations on the scatterplot.

Look for striking deviations from the overall pattern.

Outlier: an observation that lies outside of the overall pattern of the other observations. Points that are outliers in the y direction of a scatterplot have large regression residuals, but other outliers need not have large residuals.

Influential observation: an observation that, if removed, would markedly change the results of the regression calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least squares regression line.

Are there any outliers or influential observations in our data?

Note: to add a categorical variable to a scatterplot, use a different plot color or symbol for each category.

STEP 3: Find the correlation r.

The correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r:

r = (1/(n − 1)) Σ [(xᵢ − x̄)/s_x] [(yᵢ − ȳ)/s_y]

Properties of correlation:

- It makes no difference which variable you call x and which you call y, since correlation does not make use of the distinction between the explanatory variable and the response variable.
- Both variables need to be quantitative to calculate correlation.
- The correlation r does not change if we change the units of measurement of x, y, or both.
- A positive r corresponds to a positive relationship between the variables; a negative r corresponds to a negative relationship (−1 ≤ r ≤ 1). Values near 0 indicate a weak relationship, and values close to −1 or 1 indicate a strong relationship.
- Correlation measures the strength of only a linear relationship.
- Like the mean and standard deviation, the correlation is not resistant. The correlation r is strongly affected by a few outlying observations. Use r with caution when outliers appear in the scatterplot.
- Correlation is not a complete description of two-variable data. You should give the means and standard deviations of both x and y along with the correlation.

For Example 1, find the correlation between estimated volume and actual volume. We will use the Pearson correlation output from SPSS here.
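The correlation formula is straightforward to compute directly. Here is a minimal Python sketch; the volumes used below are made-up numbers for illustration, not the tree data from Example 1 (whose printed values are not reproduced in this copy):

```python
from math import sqrt

def pearson_r(x, y):
    """Sample correlation: average of standardized products, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample standard deviations (divide by n - 1)
    s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    return sum((xi - mean_x) / s_x * (yi - mean_y) / s_y
               for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical estimated vs. actual volumes (illustrative only)
est = [12.0, 14.0, 9.0, 20.0, 16.0]
act = [11.5, 14.8, 8.7, 21.2, 15.9]
print(round(pearson_r(est, act), 4))
```

Note that swapping `est` and `act` gives the same value of r, as the first property above promises.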

Note: the SPSS manual tells you where to find r using the least squares regression output, but that value is actually the ABSOLUTE VALUE of r, so you need to figure out the sign yourself by looking at the association (positive or negative) of your data.

[SPSS Correlations table: Pearson correlation between Estimate and Actual = .936; Sig. (2-tailed) reported. **. Correlation is significant at the 0.01 level (2-tailed).]

STEP 4: Find r².

r² is the percent of variation in y explained by the regression line (the closer to 100%, the better). We can get this from the regression output by squaring the correlation r. For Example 1, find the percent of variation in actual volume of trees explained by the regression line.

STEP 5: Find the least squares regression line for the data.

The least squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.

We have data on an explanatory variable x and a response variable y for n individuals. The means and standard deviations of the data are x̄ and s_x for x, and ȳ and s_y for y; the correlation between x and y is r.

The regression model for the population is: yᵢ = β₀ + β₁xᵢ + εᵢ

The sample prediction equation of the least squares regression is: ŷ = b₀ + b₁x

The slope is: b₁ = r (s_y / s_x)

The slope measures the change in the predicted response when the explanatory variable is increased by one unit.

The intercept is: b₀ = ȳ − b₁x̄
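The slope and intercept formulas can be sketched in code. The Python snippet below uses hypothetical numbers (the data values from Example 1 are not reproduced in this copy) and computes b₁ from sums of squares, which is algebraically equivalent to b₁ = r·s_y/s_x:

```python
def least_squares(x, y):
    """Fit y-hat = b0 + b1*x, with b1 = Sxy/Sxx (equivalent to r*s_y/s_x)
    and b0 = ybar - b1*xbar."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical estimated vs. actual volumes (illustrative only)
est = [12.0, 14.0, 9.0, 20.0, 16.0]
act = [11.5, 14.8, 8.7, 21.2, 15.9]
b0, b1 = least_squares(est, act)
print(round(b0, 3), round(b1, 3))
```

Prediction then is just ŷ = b0 + b1 * x for an x inside the range of the data.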

Using SPSS: >Analyze >Regression >Linear. Put Estimate into the independent box and Actual into the dependent box. Click OK.

[SPSS output: Model Summary (R = .936, with R Square, Adjusted R Square, and Std. Error of the Estimate), ANOVA table, and Coefficients table for the regression of Actual on Estimate; the numeric values were lost in this copy.]

For Example 1, find the least squares regression line. Based on the r² value you found previously, do you think this line will be useful for predicting actual tree volumes?

For Example 1, use the regression line to predict the actual volume of a tree with an estimated volume of 3 cubic feet.

We can use a regression line to make predictions as long as we follow these rules:

- Only use the least squares regression line to find y for a specific value of x. (Don't use it to find x for a specific value of y!)
- Extrapolation involves using the line to find y values corresponding to x values that are outside the range of our data's x values. Typically we want to avoid this, since the line may not be valid for wide ranges of x values.

Example 2: (From Moore and McCabe, fourth edition.) During the period after birth, a male white rat gains 40 grams (g) per week until about 10 weeks of age. (This rat is unusually regular in his growth, but 40 g per week is a realistic rate.)

a. If the rat weighed 100 g at birth, give an equation for his weight after x weeks. What is the slope of this line?

b. Would you be willing to use this line to predict the rat's weight at age 2 years? Do the prediction and think about the reasonableness of the results. (There are 454 g in a pound. To help you assess the results, a large cat weighs about 10 pounds.)

Prediction intervals: predicting a future observation under conditions similar to those used in the study. Since there is variability involved in using a model created from sample data, a prediction interval is better than a single prediction. They're related to confidence intervals. Use SPSS to calculate these intervals.

Residual: the vertical distance between the observed y value and the corresponding predicted y value:

eᵢ = yᵢ − ŷᵢ = yᵢ − (b₀ + b₁xᵢ)

For Example 1, find the residual for tree number 1 and tree number 7.

Assumptions for regression inference and the regression model:

1. Repeated responses of y are independent of each other. This basically means the data comes from a simple random sample. (To check this assumption, examine the way in which the units were selected.)
2. For any fixed value of x, the response y varies according to a normal distribution. (To check the assumption of normality you can do a normal probability plot of the residuals in SPSS.)
3. The relationship is linear. (To check the linearity assumption, you can make a scatterplot or a residual plot of the data.)
4. The standard deviation σ of y is the same for all values of x. The value of σ is unknown. (To check for constant variability you can look at the residual plot of the data.)
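Part (b) of Example 2 is an extrapolation, and a small Python sketch makes its point concrete. Taking 2 years as 104 weeks is an assumption on our part; the 100 g birth weight, 40 g/week rate, and 454 g per pound come from the problem:

```python
# Growth line from Example 2(a): predicted weight = 100 + 40 * weeks (grams)
def predicted_weight(weeks):
    return 100 + 40 * weeks

weeks = 2 * 52                    # 2 years, assuming 52 weeks per year
grams = predicted_weight(weeks)
pounds = grams / 454              # 454 g per pound, from the problem
print(grams, round(pounds, 1))    # far past the 10-week range the line describes
```

The line predicts a rat roughly the weight of a large cat, which is exactly why extrapolating far beyond the range of the data (here, about 10 weeks) is unreliable.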

STEP 6: Make a residual plot and normal probability plot to check the regression assumptions.

It is always important to check that the assumptions of the regression model have been met to determine whether your results are valid. This is also important to do before you proceed with inference.

Normal probability plots: if your points fall in a relatively straight line, then you can assume that your response is relatively normal and the second assumption has been met. To check the normality assumption we make a normal probability plot by doing the following:

Using SPSS: >Analyze >Regression >Linear. Put Estimate into the independent box and Actual into the dependent box. Then select Plots, click on the box for Normal probability plot, and click Continue followed by OK.

[Normal P-P plot of regression standardized residual; dependent variable: Actual; expected versus observed cumulative probability.]

For Example 1, has the normality assumption been met?

Residual plots: a residual plot is a scatterplot of the regression residuals against the explanatory variable. It is used to assess the fit of a regression line and to check for constant variability. The residual plot magnifies the deviations from the line to make the pattern easier to read. If the points are random with no pattern, with approximately the same number of points above and below the center line, you can feel confident that assumptions three and four have been met. If you have a funnel shape, the assumption of constant variance has not been met. If you have some other pattern, like a parabola, the linearity assumption has not been met.

Using SPSS: >Analyze >Regression >Linear. Put Estimate into the independent box and Actual into the dependent box. Click on the Save button, check Unstandardized residuals, and click Continue. The residuals will appear in the data editor. Make a scatterplot of the residuals on the y axis against the estimated volume on the x axis. To get the line at y = 0, when in the chart editor right click on the graph and select Add Y Axis Reference Line. Then select the reference line and plug in a zero for the y axis position.

[Residual plot: unstandardized residuals versus Estimate.]

For Example 1, have the assumptions of linearity and constant variability been met?

Lastly, it is important to check for outliers and influential observations. Looking for large residuals, or points that are far from the other points, is important. Often we will want to do the analysis both with and without the outliers, particularly if they are influential observations as well. Are there any outliers or influential observations in Example 1?

Scatterplot Without the Influential Observation

[Scatterplot: Actual versus Estimate with the influential observation removed, with fitted line and R Sq Linear value. SPSS output: Model Summary (R = .86) and Coefficients table for the regression of Actual on Estimate; the remaining numeric values were lost in this copy.]

STEP 7: Look at the hypothesis tests and confidence intervals.

Up until this point we have looked at some regression-related concepts that can be used in an exploratory data analysis setting as well as a more formal setting. We will now look at inference for regression. Before we do this, however, it must be understood that the tests and confidence intervals that we find from now on can only be found on data that has been collected using a random sampling technique, such as simple random sampling. If we did not collect our data using a random sample, or if we have conducted a census, these techniques are meaningless.

Test for a Zero Population Correlation:

State the null and alternative hypotheses: H₀: ρ = 0 versus Hₐ: ρ ≠ 0, Hₐ: ρ > 0, or Hₐ: ρ < 0.

Find the test statistic: t = r√(n − 2) / √(1 − r²), where n is the sample size and r is the sample correlation.

Calculate the P-value in terms of a random variable T having the t(n − 2) distribution. The P-value for a test of H₀ against:
- Hₐ: ρ > 0 is P(T ≥ t)
- Hₐ: ρ < 0 is P(T ≤ t)
- Hₐ: ρ ≠ 0 is 2P(T ≥ |t|)

Compare the P-value to the α level: if P-value ≤ α, then reject H₀; if P-value > α, then fail to reject H₀.

State your conclusions in terms of the problem.

For Example 1, test H₀: ρ = 0 versus Hₐ: ρ > 0.
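The correlation test statistic is easy to compute once r and n are known. A minimal Python sketch, using r = .936 from the SPSS output and a hypothetical sample size of n = 10 (the actual number of trees is not reproduced in this copy):

```python
from math import sqrt

def correlation_t_stat(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), compared to a t(n - 2) distribution."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# r from the SPSS Correlations output; n = 10 is an assumed sample size
print(round(correlation_t_stat(0.936, 10), 2))
```

The resulting t would then be compared to a t table with n − 2 = 8 degrees of freedom to get the P-value.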

Confidence Intervals for Regression Slope and Intercept:

A level C confidence interval for the intercept β₀ is: b₀ ± t* SE(b₀)

A level C confidence interval for the slope β₁ is: b₁ ± t* SE(b₁)

SPSS will give you these confidence intervals for 95%, but you may have to use the estimates for the coefficients and their standard errors to find the other confidence intervals. (Use the t table with n − 2 degrees of freedom to get t*.)

Using SPSS: >Analyze >Regression >Linear. Put Estimate into the independent box and Actual into the dependent box. Click on Statistics, select Confidence intervals, and click Continue followed by OK.

[SPSS Coefficients table for the regression of Actual on Estimate, including the 95% confidence intervals for B; the numeric values were lost in this copy.]

For Example 1, find the 95% and 99% confidence intervals for the slope and y intercept.
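Both intervals have the same estimate ± t*·SE form, so one helper covers them. The slope and standard error below are hypothetical values (the SPSS numbers are not reproduced in this copy); t* = 2.306 is the 95% two-sided critical value for df = 8:

```python
def coef_ci(estimate, se, t_star):
    """Level-C interval: estimate +/- t* * SE, with t* from the t(n - 2) table."""
    return (estimate - t_star * se, estimate + t_star * se)

# Hypothetical slope b1 = 1.135 with SE = 0.132, assuming n = 10 trees (df = 8)
low, high = coef_ci(1.135, 0.132, 2.306)
print(round(low, 3), round(high, 3))
```

For the 99% interval you would keep the same estimate and SE and only swap in the 99% t* from the table.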

Hypothesis test for the regression slope:

State the null and alternative hypotheses: H₀: β₁ = 0 versus Hₐ: β₁ ≠ 0, Hₐ: β₁ > 0, or Hₐ: β₁ < 0.

Find the test statistic: t = b₁ / SE(b₁), with df = n − 2. (SPSS will give you the test statistic.)

Calculate the P-value. (SPSS will give you the two-sided P-value. If you have a one-sided test, you will have to divide the P-value by 2.)

Compare the P-value to the α level: if P-value ≤ α, then reject H₀; if P-value > α, then fail to reject H₀.

State your conclusions in terms of the problem.

The test statistic for correlation is numerically identical to the test statistic used to test the slope. Therefore, you can read the test statistic and P-value off of the SPSS output for the slope when doing a test for the correlation.

For Example 1, perform a significance test to see whether the slope of the regression line is positive.
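The slope test statistic is simply the coefficient divided by its standard error. A Python sketch with the same hypothetical slope and standard error used above (the SPSS output numbers are not reproduced in this copy):

```python
def slope_t_stat(b1, se_b1):
    """Test statistic for H0: beta1 = 0: t = b1 / SE(b1), with df = n - 2."""
    return b1 / se_b1

# Hypothetical slope and standard error (illustrative only)
print(round(slope_t_stat(1.135, 0.132), 2))
```

For a one-sided test (slope positive), you would halve the two-sided P-value SPSS reports, after checking that the sign of t matches the direction of Hₐ.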

Example 4: This example will use data that is part of a data set from Dr. T.N.K. Raju, Department of Neonatology, University of Illinois at Chicago.

IMR = infant mortality rate. PQLI = Physical Quality of Life Index (an indicator of average wealth).

[Data table: Case, PQLI, IMR — the numeric values were lost in this copy.]

How does the physical quality of life index affect the infant mortality rate? Answer the questions below based on the output that follows.

a. Describe the form, direction, and strength of the relationship.
b. What is the correlation?
c. What percent of the variation in infant mortality rate is explained by the regression line?
d. Give an estimate for the standard deviation of the model. (Find s.)
e. Do a hypothesis test to test H₀: β₁ = 0 versus Hₐ: β₁ ≠ 0.
f. What is the equation of the least squares regression line?
g. Use the regression line to predict the IMR for a PQLI of 25.

h. Is the prediction in part g good? Why?
i. Find the residual for case 1.
j. Find a 99% confidence interval for the slope.
k. What assumptions need to be met for the above to be of use?

[Scatterplot: How Physical Quality of Life Affects Infant Mortality Rate, IMR versus PQLI. SPSS output: Model Summary (R = .300) and Coefficients table for the regression of IMR on PQLI, including the 95% confidence interval for B; the numeric values were lost in this copy.]

[Residual plot: unstandardized residuals versus PQLI. Normal P-P plot of regression standardized residual; dependent variable: IMR.]

Does the relationship make sense? For example, does it make sense that the infant mortality rate will go up as the physical quality of life index gets better? What could be a potential lurking variable here?

Now let's look at what happens if we add a categorical variable to the picture.

[Data table: Case, PQLI, IMR, Location (rural/urban) — most numeric values were lost in this copy.]

[Scatterplot: How Physical Quality of Life Affects Infant Mortality Rate, IMR versus PQLI, with separate symbols for urban and rural locations.]


### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

### Pearson s correlation

Pearson s correlation Introduction Often several quantitative variables are measured on each member of a sample. If we consider a pair of such variables, it is frequently of interest to establish if there

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Introductory Statistics Notes

Introductory Statistics Notes Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone: (205) 348-4431 Fax: (205) 348-8648 August

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### Chapter 9. Section Correlation

Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

### Linear Models in STATA and ANOVA

Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

### Simple Regression and Correlation

Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas

### Chapter 10 - Practice Problems 1

Chapter 10 - Practice Problems 1 1. A researcher is interested in determining if one could predict the score on a statistics exam from the amount of time spent studying for the exam. In this study, the

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### Homework 8 Solutions

Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (a-d), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.

### Correlation key concepts:

CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

### The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

### Relationships Between Two Variables: Scatterplots and Correlation

Relationships Between Two Variables: Scatterplots and Correlation Example: Consider the population of cars manufactured in the U.S. What is the relationship (1) between engine size and horsepower? (2)

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### Once saved, if the file was zipped you will need to unzip it.

1 Commands in SPSS 1.1 Dowloading data from the web The data I post on my webpage will be either in a zipped directory containing a few files or just in one file containing data. Please learn how to unzip

### Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

### We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Do

### Using R for Linear Regression

Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

### An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM

EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable

### AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

### COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

### FEV1 (litres) Figure 1: Models for gas consumption and lung capacity

Simple Linear Regression: Reliability of predictions Richard Buxton. 2008. 1 Introduction We often use regression models to make predictions. In Figure 1 (a), we ve fitted a model relating a household

### Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Independent t- Test (Comparing Two Means)

Independent t- Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent t-test when to use the independent t-test the use of SPSS to complete an independent

### MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,

### SPSS Explore procedure

SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

### Predictability Study of ISIP Reading and STAAR Reading: Prediction Bands. March 2014

Predictability Study of ISIP Reading and STAAR Reading: Prediction Bands March 2014 Chalie Patarapichayatham 1, Ph.D. William Fahle 2, Ph.D. Tracey R. Roden 3, M.Ed. 1 Research Assistant Professor in the

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4