Chapters 2 and 10: Least Squares Regression

Learning goals for this chapter:
- Describe the form, direction, and strength of the relationship shown in a scatterplot.
- Use SPSS output to find the following: least-squares regression line, correlation, $r^2$, and estimate for σ.
- Interpret a scatterplot, residual plot, and Normal probability plot.
- Calculate the predicted response and residual for a particular x-value.
- Understand that least-squares regression is only appropriate if there is a linear relationship between x and y.
- Determine explanatory and response variables from a story.
- Use SPSS to calculate a prediction interval for a future observation.
- Perform a hypothesis test for the regression slope and for zero population correlation/independence, including: stating the null and alternative hypotheses, obtaining the test statistic and P-value from SPSS, and stating the conclusions in terms of the story.
- Understand that correlation and causation are not the same thing.
- Estimate correlation for a scatterplot display of data.
- Distinguish between prediction and extrapolation.
- Check for differences between outliers and influential outliers by rerunning the regression.
- Know that scatterplots and regression lines are based on sample data, but hypothesis tests and confidence intervals give you information about the population parameter.

When you have 2 quantitative variables and you want to look at the relationship between them, use a scatterplot. If the scatterplot looks linear, then you can do least-squares regression to get an equation of a line that uses x to explain what happens with y.

The general procedure:
1. Make a scatterplot of the data from the x and y variables. Describe the form, direction, and strength. Look for outliers.
2. Look at the correlation to get a numerical value for the direction and strength.
3. If the data are reasonably linear, get an equation of the line using least-squares regression.
4. Look at the residual plot to see if there are any outliers or the possibility of lurking variables. (Patterns bad, randomness good.)

5. Look at the Normal probability plot to determine whether the residuals are Normally distributed. (The dots sticking close to the 45-degree line is good.)
6. Look at hypothesis tests for the correlation, slope, and intercept. Look at confidence intervals for the slope, intercept, and mean response, and at the prediction intervals.
7. If you had an outlier, you should re-work the data without the outlier and comment on the differences in your results.

Association:
- Positive, negative, or no association.
- Remember: ASSOCIATION or CORRELATION is NOT the same thing as CAUSATION. (See chapter 3/2.5 notes.)

Response variable: Y
- Dependent variable
- Measures an outcome of a study

Explanatory variable: X
- Independent variable
- Explains or is related to changes in the response variable (p. 05)

Scatterplots:
- Show the relationship between 2 quantitative variables measured on the same individuals.
- Dots only; don't connect them with a line or a curve.
- Form: Linear? Non-linear? No obvious pattern?
- Direction: Positive or negative association? No association?
- Strength: How closely do the points follow a clear form? Strong, weak, or moderate?
- Look for OUTLIERS!

Correlation: r measures the direction and strength of the linear relationship between 2 quantitative variables. It is computed from the standardized values of the observations, that is, each x and each y measured relative to its own mean and standard deviation.

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$

where we have data on variables x and y for n individuals. You won't need to use this formula, but SPSS will.

Using SPSS to get correlation: Use the Pearson Correlation output. Analyze --> Correlate --> Bivariate (see page 55 in the SPSS manual). The SPSS manual tells you where to find r using the least-squares regression output, but this r is actually the ABSOLUTE VALUE OF r, so you need to pay attention to the direction yourself. The Pearson Correlation gives you the actual r with the correct sign.

Properties of correlation:
- X and Y both have to be quantitative.
- It makes no difference which you call X and which you call Y.
- r does not change when you change the units of measurement.
- If r is positive, there is a positive association between X and Y: as X increases, Y increases.
- If r is negative, there is a negative association between X and Y: as X increases, Y decreases.
- $-1 \le r \le 1$.
- The closer r is to 1 or to -1, the stronger the linear relationship.
- The closer r is to 0, the weaker the linear relationship.
- Outliers strongly affect r. Use r with caution if outliers are present.
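To make the correlation formula and a few of its properties concrete, here is a minimal Python sketch. The data values are made up for illustration (they are not the corn data), and the function name `correlation` is just a label chosen here; SPSS does this calculation for you.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r: sum of products of standardized x and y values, over n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Made-up illustration data (an assumption, not taken from the notes).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation(x, y)
r_swapped = correlation(y, x)                       # swap which variable is X and which is Y
r_rescaled = correlation([2.54 * v for v in x], y)  # change the units of x

print(round(r, 4), round(r_swapped, 4), round(r_rescaled, 4))
```

All three printed values should agree, which matches the properties that r does not depend on which variable is called X or on the units of measurement.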

Example: We want to examine whether the amount of rainfall per year increases or decreases corn bushel output. A sample of 10 observations was taken, and the amount of rainfall (in inches) was measured, as was the subsequent growth of corn.

[Data table: Amount of Rain, Bushels of Corn]

[Scatterplot: corn yield (bushels) versus amount of rain (in)]

a) What does the scatterplot tell us? What is the form? Direction? Strength? What do we expect the correlation to be?

[SPSS output: Correlations. Pearson Correlation between amount of rain (in) and corn yield (bushels) = .995(**), Sig. (2-tailed) = .000, N = 10. ** Correlation is significant at the 0.01 level (2-tailed).]

Inference for Correlation:
- r = correlation
- $R^2$ = % of variation in Y explained by the regression line (the closer to 100%, the better)
- ρ (Greek letter rho) = correlation for the population

When ρ = 0, there is no linear association in the population, so X and Y are independent (provided X and Y are jointly Normally distributed).

Hypothesis test for correlation: To test the null hypothesis $H_0: \rho = 0$, SPSS will compute the t statistic

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}},$$

with degrees of freedom = n - 2 for simple linear regression.

b) Are corn yield and rain independent in the population? Perform a test of significance to determine this.

c) Do corn yield and rain have a positive correlation in the population? Perform a test of significance to determine this.

This test statistic for the correlation is numerically identical to the t statistic used to test $H_0: \beta_1 = 0$.

Can we do better than just a scatterplot and the correlation in describing how x and y are related? What if we want to predict y for other values of x?
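As a rough check on what SPSS computes for the correlation test, the t statistic can be worked out by hand. The sketch below plugs in r = .995 from the Correlations output; the sample size n = 10 is an assumption here, and the function name `correlation_t` is hypothetical. The P-value still has to come from SPSS or a t table.

```python
import math


def correlation_t(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t, n - 2


# r is taken from the SPSS output; n = 10 is an assumed sample size.
t, df = correlation_t(0.995, 10)
print(f"t = {t:.2f} with df = {df}")
```

A t statistic this large on so few degrees of freedom goes with a very small P-value, which agrees with the Sig. value of .000 that SPSS reports.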

Least-Squares Regression fits a straight line through the data points that minimizes the sum of the squared vertical distances of the data points from the line. It minimizes

$$\sum_{i=1}^{n} (e_i)^2$$

Equation of the line is: $\hat{y} = b_0 + b_1 x$, where $\hat{y}$ is the predicted y on the line.

Slope of the line is: $b_1 = r \, \dfrac{s_y}{s_x}$, where the slope measures the amount by which the predicted response variable changes when the explanatory variable is increased by one unit.

Intercept of the line is: $b_0 = \bar{y} - b_1 \bar{x}$, where the intercept is the value of the predicted response variable when the explanatory variable = 0.

Least-squares regression line notation:
- Ch. 10, sample: equation of line $\hat{y} = b_0 + b_1 x$, slope $b_1$, y-intercept $b_0$.
- Ch. 10, population (model): equation of line $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, slope $\beta_1$, y-intercept $\beta_0$.

Using the corn example, find the least-squares regression line. Tell SPSS to do Analyze --> Regression --> Linear. Put rain into the independent box and corn into the dependent box. Click OK.

[SPSS output: Model Summary (R = .995, with R Square, Adjusted R Square, and Std. Error of the Estimate), ANOVA (Sum of Squares, df, Mean Square, F, Sig. for Regression, Residual, and Total), and Coefficients (B, Std. Error, Beta, t, Sig., and 95% Confidence Interval for B, for (Constant) and amount of rain (in)). Predictors: (Constant), amount of rain (in). Dependent Variable: corn yield (bushels).]
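The slope and intercept formulas above can be tried out on a small data set. This is only a sketch with made-up numbers (not the corn data), and the helper names `least_squares_line` and `sum_squared_errors` are chosen here for illustration. It also checks that nudging the line away from the least-squares answer makes the sum of squared residuals larger.

```python
from statistics import mean, stdev


def least_squares_line(xs, ys):
    """Return (b0, b1) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_errors(xs, ys, b0, b1):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustration data (an assumption, not taken from the notes).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(x, y)
best = sum_squared_errors(x, y, b0, b1)
print(f"y-hat = {b0:.2f} + {b1:.2f} x, sum of squared residuals = {best:.3f}")

# Any other line should do worse on the least-squares criterion.
print(sum_squared_errors(x, y, b0 + 0.1, b1) > best)
print(sum_squared_errors(x, y, b0, b1 + 0.1) > best)
```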

d) What is the least-squares regression line equation?

The scatterplot with the least-squares regression line looks like:

[Scatterplot with fitted regression line: corn yield (bushels) versus amount of rain (in)]

$R^2$ is the percent of variation in corn yield explained by the regression line with rain: 99.06%.

Hypothesis testing for $H_0: \beta_1 = 0$

Test statistic: $t = \dfrac{b_1}{SE_{b_1}}$ with df = n - 2

SPSS will give you the test statistic (under t), and the 2-sided P-value (under Sig.).

e) Is the slope positive in the population? Perform a test of significance.

f) What % of the variability in corn yield is explained by the least-squares regression line?

g) What is the estimate of the standard error of the model?
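The quantities asked for in e) through g) all come out of the same few sums. The sketch below uses made-up data (not the corn data) and a hypothetical helper `slope_test`, and it relies on the standard formulas $s = \sqrt{\sum e_i^2 / (n-2)}$ and $SE_{b_1} = s / \sqrt{\sum (x_i - \bar{x})^2}$, which are not spelled out in these notes. It also illustrates that the t statistic for the slope matches the t statistic for the correlation.

```python
import math
from statistics import mean


def slope_test(xs, ys):
    """Slope t statistic, correlation t statistic, r squared, and s for a simple regression."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

    s = math.sqrt(sse / (n - 2))   # estimate of sigma (standard error of the model)
    se_b1 = s / math.sqrt(sxx)     # standard error of the slope
    r = sxy / math.sqrt(sxx * syy)

    return {
        "t_slope": b1 / se_b1,
        "t_corr": r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2),
        "r_squared": r ** 2,
        "s": s,
        "df": n - 2,
    }


# Made-up illustration data (an assumption, not taken from the notes).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

result = slope_test(x, y)
for name, value in result.items():
    print(name, round(value, 4))
```

On any data set the two t statistics should agree up to rounding, which is why the P-values for the slope test and the correlation test are the same.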

What do we mean by prediction or extrapolation?

Use your least-squares regression line to find y for other x-values.
- Prediction: using the line to find y-values corresponding to x-values that are within the range of your data x-values.
- Extrapolation: using the line to find y-values corresponding to x-values that are outside the range of your data x-values.
- Be careful about extrapolating y-values for x-values that are far away from the x data you currently have. The line may not be valid for wide ranges of x!

Example: On the rain/corn data above, predict the corn yield for
a) 5 inches of rain
b) 7.2 inches of rain
c) 0 inches of rain
d) 100 inches of rain
e) For which amounts of rainfall above do you think the line does a good job of predicting actual corn yield? Why?

[Cartoon by J.B. Landers, used with permission]
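The difference between prediction and extrapolation comes down to whether the new x-value sits inside the range of the x data. Here is a small sketch of that check; the intercept, slope, and data range used below are hypothetical placeholders, not the fitted corn line.

```python
def predict(b0, b1, x_new, x_min, x_max):
    """Return the predicted y and whether it is a prediction or an extrapolation."""
    y_hat = b0 + b1 * x_new
    kind = "prediction" if x_min <= x_new <= x_max else "extrapolation"
    return y_hat, kind


# Hypothetical line and observed x range, chosen only to illustrate the idea.
b0, b1 = 50.0, 10.0
x_min, x_max = 3.0, 8.0

for x_new in [5, 7.2, 0, 100]:
    y_hat, kind = predict(b0, b1, x_new, x_min, x_max)
    print(f"x = {x_new}: y-hat = {y_hat:.1f} ({kind})")
```

The line happily returns a number for any x, so it is up to you to notice when an x-value is far outside the data and the answer should not be trusted.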

Prediction Intervals

A prediction interval predicts a future observation under conditions similar to those used in the study. Since there is variability involved in using a model created from sample data, a prediction interval is better than a single prediction. Prediction intervals are related to confidence intervals. Use SPSS.

The 95% prediction interval for future corn yield measurements when rain = 5. is (96.90, 03.4).

Assumptions for Regression:
1. Repeated responses y are independent of each other.
2. For any fixed value of x, the response y varies according to a Normal distribution.
3. The mean response has a straight-line relationship with x: $\mu_y = \beta_0 + \beta_1 x$.
4. The standard deviation of y (σ) is the same for all values of x. The value of σ is unknown.

How do you check these assumptions?

Scatterplot and $R^2$: Do you have a straight-line relationship between X and Y? How strong is it? How close to 100% is $R^2$? Hopefully no outliers! (#3)

Normal probability plot: Are the residuals approximately Normally distributed? Do the dots fall fairly close to the diagonal line (which is always there in the same spot)? (#2)

[Normal P-P Plot of Regression Standardized Residual. Dependent Variable: corn yield (bushels). Axes: Observed Cum Prob (horizontal), Expected Cum Prob (vertical).]

Residual plot: Do you have constant variability? Do the dots on your residual plot look random and fairly evenly distributed above and below the 0 line? Hopefully no outliers! (#1 and 4)

Residual is the vertical difference between the observed y-value and the regression line y-value:

$$\text{residual} = e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i) = y_{\text{data}} - y_{\text{line}}$$

Residual plot: scatterplot of the regression residuals against the explanatory variable (e vs. x).
- The e-axis has both negative and positive values but is centered about e = 0.
- The mean of the least-squares residuals is always zero: $\bar{e} = 0$.
- Good: total randomness, no pattern, approximately the same number of points above and below the e = 0 line.
- Bad: obvious pattern, funnel shape, parabola, more points above 0 than below (or vice versa).
- If you have a pattern, the model (line) may not fit your data well.
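Here is a short sketch of the residual calculation, again on made-up data rather than the corn data. It fits the least-squares line, computes each residual as observed y minus predicted y, and checks that the residuals average out to zero. The function name `residuals_from_fit` is hypothetical.

```python
from statistics import mean


def residuals_from_fit(xs, ys):
    """Fit the least-squares line and return (b0, b1, list of residuals e_i = y_i - y-hat_i)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    return b0, b1, residuals


# Made-up illustration data (an assumption, not taken from the notes).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, residuals = residuals_from_fit(x, y)
above = sum(1 for e in residuals if e > 0)
below = sum(1 for e in residuals if e < 0)

print([round(e, 2) for e in residuals])
print("mean residual:", round(mean(residuals), 10))
print("points above 0:", above, "points below 0:", below)
```

Plotting these residuals against x gives the residual plot; the numbers alone only confirm that the residuals are centered at zero, not that they are patternless.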

Example: Show a residual plot for the corn/rain data using SPSS.

[Residual plot: Unstandardized Residual versus amount of rain (in)]

Outliers:
- Outliers are observations that lie outside the overall pattern of the other observations.
- Outliers in the y direction of a scatterplot have large regression residuals ($e_i$).
- Outliers in the x direction of a scatterplot are often influential for the regression line.
- An observation is influential if removing it would markedly change the result of the calculation.
- Outliers can drastically affect the regression line, correlation, means, and standard deviations.
- You can draw a second regression line that doesn't include the outliers. If the second line moves more than a small amount when the point is deleted, or if $R^2$ changes much, the point is influential.

Which hypothesis test do you use when? If you're not sure whether to use $\beta_1$ or ρ, here are some guidelines. The test statistics and P-values are identical for either symbol.
- Use $\beta_1$ if the words are: slope, regression coefficient.
- Use ρ if the words are: correlation, independence.
- Use either $\beta_1$ or ρ if the words are: linear relationship.
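Checking whether an outlier is influential means fitting the line twice, once with the point and once without, and comparing. The sketch below does this on made-up data in which the last point is an outlier in the x direction; the data and the helper name `fit_line` are assumptions for illustration only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept and slope."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Made-up data: the last point lies far from the others in the x direction.
x = [1, 2, 3, 4, 5, 12]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

b0_all, b1_all = fit_line(x, y)
b0_without, b1_without = fit_line(x[:-1], y[:-1])  # refit with the suspect point removed

print(f"with the point:    y-hat = {b0_all:.2f} + {b1_all:.2f} x")
print(f"without the point: y-hat = {b0_without:.2f} + {b1_without:.2f} x")
print(f"change in slope:   {b1_without - b1_all:.2f}")
```

In this made-up case the slope more than doubles once the point is dropped, so the point would be called influential. How big a change counts as "markedly" different is a judgment call.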

Review of SPSS instructions for Regression:
1. When you set up your regression, click on Analyze --> Regression --> Linear. Put in your y variable for "dependent" and your x variable for "independent" on the gray screen. Don't hit "ok" yet though.
2. On the regression gray screen, click on "Plots", and then click on "normal probability plot." Click "continue" on the Plots gray screen.
3. Back on the regression gray screen, click on "Save", and then click on "unstandardized residuals." Click "Individual" under the Prediction Interval section, and adjust the confidence level, if needed. Click "continue" on the Save gray screen and then "ok" on the big Regression gray screen.
4. The prediction interval and the residuals will show up back on the data input screen. The LICI_1 and UICI_1 columns give you the prediction interval lower and upper bounds.
5. You still won't have a residual plot yet. If you click back to your data input screen, you now have a new column called "RES_1". To make the residual plot, follow the same steps as for making a scatterplot: go to Graphs --> Scatter --> Simple, then put "RES_1" in for y and your x variable in for x. Click "ok."
6. Once you see your residual plot, double click on it to go to the Chart Editor. On the Chart Editor tool bar, there is a button that shows a graph with a horizontal line. Click on that button. Make sure that the line's position on the y-axis is set to 0.

Example: The scatterplot below shows the calories and sodium content for each of 7 brands of meat hot dogs.

[Scatterplot: Sodium content versus Calories]

a) Describe the main features of the relationship.

b) What is the correlation between calories and sodium?

[SPSS output: Correlations. Pearson Correlation between Calories and Sodium content = .863(**), Sig. (2-tailed) = .000, N = 7. ** Correlation is significant at the 0.01 level (2-tailed).]

c) Report the least-squares regression line.

[SPSS output: Model Summary (R = .863, with R Square, Adjusted R Square, and Std. Error of the Estimate). Predictors: (Constant), Calories. Dependent Variable: Sodium content.]

[SPSS output: Coefficients (B, Std. Error, Beta, t, Sig., and 95% Confidence Interval for B, for (Constant) and Calories). Dependent Variable: Sodium content.]

d) Show a residual plot and comment on its features.

[Residual plot: residuals versus Calories]

e) Is there an outlier? If so, where is it?

f) Show a Normal probability plot and comment on its features.

[Normal P-P Plot of Regression Standardized Residual. Dependent Variable: Sodium content. Axes: Observed Cum Prob (horizontal), Expected Cum Prob (vertical).]

g) Leave off the outlier, and recalculate the correlation and another least-squares regression line. Is your outlier influential? Explain your answer.

[SPSS output: Correlations. Pearson Correlation between cal2 and sod2 = .834(**), Sig. (2-tailed) = .000, N = 6. ** Correlation is significant at the 0.01 level (2-tailed).]

[SPSS output: Model Summary (R = .834, with R Square, Adjusted R Square, and Std. Error of the Estimate). Predictors: (Constant), cal2. Dependent Variable: sod2.]

[SPSS output: Coefficients (B, Std. Error, Beta, t, Sig., and 95% Confidence Interval for B, for (Constant) and cal2). Dependent Variable: sod2.]

h) If there is a new brand of meat hot dog with 50 calories per frank, how many milligrams of sodium do you estimate that one of these hot dogs contains?


More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Review MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) All but one of these statements contain a mistake. Which could be true? A) There is a correlation

More information

An analysis method for a quantitative outcome and two categorical explanatory variables.

An analysis method for a quantitative outcome and two categorical explanatory variables. Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

More information

Elementary Statistics Sample Exam #3

Elementary Statistics Sample Exam #3 Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to

More information

17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II 17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

Scatter Plots with Error Bars

Scatter Plots with Error Bars Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each

More information

Correlation and Regression Analysis: SPSS

Correlation and Regression Analysis: SPSS Correlation and Regression Analysis: SPSS Bivariate Analysis: Cyberloafing Predicted from Personality and Age These days many employees, during work hours, spend time on the Internet doing personal things,

More information

Scatter Plot, Correlation, and Regression on the TI-83/84

Scatter Plot, Correlation, and Regression on the TI-83/84 Scatter Plot, Correlation, and Regression on the TI-83/84 Summary: When you have a set of (x,y) data points and want to find the best equation to describe them, you are performing a regression. This page

More information

AP STATISTICS REVIEW (YMS Chapters 1-8)

AP STATISTICS REVIEW (YMS Chapters 1-8) AP STATISTICS REVIEW (YMS Chapters 1-8) Exploring Data (Chapter 1) Categorical Data nominal scale, names e.g. male/female or eye color or breeds of dogs Quantitative Data rational scale (can +,,, with

More information

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries?

We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of full-time education earn higher salaries? Do

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

More information

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2 Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

More information

Pearson s Correlation

Pearson s Correlation Pearson s Correlation Correlation the degree to which two variables are associated (co-vary). Covariance may be either positive or negative. Its magnitude depends on the units of measurement. Assumes the

More information

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations. EXCEL Tutorial: How to use EXCEL for Graphs and Calculations. Excel is powerful tool and can make your life easier if you are proficient in using it. You will need to use Excel to complete most of your

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

More information

Introduction to Quantitative Methods

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

Dealing with Data in Excel 2010

Dealing with Data in Excel 2010 Dealing with Data in Excel 2010 Excel provides the ability to do computations and graphing of data. Here we provide the basics and some advanced capabilities available in Excel that are useful for dealing

More information

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

More information

A full analysis example Multiple correlations Partial correlations

A full analysis example Multiple correlations Partial correlations A full analysis example Multiple correlations Partial correlations New Dataset: Confidence This is a dataset taken of the confidence scales of 41 employees some years ago using 4 facets of confidence (Physical,

More information

The Big Picture. Correlation. Scatter Plots. Data

The Big Picture. Correlation. Scatter Plots. Data The Big Picture Correlation Bret Hanlon and Bret Larget Department of Statistics Universit of Wisconsin Madison December 6, We have just completed a length series of lectures on ANOVA where we considered

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1. Relationships between two numerical variables Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

More information

Using Excel for Statistical Analysis

Using Excel for Statistical Analysis Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

FREE FALL. Introduction. Reference Young and Freedman, University Physics, 12 th Edition: Chapter 2, section 2.5

FREE FALL. Introduction. Reference Young and Freedman, University Physics, 12 th Edition: Chapter 2, section 2.5 Physics 161 FREE FALL Introduction This experiment is designed to study the motion of an object that is accelerated by the force of gravity. It also serves as an introduction to the data analysis capabilities

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles. Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible

More information

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate 1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Introduction to Linear Regression

Introduction to Linear Regression 14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression

More information

Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data

Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable

More information

Using Excel for inferential statistics

Using Excel for inferential statistics FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

More information

January 26, 2009 The Faculty Center for Teaching and Learning

January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Point Biserial Correlation Tests

Point Biserial Correlation Tests Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

More information

Pearson's Correlation Tests

Pearson's Correlation Tests Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

An SPSS companion book. Basic Practice of Statistics

An SPSS companion book. Basic Practice of Statistics An SPSS companion book to Basic Practice of Statistics SPSS is owned by IBM. 6 th Edition. Basic Practice of Statistics 6 th Edition by David S. Moore, William I. Notz, Michael A. Flinger. Published by

More information

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. 277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

More information

SPSS TUTORIAL & EXERCISE BOOK

SPSS TUTORIAL & EXERCISE BOOK UNIVERSITY OF MISKOLC Faculty of Economics Institute of Business Information and Methods Department of Business Statistics and Economic Forecasting PETRA PETROVICS SPSS TUTORIAL & EXERCISE BOOK FOR BUSINESS

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information