Lecture 21c: Using SPSS for Regression and Correlation


Statistics 21c_SPSS.pdf
Michael Hallstone, Ph.D.
hallston@hawaii.edu

The purpose of this lecture is to illustrate how to create SPSS output for correlation and regression. You will notice that this document follows the order of the test questions for regression and correlation on the take-home exam.

Regression

Recall that simple regression/correlation requires two interval/ratio-level variables, so you will need two of these. We will use the data from lecture 21. I have two ratio-level variables: waveht = wave height in feet at Waimea Bay, and surfers = number of surfers in the water at Waimea Bay. Pretend the data come from randomly selecting observations from the lifeguards for 7 days in January and February of this winter (n=7).

Wave Ht.   # of surfers
 2          0
 5          2
 8         10
12         30
15         40
18         45
20         60

How to have SPSS do a scatter plot

Choose Simple Scatter and select Define.

It is VERY important that you do not mix up your two variables in this screen! Recall that the dependent variable (y) is the variable you wish to predict, and you will predict it using the independent variable (x). Use the arrows to put your dependent variable (y) [ours is number of surfers in the water] into the Y Axis box and the independent variable (x) [ours is height of waves] into the X Axis box, then push OK.

Below is the output of the scatter plot.

[Scatter plot: number of surfers in water (y axis) against height of waves (x axis)]

How to have SPSS produce simple linear regression output

Again, it is VERY important that you do not mix up your two variables in this screen! Move your dependent variable (y) into the Dependent box and your independent variable (x) into the Independent box, then push OK. Recall that our dependent variable is number of surfers and our independent variable is height of waves.

Below is the full output for regression.

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .985a   .969       .963                4.45003
a. Predictors: (Constant), height of waves

ANOVA(b)

Model           Sum of Squares   df   Mean Square   F         Sig.
1 Regression    3134.415         1    3134.415      158.282   .000a
  Residual        99.014         5      19.803
  Total         3233.429         6
a. Predictors: (Constant), height of waves
b. Dependent Variable: number of surfers in water

Coefficients(a)

                     Unstandardized Coefficients   Standardized Coefficients
Model                B          Std. Error         Beta                        t        Sig.
1 (Constant)         -12.102    3.514                                          -3.444   .018
  height of waves      3.396     .270              .985                        12.581   .000
a. Dependent Variable: number of surfers in water

TR value for steps 6 and 7

The TR value is the lower number under the t column: 12.581. The slope (b) = 3.396 is the lower number under the B column. S_b is the lower number under the Std. Error column: .270.

TR = (b - b_Ho) / S_b = (3.396 - 0) / 0.270 = 3.396 / 0.270 = 12.58

p-value for step 7

We find the p-value as the lower number under the Sig. column. If it is less than our level of significance (α), it is significant. If it is greater than our α, it IS NOT significant. (Our p-value is not exactly .000! SPSS ran out of decimal places!) Our p-value is < .001, which is less than .05, so it is SIGNIFICANT!!

Regression line or y-hat line equation

The values for the regression line are also found in the very bottom box of the regression output (Coefficients). The value next to (Constant) is the y-intercept, and the slope is directly below that. So for y-hat = mx + b, where m = slope and b = y-intercept (note that here b stands for the y-intercept, not the slope b used in the TR formula):

y-hat = 3.396x + (-12.102)

Because plus a minus is actually a minus, the correct equation for the line is

y-hat = 3.396x - 12.102

Finding r²

Look at the box labeled "Model Summary." For this example our r², or coefficient of determination, is under the R Square column: r² = .969 (0.969 x 100 = 96.9, or 96.9%). Finally, we can explain the r² value in English, relating it to our variables!!! In this case we would say that 96.9% of the total variation in the number of surfers in the water at Waimea Bay is explained by variations in wave height.

By the way, you can also find r in the SPSS output above: the R in the Model Summary box is the r, or coefficient of correlation. In this case r = .985, a very strong positive relationship.
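As a cross-check on the regression numbers above, here is a minimal Python sketch that recomputes the slope, y-intercept, S_b, TR, and r² directly from the seven observations using the usual least-squares formulas. It is not part of the SPSS procedure; the function name simple_regression and the dictionary keys are hypothetical names chosen for this illustration.

```python
import math

# Data from the lecture (n = 7).
waveht = [2, 5, 8, 12, 15, 18, 20]      # x: wave height in feet
surfers = [0, 2, 10, 30, 40, 45, 60]    # y: number of surfers in the water


def simple_regression(x, y):
    """Least-squares fit of y on x (hypothetical helper for this sketch)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)   # total sum of squares
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares = total minus the part explained by x.
    ss_residual = syy - slope * sxy
    r_squared = 1 - ss_residual / syy

    # Standard error of the slope (S_b), with n - 2 degrees of freedom.
    se_slope = math.sqrt(ss_residual / (n - 2) / sxx)

    # Test ratio for Ho: slope = 0.
    t_ratio = (slope - 0) / se_slope

    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "t_ratio": t_ratio,
        "r_squared": r_squared,
    }


fit = simple_regression(waveht, surfers)
for name, value in fit.items():
    print(f"{name:>10}: {value:.3f}")
```

Rounded to three decimals, the printed values should agree with the B, Std. Error, t, and R Square entries in the SPSS output. The p-value is left out because Python's standard library has no t-distribution function.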

What a correlation tests in plain English

Recall that correlations test for association between two variables. As noted on the first page of the main lecture, "In correlation we see how closely the two variables under examination are associated." All association means is that changes in one variable are associated with changes in another. For example, we might expect that the ability to drive a car is associated with the number of drinks one has. So in this case we would be seeing if changes in the height of waves at Waimea Bay are associated with changes in the number of surfers in the water at Waimea Bay.

To have SPSS create correlation output

While you can find all of this information in the SPSS regression output above, SPSS also has separate correlation output. Here is how to get it all in one place: move both of your variables over into the Variables box, then push OK.

Below is the full SPSS output for a correlation analysis:

Correlations

                                                   height of waves   number of surfers in water
height of waves              Pearson Correlation   1                 .985**
                             Sig. (2-tailed)                         .000
                             N                     7                 7
number of surfers in water   Pearson Correlation   .985**            1
                             Sig. (2-tailed)       .000
                             N                     7                 7
**. Correlation is significant at the 0.01 level (2-tailed).

The strength of r actually depends upon previous research, so I will give you a contrived or pretend rule that we will use for this class only. But remember: for the r, or coefficient of correlation, to be a meaningful number we must have a "significant correlation" (p < .05 or p < your alpha from step 2)! Recall that the sign of r matters! Ignoring the sign, the size of r tells you the strength:

0.00 < r < 0.33 = weak relationship
0.34 < r < 0.66 = moderate relationship
0.67 < r < 1.0 = strong relationship

In each case the mathematical sign (+ or -) determines whether it is a positive or negative relationship.

The R from the SPSS regression output above is the r, or coefficient of correlation. In this case r = .985. In the correlation output we can see r = .985 again and, most importantly, we can see the p-value for the correlation test in the "Sig. (2-tailed)" row. (Our p-value is not exactly .000! SPSS ran out of decimal places!) Our p-value is < .001, which is less than .05, so the correlation is SIGNIFICANT!! And because .98457 is very close to a perfect correlation of +1, we can say it represents "a very strong positive relationship."
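The correlation above can also be checked by hand. The following is a minimal Python sketch, not part of the SPSS procedure, that computes Pearson's r from the seven observations and then applies the class-only strength rule. The function names pearson_r and describe_strength are hypothetical, and the cutoffs used for values that fall in the gaps of the rule (for example between 0.33 and 0.34) are an assumption of this sketch.

```python
import math

# Data from the lecture (n = 7).
waveht = [2, 5, 8, 12, 15, 18, 20]      # wave height in feet
surfers = [0, 2, 10, 30, 40, 45, 60]    # number of surfers in the water


def pearson_r(x, y):
    """Pearson correlation coefficient (hypothetical helper for this sketch)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Label r using the class-only rule; the sign gives the direction.

    Assumption: values in the gaps of the rule are assigned by treating
    0.33 and 0.66 as the upper cutoffs for weak and moderate.
    """
    size = abs(r)
    if size == 0:
        return "no linear relationship"
    if size <= 0.33:
        strength = "weak"
    elif size <= 0.66:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


r = pearson_r(waveht, surfers)
print(f"r = {r:.5f} ({describe_strength(r)} relationship)")
```

The label is only meaningful when the correlation is significant, so check the Sig. (2-tailed) value in the SPSS output before reporting it.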