PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Objectives:
- Understand linear regression with a single predictor
- Understand how we assess the fit of a regression model (total sum of squares, model sum of squares, residual sum of squares, F, R²)
- Understand how to conduct a simple linear regression using SPSS
- Understand how to interpret a regression model

Regression is a way of predicting the value of one variable from another. It is a hypothetical model of the relationship between two variables, and the model used is a linear one. Therefore, we describe the relationship using the equation of a straight line.

The regression equation: Y = A + BX + e

- A is the Y-intercept (the value of Y when X = 0): the point at which the regression line crosses the Y-axis (ordinate).
- B is the regression coefficient for the predictor: the gradient (slope) of the regression line, which reflects the direction and strength of the relationship.
- e is the error term for the model.

This scatterplot shows the association between the size of a spider and the anxiety response it elicited in participants. The regression line shows the best-fitting line (i.e., the line that minimizes the distance between itself and the data points). The vertical lines show the differences (or residuals) between the actual data and the predicted values.
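To make the equation concrete, here is a minimal Python sketch (not from the slides; the spider-size and anxiety values are hypothetical) that estimates A and B by ordinary least squares and computes the residuals e:

```python
import numpy as np

# Hypothetical data: spider size (cm) and anxiety rating for six participants
size = np.array([2.0, 4.5, 6.0, 8.5, 10.0, 12.5])
anxiety = np.array([20.0, 31.0, 35.0, 44.0, 52.0, 61.0])

# np.polyfit with degree 1 returns the least-squares slope (B) and intercept (A)
B, A = np.polyfit(size, anxiety, 1)

predicted = A + B * size
residuals = anxiety - predicted  # the e term: actual minus predicted values

print(f"A (intercept) = {A:.2f}, B (slope) = {B:.2f}")
print("Residuals:", np.round(residuals, 2))
```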

The regression line is only a model based on the data, and this model might not reflect reality. We need some way of testing how well the model fits the observed data. How? By partitioning the variability:

- SS_T (total sum of squares): the total variance in the data; the variability between the scores and the mean.
- SS_M (model sum of squares): the improvement due to the model; the difference in variability between the model and the mean.
- SS_R (residual sum of squares): the error in the model; the variability between the regression model and the actual data.

If the model results in better prediction than using the mean, then we expect SS_M to be much greater than SS_R.
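A short sketch of this partition, reusing the hypothetical spider data from above; for least-squares regression the identity SS_T = SS_M + SS_R holds exactly:

```python
import numpy as np

size = np.array([2.0, 4.5, 6.0, 8.5, 10.0, 12.5])
anxiety = np.array([20.0, 31.0, 35.0, 44.0, 52.0, 61.0])
B, A = np.polyfit(size, anxiety, 1)
predicted = A + B * size

ss_total = np.sum((anxiety - anxiety.mean()) ** 2)    # scores vs. the mean
ss_model = np.sum((predicted - anxiety.mean()) ** 2)  # model vs. the mean
ss_resid = np.sum((anxiety - predicted) ** 2)         # model vs. the actual data

print(f"SS_T = {ss_total:.2f}")
print(f"SS_M + SS_R = {ss_model + ss_resid:.2f}")  # equal up to rounding error
```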

Mean squares: sums of squares are total values, but they can be expressed as averages called mean squares (MS). To get an MS, the SS is divided by its corresponding degrees of freedom:

- MS_M = SS_M / df_M, where df_M is the number of predictor variables in the model.
- MS_R = SS_R / df_R, where df_R is the number of observations minus the number of parameters being estimated (i.e., the number of regression coefficients, including the constant).

R² is the proportion of variance accounted for by the regression model.

Example: a record company boss was interested in predicting record sales from advertising. Data: 200 different album releases. Outcome variable: sales (CDs and downloads) in the week after release. Predictor variable: the amount (in £s) spent promoting the record before release.
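Continuing the same hypothetical sketch, the mean squares, the F ratio, and R² follow directly from the sums of squares and their degrees of freedom (one predictor, so df_M = 1 and df_R = n - 2):

```python
import numpy as np

size = np.array([2.0, 4.5, 6.0, 8.5, 10.0, 12.5])
anxiety = np.array([20.0, 31.0, 35.0, 44.0, 52.0, 61.0])
B, A = np.polyfit(size, anxiety, 1)
predicted = A + B * size

ss_total = np.sum((anxiety - anxiety.mean()) ** 2)
ss_model = np.sum((predicted - anxiety.mean()) ** 2)
ss_resid = np.sum((anxiety - predicted) ** 2)

df_model = 1                 # number of predictor variables in the model
df_resid = len(anxiety) - 2  # n minus the two estimated coefficients (A and B)

ms_model = ss_model / df_model  # MS_M
ms_resid = ss_resid / df_resid  # MS_R

F = ms_model / ms_resid          # compares the model against the mean alone
r_squared = ss_model / ss_total  # proportion of variance accounted for

print(f"F({df_model}, {df_resid}) = {F:.2f}, R^2 = {r_squared:.3f}")
```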

Simple linear regression has a number of distributional assumptions, which can be examined by looking at a residual plot (i.e., a plot showing the differences between the obtained and predicted values of the criterion variable):

1. The residuals should be normally distributed around the predicted criterion scores (normality assumption).
2. The residuals should have a horizontal-line relationship with the predicted criterion scores (linearity assumption).
3. The variance of the residuals should be the same for all predicted criterion scores (homoscedasticity assumption).

Plots of predicted values of the criterion variable (Y) against residuals:

A. Shows that the assumptions of regression are met.
B. Shows that the data fail to meet the normality assumption.
C. Shows that the data fail to meet the linearity assumption.
D. Shows that the data fail to meet the homoscedasticity assumption.
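A minimal matplotlib sketch of such a residual plot, again using the hypothetical spider data; the assumptions look tenable when the points form a horizontal, evenly spread band centered on zero:

```python
import numpy as np
import matplotlib.pyplot as plt

size = np.array([2.0, 4.5, 6.0, 8.5, 10.0, 12.5])
anxiety = np.array([20.0, 31.0, 35.0, 44.0, 52.0, 61.0])
B, A = np.polyfit(size, anxiety, 1)
predicted = A + B * size
residuals = anxiety - predicted

plt.scatter(predicted, residuals)
plt.axhline(0, linestyle="--")  # residuals should be centered on zero
plt.xlabel("Predicted criterion scores")
plt.ylabel("Residuals")
plt.title("Residual plot for checking regression assumptions")
plt.show()
```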


Reading the SPSS Model Summary table:

- R reflects the degree of association between our regression model and the criterion variable.
- R² represents the amount of variance in the criterion variable that is explained by the model (SS_M) relative to how much variance there was to explain in the first place (SS_T). To express this value as a percentage, multiply it by 100; this gives the percentage of variation in the criterion variable that can be explained by the model.
- Adjusted R² corrects R² for the number of predictor variables included in the model. It will always be less than or equal to R². It is a more conservative estimate of model fit because it penalizes researchers for including predictor variables that are not strongly associated with the criterion variable.
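The slides do not give the adjustment formula, but the standard one (the adjustment SPSS reports) is Adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k is the number of predictors. A one-function sketch, applied to the record-sales example (β = 0.578, so R² = 0.578² ≈ .334, with n = 200 and k = 1):

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Shrink R^2 for sample size n and number of predictors k."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Record-sales example: one predictor, 200 albums
print(adjusted_r_squared(0.578 ** 2, n=200, k=1))  # ~0.331, just below R^2
```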

- The standard error of the estimate is a measure of the accuracy of predictions: the larger the standard error of the estimate, the more error in our regression model.

The ANOVA table reports SS_M (with MS_M), SS_R (with MS_R), and SS_T. The ANOVA tells us whether the model, overall, results in a significantly good degree of prediction of the outcome variable. However, the ANOVA does not tell us about the individual contribution of variables in the model (although in this simple case there is only one variable in the model, which allows us to infer that this variable is a good predictor).

In the coefficients table, the constant is the Y-intercept (i.e., the place where the regression line intercepts the Y-axis). This means that when £0 is spent on advertising, the model predicts that 134,140 records will be sold (the unit of measurement is thousands of records).
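Outside of SPSS, the same set of output (model summary, ANOVA, and the coefficient table with its t-tests) can be reproduced in Python with statsmodels; a sketch using simulated stand-in data, since the original album dataset is not included here:

```python
import numpy as np
import statsmodels.api as sm

# Simulated stand-ins: advertising in £1000s, sales in 1000s of records
rng = np.random.default_rng(seed=1)
adverts = rng.uniform(0, 2000, size=200)
sales = 134.14 + 0.09612 * adverts + rng.normal(0, 65, size=200)

X = sm.add_constant(adverts)    # adds the constant (Y-intercept) term
model = sm.OLS(sales, X).fit()  # ordinary least squares regression

print(model.summary())  # R^2, adjusted R^2, F test, coefficients and t-tests
```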

- This t-test compares the value of the Y-intercept with 0. If it is significant, then the value of the Y-intercept (134.14 in this example) is significantly different from 0.
- The unstandardized slope (or gradient) coefficient is the change in the outcome associated with a one-unit change in the predictor variable. This means that for every 1-point increase in the advertising budget (unit of measurement: thousands of pounds), the model predicts an increase of 96 records sold (unit of measurement for record sales: thousands of records).
- The standardized regression coefficient (β) represents the strength of the association between the predictor and the criterion variable. If there is only one predictor, then β is equal to the Pearson product-moment correlation coefficient (the closer β is to +1 or -1, the better the prediction of Y from X [or of X from Y]).
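A quick numerical check of that equivalence, using the hypothetical spider data again: the standardized coefficient can be obtained by rescaling the unstandardized slope by the ratio of the standard deviations, and with one predictor it matches Pearson's r exactly:

```python
import numpy as np

size = np.array([2.0, 4.5, 6.0, 8.5, 10.0, 12.5])
anxiety = np.array([20.0, 31.0, 35.0, 44.0, 52.0, 61.0])

B, A = np.polyfit(size, anxiety, 1)
beta = B * size.std(ddof=1) / anxiety.std(ddof=1)  # standardized slope
r = np.corrcoef(size, anxiety)[0, 1]               # Pearson correlation

print(f"beta = {beta:.4f}, r = {r:.4f}")  # identical with a single predictor
```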

- This t-test compares the magnitude of the standardized regression coefficient (β) with 0. If it is significant, then the value of β (0.578 in this example) is significantly different from 0 (i.e., the predictor variable is significantly associated with the criterion variable).

Using the model to generate predictions:

Y = A + BX
Record Sales = A + B(Advertising Budget)
Record Sales = 134.14 + (0.09612 × Advertising Budget)

- Expected record sales with a £0 advertising budget: 134.14 + (0.09612 × 0) = 134.14, i.e., 134,140 records.
- Expected record sales with a £100,000 advertising budget: 134.14 + (0.09612 × 100) = 134.14 + 9.612 = 143.752, i.e., 143,752 records.
- Expected record sales with a £500,000 advertising budget: 134.14 + (0.09612 × 500) = 134.14 + 48.06 = 182.20, i.e., 182,200 records.
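These worked predictions are easy to script; a minimal sketch using the slide's coefficients (budget and sales both expressed in thousands):

```python
def predicted_sales(budget_thousands: float) -> float:
    """Record sales (in 1000s) predicted from advertising (in £1000s)."""
    return 134.14 + 0.09612 * budget_thousands

for budget in (0, 100, 500):  # i.e., £0, £100,000, and £500,000
    print(f"£{budget * 1000:,} -> {predicted_sales(budget):.3f} thousand records")
```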