Regression: Predicting the Future (PSYC 381 Statistics, 2/8/2016)


Regression
PSYC 381 Statistics, Arlo Clark-Foos

Regression: Predicting the Future
Examples: car insurance premiums (age, sex, car, driving history); WHO and avian flu (spread, poverty).

Regression vs. Correlation
Regression: prediction. Correlation: relationship.

Simple Linear Regression
A statistical tool that predicts an individual's score on the DV from the score on one IV. It uses a straight line: if we know x, we can find y.
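A minimal sketch of that distinction, using hypothetical data (the numbers and variable names below are illustrative, not from the lecture): the correlation summarizes the relationship, while the fitted regression line turns a known x into a predicted y.

```python
import numpy as np

# Hypothetical data: x = one IV, y = the DV we want to predict
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

# Correlation: describes the relationship between x and y
r = np.corrcoef(x, y)[0, 1]
print(f"correlation r = {r:.3f}")

# Regression: prediction -- fit a straight line, then plug in a new x
b, a = np.polyfit(x, y, 1)          # slope (b) and intercept (a) of the least-squares line
new_x = 4.5
print(f"Y-hat = {a:.2f} + {b:.2f}(X); predicted y at x = {new_x}: {a + b * new_x:.2f}")
```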

Linear Regression Using z Scores
A student knows they will miss X days of class. What can I tell them about their probable exam grade?

The prediction equation in z-score form:
ẑ_Y = (r_XY)(z_X)
ẑ_Y = the predicted z score on variable Y
r_XY = the correlation between X and Y
z_X = the z score for a raw score on variable X

Note: Predicted z scores for Y are smaller (i.e., closer to the mean) than the actual z scores for X; they are "regressing" to the mean.
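A short sketch of the z-score form of the prediction equation, using hypothetical days-missed and exam-grade numbers (an assumption for illustration). It also shows why the prediction regresses toward the mean: whenever |r| < 1, the predicted z score is closer to zero than z_X.

```python
import numpy as np

# Hypothetical data: days missed (X) and exam grade (Y)
x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([95, 90, 84, 80, 73, 70], dtype=float)

r_xy = np.corrcoef(x, y)[0, 1]      # correlation between X and Y

# A student who will miss 3 days: convert that raw X to a z score
z_x = (3 - x.mean()) / x.std()      # population SD, used as a descriptive here

# Predicted z score on Y: z-hat_Y = (r_XY)(z_X)
z_y_hat = r_xy * z_x
print(f"z_X = {z_x:.2f}, predicted z_Y = {z_y_hat:.2f}")

# Because |r_XY| <= 1, the predicted z score is never farther from the mean than z_X
assert abs(z_y_hat) <= abs(z_x)
```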

Regression to the Mean
The tendency of scores that are particularly high or low to drift toward the mean over time. Examples: teaching in Air Force training, good and bad days flying, operant conditioning (reward vs. punishment).

Predicted z score to predicted raw score:
Ŷ = ẑ_Y(SD_Y) + M_Y

Creating a Regression Line
The familiar line y = m(x) + b becomes Ŷ = a + b(X), where:
a = intercept, the value of Y when X = 0
b = slope, the amount of increase in Y for every increase of 1 in X
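Continuing the same hypothetical numbers, a sketch of converting a predicted z score back to a predicted raw score, Ŷ = ẑ_Y(SD_Y) + M_Y, alongside the raw-score line Ŷ = a + b(X); the two routes give the same prediction.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)        # hypothetical IV (days missed)
y = np.array([95, 90, 84, 80, 73, 70], dtype=float)  # hypothetical DV (exam grade)

r_xy = np.corrcoef(x, y)[0, 1]

# Predicted z score on Y for a raw X of 3, then back to a raw predicted score
z_x = (3 - x.mean()) / x.std()
z_y_hat = r_xy * z_x
y_hat = z_y_hat * y.std() + y.mean()                  # Y-hat = z-hat_Y(SD_Y) + M_Y
print(f"predicted raw score via z scores: {y_hat:.2f}")

# The same prediction written as the raw-score line, Y-hat = a + b(X); values should match
b, a = np.polyfit(x, y, 1)
print(f"Y-hat = {a:.2f} + {b:.2f}(X) -> {a + b * 3:.2f}")
```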

Calculating the Intercept (a)
1. Calculate a z score for X = 0: z_X = (0 − M_X) / SD_X
2. Calculate the predicted z score for Y: ẑ_Y = (r_XY)(z_X)
3. Convert the predicted z score to a predicted raw score: Ŷ = ẑ_Y(SD_Y) + M_Y

Calculating the Slope (b)
Repeat the steps for X = 1. Slope = rise / run = (y2 − y1) / (x2 − x1). How does Ŷ change as X goes from 0 to 1? If positive, the line goes up to the right; if negative, it goes down to the right.

Drawing a Regression Line
Calculate several pairs of Ŷ and X, plot them on your scatter plot, and draw a straight line through the points.

Standardized Slope (β)
Used when comparing regression equations for variables measured on different scales. β is the standardized version of the slope in a regression equation (in standard deviation units):
β = b √(SS_X / SS_Y)
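A sketch of those steps with the same hypothetical numbers: get Ŷ at X = 0 (the intercept a), get Ŷ at X = 1 (so b is the rise over a run of 1), and then compute the standardized slope β = b √(SS_X / SS_Y).

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([95, 90, 84, 80, 73, 70], dtype=float)
r_xy = np.corrcoef(x, y)[0, 1]

def predict_raw(x_value):
    """Steps 1-3: z score for X, predicted z score for Y, back to a raw Y-hat."""
    z_x = (x_value - x.mean()) / x.std()
    z_y_hat = r_xy * z_x
    return z_y_hat * y.std() + y.mean()

a = predict_raw(0)                    # intercept: Y-hat when X = 0
b = predict_raw(1) - predict_raw(0)   # slope: change in Y-hat as X goes from 0 to 1
print(f"Y-hat = {a:.2f} + {b:.2f}(X)")

# Standardized slope: beta = b * sqrt(SS_X / SS_Y)
ss_x = np.sum((x - x.mean()) ** 2)
ss_y = np.sum((y - y.mean()) ** 2)
beta = b * np.sqrt(ss_x / ss_y)
print(f"beta = {beta:.3f}  (in simple regression this equals r = {r_xy:.3f})")
```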

Errors in Prediction
Example: predicting the cost of moving to MI from GA. Budgeted: truck rental, gas, hotels. Oops: pet fee at hotels, food on the way up, furniture pads for the truck.

Standard Error of the Estimate
A statistic indicating the typical distance between the regression line and the actual data points.

Effect Size of Regression: Proportionate Reduction in Error (r²)
Also known as the coefficient of determination: a statistic that quantifies how much more accurate our predictions are when we use the regression line instead of the mean as a prediction tool. Goal: how accurate is our regression equation at predicting the future?

SS_Total
The total error we have if we use only the mean to predict:
SS_Total = Σ(Y − M_Y)²
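A small sketch of SS_Total = Σ(Y − M_Y)² with the same hypothetical exam grades. The lecture defines the standard error of the estimate only verbally, so the formula used below, √(SS_Error / (N − 2)), is one common convention and should be treated as an assumption.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)         # hypothetical IV
y = np.array([95, 90, 84, 80, 73, 70], dtype=float)   # hypothetical DV

# SS_Total: total error if we predict every Y with the mean of Y
ss_total = np.sum((y - y.mean()) ** 2)
print(f"SS_Total = {ss_total:.2f}")

# Standard error of the estimate: "typical distance" of points from the line.
# Assumed formula (not given in the lecture): sqrt(SS_Error / (N - 2))
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x
ss_error = np.sum((y - y_hat) ** 2)
see = np.sqrt(ss_error / (len(y) - 2))
print(f"standard error of the estimate ~ {see:.2f}")
```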

SS_Error
The total error we have if we use Ŷ from the regression equation instead of the mean:
SS_Error = Σ(Y − Ŷ)²
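A sketch of SS_Error = Σ(Y − Ŷ)² on the same hypothetical data, set next to SS_Total so the reduction in error from using the line instead of the mean is visible.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([95, 90, 84, 80, 73, 70], dtype=float)

b, a = np.polyfit(x, y, 1)               # regression line Y-hat = a + b(X)
y_hat = a + b * x

ss_total = np.sum((y - y.mean()) ** 2)   # error using only the mean to predict
ss_error = np.sum((y - y_hat) ** 2)      # error using Y-hat from the regression equation
print(f"SS_Total = {ss_total:.2f}, SS_Error = {ss_error:.2f}")
```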

Proportionate Reduction in Error
r² = (SS_Total − SS_Error) / SS_Total
This is the amount of variance in the DV that is explained by the IV (the proportion of variance accounted for).

Multiple Regression & R²
Y′_i = b_0 + b_1(X_1i) + b_2(X_2i)
Using several variables to predict future scores.

Orthogonal Variable
An IV that makes a separate and distinct contribution to the prediction of a DV.

Stepwise Multiple Regression
Software determines the order in which IVs are included in the regression equation; the largest significant r² comes first.
Pros: good if we have no strong theory behind our predictions.
Cons: may ignore nonorthogonal (overlapping) variables, implying they are unimportant.
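A sketch of r² = (SS_Total − SS_Error) / SS_Total, followed by a minimal ordinary-least-squares fit of Y′ = b0 + b1X1 + b2X2. The second predictor and all data values are hypothetical, added only to make the multiple-regression equation concrete.

```python
import numpy as np

x1 = np.array([0, 1, 2, 3, 4, 5], dtype=float)        # hypothetical IV 1 (days missed)
y  = np.array([95, 90, 84, 80, 73, 70], dtype=float)  # hypothetical DV (exam grade)

b, a = np.polyfit(x1, y, 1)
y_hat = a + b * x1
ss_total = np.sum((y - y.mean()) ** 2)
ss_error = np.sum((y - y_hat) ** 2)

# Proportionate reduction in error (coefficient of determination)
r_squared = (ss_total - ss_error) / ss_total
print(f"r^2 = {r_squared:.3f}")

# Multiple regression: Y' = b0 + b1*X1 + b2*X2, fit by ordinary least squares
x2 = np.array([8, 7, 7, 6, 5, 4], dtype=float)          # hypothetical IV 2 (hours of sleep)
design = np.column_stack([np.ones_like(x1), x1, x2])    # the column of 1s gives the intercept b0
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
b0, b1, b2 = coefs
print(f"Y' = {b0:.2f} + {b1:.2f}(X1) + {b2:.2f}(X2)")
```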

Hierarchical Multiple Regression
The researcher uses theory to determine the order in which IVs are included in the regression equation. Example from PSYC 465: age, gender, sleep, depression.
Pros: based on theory, so it is less likely to identify bad predictors by accident.
Cons: sometimes our theory is lacking.
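A sketch of the hierarchical idea: predictors are entered in a theory-chosen order and R² is compared after each block. The variables (age, sleep, a depression score) and all data values are hypothetical, and R² is computed from the fitted model at each step.

```python
import numpy as np

# Hypothetical data: predict a depression score from age, then add sleep in a second block
age   = np.array([19, 21, 24, 30, 35, 42, 50, 55], dtype=float)
sleep = np.array([ 8,  7,  7,  6,  6,  5,  5,  4], dtype=float)
dep   = np.array([ 4,  5,  6,  8,  9, 11, 12, 14], dtype=float)

def r_squared(design, y):
    """Fit OLS and return R^2 = (SS_Total - SS_Error) / SS_Total."""
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    y_hat = design @ coefs
    ss_total = np.sum((y - y.mean()) ** 2)
    ss_error = np.sum((y - y_hat) ** 2)
    return (ss_total - ss_error) / ss_total

ones = np.ones_like(dep)
step1 = r_squared(np.column_stack([ones, age]), dep)         # block 1: age only
step2 = r_squared(np.column_stack([ones, age, sleep]), dep)  # block 2: age + sleep
print(f"R^2 with age only: {step1:.3f}")
print(f"R^2 after adding sleep: {step2:.3f} (change = {step2 - step1:.3f})")
```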