Simple Regression and Correlation

Similar documents
" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Brunswick High School has reinstated a summer math curriculum for students Algebra 1, Geometry, and Algebra 2 for the school year.

Simple Regression Theory II 2010 Samuel L. Baker

SPSS Guide: Regression Analysis

What does the number m in y = mx + b measure? To find out, suppose (x 1, y 1 ) and (x 2, y 2 ) are two points on the graph of y = mx + b.

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

Linear Equations. Find the domain and the range of the following set. {(4,5), (7,8), (-1,3), (3,3), (2,-3)}

Example: Boats and Manatees

1 Shapes of Cubic Functions

Hypothesis testing - Steps

Module 3: Correlation and Covariance

Simple linear regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Univariate Regression

Scatter Plot, Correlation, and Regression on the TI-83/84

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

2. Simple Linear Regression

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

Graphing Linear Equations in Two Variables

Describing Relationships between Two Variables

Pearson s Correlation

Regression Analysis: A Complete Example

Part Three. Cost Behavior Analysis

Algebra 1 If you are okay with that placement then you have no further action to take Algebra 1 Portion of the Math Placement Test

Chapter 7: Simple linear regression Learning Objectives

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Session 7 Bivariate Data and Analysis

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

The correlation coefficient

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

A synonym is a word that has the same or almost the same definition of

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Math 1526 Consumer and Producer Surplus

The Method of Partial Fractions Math 121 Calculus II Spring 2015

Method To Solve Linear, Polynomial, or Absolute Value Inequalities:

Regression and Correlation

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

Notes for EER #4 Graph transformations (vertical & horizontal shifts, vertical stretching & compression, and reflections) of basic functions.

What are the place values to the left of the decimal point and their associated powers of ten?

Determine If An Equation Represents a Function

Relationships Between Two Variables: Scatterplots and Correlation

Formula for linear models. Prediction, extrapolation, significance test against zero slope.

Homework 8 Solutions

Student Activity: To investigate an ESB bill

COLLEGE ALGEBRA. Paul Dawkins

Part 2: Analysis of Relationship Between Two Variables

DATA INTERPRETATION AND STATISTICS

Week TSX Index

ALGEBRA 2: 4.1 Graph Quadratic Functions in Standard Form

Part 1 : 07/27/10 21:30:31

Simple Linear Regression

Elements of a graph. Click on the links below to jump directly to the relevant section

How To Run Statistical Tests in Excel

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving

Algebraic expressions are a combination of numbers and variables. Here are examples of some basic algebraic expressions.

Chapter 13 Introduction to Linear Regression and Correlation Analysis

PEARSON R CORRELATION COEFFICIENT

Algebra 1 Course Title

Factoring Polynomials and Solving Quadratic Equations

Comparing Nested Models

Factors affecting online sales

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Graphing Rational Functions

Polynomial and Rational Functions

Simple Linear Regression Inference

CORRELATION ANALYSIS

Linear functions Increasing Linear Functions. Decreasing Linear Functions

POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Year 9 set 1 Mathematics notes, to accompany the 9H book.

Graphing Linear Equations

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Graphing - Slope-Intercept Form

Correlation key concepts:

1.3 LINEAR EQUATIONS IN TWO VARIABLES. Copyright Cengage Learning. All rights reserved.

Homework 11. Part 1. Name: Score: / null

Module 5: Multiple Regression Analysis

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

2 Sample t-test (unequal sample sizes and unequal variances)

Homework #1 Solutions

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

JUST THE MATHS UNIT NUMBER 1.8. ALGEBRA 8 (Polynomials) A.J.Hobson

table to see that the probability is (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: = 1.

The importance of graphing the data: Anscombe s regression examples

An analysis appropriate for a quantitative outcome and a single quantitative explanatory. 9.1 The model behind linear regression

Examples of Tasks from CCSS Edition Course 3, Unit 5

Springfield Technical Community College School of Mathematics, Sciences & Engineering Transfer

Chapter 7 - Roots, Radicals, and Complex Numbers

Study Guide for the Final Exam

Algebra 2 Year-at-a-Glance Leander ISD st Six Weeks 2nd Six Weeks 3rd Six Weeks 4th Six Weeks 5th Six Weeks 6th Six Weeks

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

Indiana State Core Curriculum Standards updated 2009 Algebra I

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

Zero: If P is a polynomial and if c is a number such that P (c) = 0 then c is a zero of P.

0.8 Rational Expressions and Equations

Section 1.1 Linear Equations: Slope and Equations of Lines

3.3. Solving Polynomial Equations. Introduction. Prerequisites. Learning Outcomes

Transcription:

Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas of simple regression and correlation. One reason why regression is powerful is that we can use it to demonstrate causality; that is, we can show that an independent variable causes a change in a dependent variable. Simple Regression and Correlation 1

Scattergrams The simplest thing we can do with two variables that we believe are related is to draw a scattergram. A scattergram is a simple graph that plots values of our dependent variable Y and independent variable X. Normally we plot our dependent variable on the vertical axis and the independent variable on the horizontal axis. Simple Regression and Correlation 2

Shilling for the incontinence industry For example, let s make a scattergram of the following data set: Individual No. of Sodas Consumed No. of Bathroom Trips Rick 1 2 Janice 2 1 Paul 3 3 Susan 3 4 Cindy 4 6 John 5 5 Donald 6 5 Simple Regression and Correlation 3

Eyeballing a regression line Just from our scattergram, we can sometimes get a fairly good idea of the relationship between our variables. In our current scattergram, it looks like a line that slopes up to the right would fit the data pretty well. Essentially, that s all of the math we have to do: figure out the best fit line, i.e. the line that represents an average of our data points. Note that sometimes our data won t be linearly related at all; sometimes, there may be a curvilinear or other nonlinear relationship. If it looks like the data are related, but the regression doesn t fit well, chances are this is the case. Simple Regression and Correlation 4

Simple linear regression While our scattergram gives us a fairly good idea of the relationship between the variables, and even some idea of how the regression line should look, we need to do the math to figure out exactly where it goes. To figure it out, first we need an idea of the general equation for a line. From algebra, any straight line can be described as: Y = a + bx, where a is the intercept and b is the slope Simple Regression and Correlation 5

Figuring out a and b The problem of regression in a nutshell is to figure out what values of a and b to use. To do that, we use the following two formulas: XY ( X)( Y ) n b = and a = ȳ b x X2 ( X) 2 n Again, this looks ugly, but it s all the same simple math we already know and love: just use PEMA, and we will get the right answer. Simple Regression and Correlation 6

Solving our example So, let s revisit our example data and figure out the slope and intercept for the regression line. Individual No. of Sodas Consumed No. of Bathroom Trips Rick 1 2 Janice 2 1 Paul 3 3 Susan 3 4 Cindy 4 6 John 5 5 Donald 6 5 Simple Regression and Correlation 7

Solving our example (cont d) b = = XY ( X)( Y ) n X2 ( X) 2 104 (624/7) 100 (576/7) n = 104 (24)(26) 7 100 (24) 2 = 104 89.142 100 82.285 = 14.86 17.71 = 0.8387 7 Now that we have calculated b, calculating a is pretty simple; we just solve a = (26/7) 0.8387(24/7) = 3.7142 (0.8387)(3.4285) = 3.7142 2.8754 = 0.8388. Simple Regression and Correlation 8

Pearson s r Now that we ve found a and b, we know the intercept and slope of the regression line, and it appears that X and Y are related in some way. But how strong is that relationship? That s where Pearson s r comes in. Pearson s r is a measure of correlation; sometimes, we just call it the correlation coefficient. r tells us how strong the relationship between X and Y is. Simple Regression and Correlation 9

Calculating Pearson s r The formula for Pearson s r is somewhat similar to the formula for the slope (b). It is as follows: XY ( X)( Y ) n r = X2 ( X) 2 /n Y 2 ( Y ) 2 /n We already know the numerator from calculating the slope earlier, so the only hard part is the denominator, where we have to calculate each square root separately, then multiply them together. Simple Regression and Correlation 10

Solving for our example So, from our example: r = = XY ( X)( Y ) n X2 ( X) 2 /n Y 2 ( Y ) 2 /n 104 (24)(26) 7 100 (24)2 /7 104 (26) 2 /7 = 14.8572 17.714319.4286 = 14.8572 (4.2088)(4.4077) = 14.8572 18.551 = 0.8008 Simple Regression and Correlation 11

Correlations and determinations A correlation coefficient of around.8 indicates that the two variables are fairly highly associated. If we square r, we get the coefficient of determination r 2, which tells us how much of the variance in Y is explained by X. In this case, r 2 =.6412, which means that we estimate that 64% of the variance is explained by X, while the remainder is due to error. The only other thing we might want to do is find out if the correlation is statistically significant. Or, to put it in terms of a null hypothesis, we want to test whether H 0 : r = 0 is true. Simple Regression and Correlation 12

Significance testing for t To test whether or not r is significantly different from zero, we use the t test for Pearson s r: n 2 t ob = r 1 r 2 Since this is like any other hypothesis test, we want to compare t ob with t crit. For this test, we use our given alpha level (conventionally,.05 or.01) and df = n 2. In this case, we subtract 2 from our sample size because we have two variables. So, with α =.05, is the correlation significant? Simple Regression and Correlation 13

Example significance test t ob = r n 2 7 2 5 1 r 2 =.8008 1.6412 =.8008.3588 =.8008 13.9353 = (.8008)(3.733) = 2.9893 Now, like in other significance tests, we find our critical value of t from the table (α =.05, df = 5 : 2.571) and compare it to the obtained value. Since 2.571 2.9893, we reject the null hypothesis and conclude that the correlation is statistically significant. Simple Regression and Correlation 14

Homework Most of the time when you see regressions, they ll be more complex than this example. Rather than testing the significance of r, with multiple explanatory variables we test the significance of each independent variable, but the principle is exactly the same. We ll talk more about this on next Monday. There is no class on Wednesday. You can spend the time working on your papers, studying for the final, or frolicking in the sunlight. If you want to practice this material, read through the second example and do question 2 on page 180. Simple Regression and Correlation 15