Pearson's Correlation Coefficient




In this lesson, we will find a quantitative measure to describe the strength of a linear relationship (instead of using the terms "strong" or "weak"). A quantitative measure is important when comparing sets of data.

The strength of a linear relationship is an indication of how closely the points in a scatter diagram fit a straight line. A measure of this strength is given by a correlation coefficient. There are several ways that this correlation coefficient can be found. We will be using Pearson's product-moment correlation coefficient, which is shortened to Pearson's correlation coefficient. It is represented by r.

Important properties of r:
1. r does not depend on the units or on which variable is chosen as x or y.
2. r always lies in the range [−1, 1].
3. A positive r indicates a positive association between the variables. A negative r indicates a negative association between the variables.
4. A perfect linear correlation occurs if r = 1 or r = −1, that is, when the scatter plot points lie on a straight line.
5. r only measures the strength of a linear association between two variables. Thus, if there is no evidence that a linear relationship exists, then it does not make sense to calculate r, because r would not be meaningful.
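Properties 1, 2, and 4 can be checked numerically. The following is a minimal sketch, not part of the original lesson: the helper `pearson_r` uses the usual sum-of-deviations formula for r, and the height and mass figures are made-up illustrative data.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from sums of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (assumption): heights in centimetres, masses in kilograms.
heights_cm = [150, 160, 165, 172, 180]
masses_kg = [52, 58, 63, 70, 77]

r = pearson_r(heights_cm, masses_kg)

# Property 1: changing units (cm -> m) or swapping x and y leaves r unchanged.
heights_m = [h / 100 for h in heights_cm]
r_units = pearson_r(heights_m, masses_kg)
r_swapped = pearson_r(masses_kg, heights_cm)

# Property 4: points lying exactly on a line give r = 1 or r = -1.
r_line_up = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])    # y = 2x + 1
r_line_down = pearson_r([1, 2, 3, 4], [9, 7, 5, 3])  # y = -2x + 11

print(r, r_units, r_swapped, r_line_up, r_line_down)
```

The three values of r for the height and mass data should agree up to rounding error, and the two straight-line data sets should give 1 and −1.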

The properties of r and the corresponding scatter plots can be summarized as follows:

[Figure: scatter plots illustrating r = +1, r = +0.9, r = +0.5, r = 0, r = −1, r = −0.9, r = −0.5, and r = 0]

The following table provides a good indication of the qualitative description of the strength of the linear relationship and the corresponding quantitative value of r.

Value of r          Qualitative Description of the Strength
−1                  perfect negative
(−1, −0.75)         strong negative
(−0.75, −0.5)       moderate negative
(−0.5, −0.25)       weak negative
(−0.25, 0.25)       no linear association
(0.25, 0.5)         weak positive
(0.5, 0.75)         moderate positive
(0.75, 1)           strong positive
1                   perfect positive

Just because two variables are highly correlated does not mean that one necessarily causes the other. Remember, for a causal relationship one variable has to be consistently dependent on the other. Just ask yourself: would changing x bring about a change in y?
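The table can be turned into a small lookup function. This is only a sketch: the function name `describe_strength` is hypothetical, and because the table uses open intervals, it does not say where a value exactly on a boundary (such as r = 0.75) belongs. Here such values are assigned to the weaker category, which is an assumption.

```python
def describe_strength(r):
    """Map a value of r to the qualitative description in the table.

    Assumption: a value exactly on an interval boundary (0.25, 0.5, 0.75)
    is placed in the weaker of the two neighbouring categories.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in the range [-1, 1]")
    size = abs(r)
    if size <= 0.25:
        return "no linear association"
    direction = "positive" if r > 0 else "negative"
    if size == 1.0:
        return "perfect " + direction
    if size > 0.75:
        return "strong " + direction
    if size > 0.5:
        return "moderate " + direction
    return "weak " + direction

for value in (-1, -0.6, 0.1, 0.3, 0.9507):
    print(value, describe_strength(value))
```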

For example, your fatigue during the summer months may be influenced by the warm temperature of the day. However, if you happen to experience a lot of fatigue on some other day, this will not indicate the temperature level of the day.

There are several ways to determine the value of r from a data set. Remember that you don't calculate Pearson's correlation coefficient unless you have first determined that it is appropriate. You should first use a scatter plot to establish whether the data indicate a linear relationship. If the data indicate that there is a possible linear relationship, then the calculation of the r value is appropriate.

To compute the value of r, we can use the following formulas:

\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} \]

or

\[ r = \frac{s_{xy}}{s_x s_y}, \qquad \text{equivalently} \qquad s_{xy} = r \, s_x s_y \]

Even though the first formula is a useful form for manually finding the value of r, it requires very tedious work. The GDC can produce a value of r with the push of a few buttons.

Example 1: Find the Pearson correlation coefficient for the following set of data.

x    2    3    4    6    8    9   10
y   21   19   18   17   15   13   12

First we input the data into the GDC and draw a scatter plot to determine if it is appropriate to use Pearson's correlation coefficient.

1. Clear the home screen by pressing <2nd><Mode>.
2. Press <2nd><0> to access the Catalog menu.
3. Locate DiagnosticOn.
4. Press <Enter><Enter>.
5. Press <Stat>, choose Calc, and then select 4: LinReg(ax+b) and enter L1,L2.
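As a cross-check on the calculator, the first formula can be evaluated directly for the Example 1 data. The sketch below uses plain Python rather than the GDC; the variable names are illustrative only.

```python
import math

# Example 1 data from the lesson.
x = [2, 3, 4, 6, 8, 9, 10]
y = [21, 19, 18, 17, 15, 13, 12]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Numerator: sum of products of deviations from the means.
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Denominator: square root of the product of the sums of squared deviations.
sum_xx = sum((xi - mean_x) ** 2 for xi in x)
sum_yy = sum((yi - mean_y) ** 2 for yi in y)

r = sum_xy / math.sqrt(sum_xx * sum_yy)
print(round(r, 3))
```

For these data the result is about −0.987, a strong negative association, which is consistent with y decreasing steadily as x increases.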

The screen should now display the value of r. There is also an r² on the calculator screen. We need to look at what this r² represents.

r² is called the coefficient of determination.

\[ r^2 = \frac{\text{explained variation}}{\text{total variation}} \]

r² is a proportion, whereas r is the square root of a proportion. r² is more informative when interpreting the magnitude of the relation between two variables, regardless of directionality. For two linearly related variables, r² provides the proportion of variation in one variable that can be explained by the variation in the other variable.

For example, r² = 0.947 or 94.7% means that approximately 95% of the variation in the variable y can be explained by the variation in the variable x. The higher this value is, the better. For example, if r = 0.6, then r² = 0.36 means that only 36% of the variation in one variable is explained by the variation in the other variable, so r = 0.6 is not really very good.

Example 2: The following table gives the final exam scores and the final averages for 10 students in a biology class.

Final Exam Scores    Final Averages
66                   73
81                   86
75                   68
62                   71
93                   95
82                   89
51                   48
54                   63
50                   51
90                   94

Enter the data into L1 and L2 in the GDC. When we plot the scatter diagram on the GDC, we find that it does appear reasonable to assume that a linear relationship exists. Therefore, calculating the value of r is appropriate. Remember, if there does not appear to be a linear relationship, then calculating the value of r does not make sense.

Next, we need to find r and r². Remember that to find r and r²:

1. Clear the home screen.
2. Go to the Catalog menu.
3. Turn on Diagnostics.
4. Press <Stat>, choose Calc, and select LinReg(ax+b).
5. Press <Enter>.

We see that r = 0.9507. This indicates a very strong positive relationship between the final exam grade and the final average grade. Also, r² = 0.9037953898 tells us that about 90% of the variation in the final average grade can be accounted for by the variation in the final exam grade. That is, only approximately 10% of the variation is attributed to other factors.

In the next example, the covariance, s_xy, is given. We are going to use the formula below to calculate the correlation coefficient r. Then, we are going to compare this to the coefficient r found using the graphing calculator.

\[ r = \frac{s_{xy}}{s_x s_y}, \qquad \text{equivalently} \qquad s_{xy} = r \, s_x s_y \]
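Returning to Example 2, the calculator's values of r and r² can be reproduced by hand, and r² can be compared with the ratio of explained to total variation. The sketch below is a plain-Python illustration under one assumption: "explained variation" is taken to be the variation of the least-squares line's predictions about the mean of y.

```python
import math

# Example 2 data: final exam scores (x) and final averages (y).
exam_scores = [66, 81, 75, 62, 93, 82, 51, 54, 50, 90]
final_averages = [73, 86, 68, 71, 95, 89, 48, 63, 51, 94]

n = len(exam_scores)
mean_x = sum(exam_scores) / n
mean_y = sum(final_averages) / n

sum_xy = sum((x - mean_x) * (y - mean_y)
             for x, y in zip(exam_scores, final_averages))
sum_xx = sum((x - mean_x) ** 2 for x in exam_scores)
sum_yy = sum((y - mean_y) ** 2 for y in final_averages)

r = sum_xy / math.sqrt(sum_xx * sum_yy)
r_squared = r ** 2

# Least-squares line y = intercept + slope * x, used to split the variation.
slope = sum_xy / sum_xx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in exam_scores]

explained = sum((p - mean_y) ** 2 for p in predicted)  # variation captured by the line
total = sum_yy                                         # total variation in y
ratio = explained / total

print(round(r, 4), round(r_squared, 4), round(ratio, 4))
```

The computed r and r² agree with the calculator output quoted above (r = 0.9507, r² ≈ 0.9038), and the explained-to-total ratio matches r².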

Example 3: The following table represents the final averages of ten students in math and science. Given that the covariance, s_xy = 465.66, calculate the product-moment correlation coefficient, r, correct to 2 decimal places.

First we need to make sure that there appears to be a linear relationship between math and science grades. We enter the grades in the GDC and use a scatter diagram to decide this.

Then, find the standard deviations for x and y. Use <Stat>, Calc, and then select 2-Var Stats L1,L2.

WARNING! We want s_x and s_y, but in the calculator we will use the values labelled σx and σy.

σx = ____        σy = ____

\[ r = \frac{s_{xy}}{s_x s_y} = \frac{465.66}{\sigma_x \, \sigma_y} \]

Now, use the GDC to check this value of r. Use <Stat>, Calc, and LinReg.
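The covariance formula can be tried out in code. Since the Example 3 grades are not reproduced in this text, the sketch below uses the Example 2 data as a stand-in (an assumption made purely for illustration). It follows the warning above by using population standard deviations, which `statistics.pstdev` computes, together with a covariance that also divides by n; the two divisors need to match for the formula to return r.

```python
import statistics

# Stand-in data (assumption): the Example 2 scores, not the Example 3 grades.
x = [66, 81, 75, 62, 93, 82, 51, 54, 50, 90]
y = [73, 86, 68, 71, 95, 89, 48, 63, 51, 94]

n = len(x)
mean_x = statistics.fmean(x)
mean_y = statistics.fmean(y)

# Population covariance: mean product of deviations (divides by n).
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

# Population standard deviations, the values a TI calculator labels σx and σy.
sigma_x = statistics.pstdev(x)
sigma_y = statistics.pstdev(y)

r = s_xy / (sigma_x * sigma_y)
print(round(s_xy, 2), round(sigma_x, 3), round(sigma_y, 3), round(r, 2))
```

With this stand-in data the formula gives r ≈ 0.95, the same value the regression output reports for Example 2. If the covariance were instead computed with an n − 1 divisor, the sample standard deviations would have to be used in its place.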

Comments about the findings: There is strong evidence from a scatter diagram that a positive linear relationship exists between the final averages for math and science. The value of the correlation coefficient is r = ____. This value of r indicates that there is a strong correlation between math and science final averages. This means that those students who do well in mathematics are likely to also do well in science. The value of r² is ____, which indicates that only ____% of the variation in the science averages can be attributed to the variation in the math averages.