A correlation exists between two variables when one of them is related to the other in some way.

Lecture #10
Chapter 10: Correlation and Regression

The main focus of this chapter is to form inferences based on sample data that come in pairs. Given such paired sample data, we want to determine whether there is a relationship between the two variables and, if so, to identify what the relationship is. We call this relationship correlation.

10-2 Correlation

The main objective of this section is to analyze a collection of paired sample data (sometimes called bivariate data) and determine whether there appears to be a relationship between the two variables.

A correlation exists between two variables when one of them is related to the other in some way.

We can often see a relationship between two variables by constructing a graph called a scatterplot, or scatter diagram. A scatterplot is a graph in which the paired (x, y) sample data are plotted with a horizontal x-axis and a vertical y-axis. Each individual (x, y) pair is plotted as a single point.

Example 1: A sociologist conducted a study to determine whether there is a linear relationship between family income level (in thousands of dollars) and percent of income donated to charities. The data are listed in the table. Display the data in a scatterplot and determine the type of correlation.

Income level (in $1000s), x    42   48   50   59   65   72
Donating percent, y             9   10    8    5    6    3

Linear Correlation Coefficient

Interpreting correlation using a scatterplot can be subjective. A more precise way to measure the type and strength of a linear correlation between two variables is to calculate the linear correlation coefficient.

The linear correlation coefficient, r, measures the strength and the direction of the linear (straight-line) association between the paired x- and y-values in a sample:

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$

where n is the number of pairs of data. Round r to three decimal places. Use ρ (rho) for the population linear correlation coefficient.

Properties of the linear correlation coefficient r:

1. The value of r always satisfies -1 ≤ r ≤ 1.
2. The closer |r| is to 1, the closer the data points fall to a straight line, and the stronger the linear association. In this case, we conclude that there is a significant linear correlation between x and y.
3. If r is close to 0, we conclude that there is no significant linear correlation between x and y.
4. A positive correlation indicates a positive association, and a negative correlation indicates a negative association.
5. The value of the correlation does not depend on the units of the variables.

Example 2: Calculate the linear correlation coefficient for the income level and donating percent data given in Example 1.

Hypothesis testing for a population correlation coefficient

Once you have calculated the sample linear correlation coefficient, r, you will want to determine whether the population linear correlation coefficient, ρ, is significant.
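As an illustration of Example 2, the following is a minimal Python sketch that applies the computational formula for r to the Example 1 data. The function name `linear_correlation` and the variable names are assumptions made for this sketch; they do not come from the lecture notes.

```python
from math import sqrt

# Data from Example 1: income level (in $1000s) and percent of income donated.
x = [42, 48, 50, 59, 65, 72]
y = [9, 10, 8, 5, 6, 3]


def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r (hypothetical helper name)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


r = linear_correlation(x, y)
print(round(r, 3))  # -0.916
```

The value is close to -1, which suggests a strong negative linear association: in this sample, higher income levels go with lower donating percentages.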

You can do this by performing a hypothesis test. A hypothesis test for ρ can be one-tailed or two-tailed. The null and alternative hypotheses for these tests are as follows.

Two-tailed test:
H0: ρ = 0 (no significant correlation)
H1: ρ ≠ 0 (significant correlation)

Left-tailed test:
H0: ρ = 0 (no significant correlation)
H1: ρ < 0 (significant negative correlation)

Right-tailed test:
H0: ρ = 0 (no significant correlation)
H1: ρ > 0 (significant positive correlation)

The t-test for the correlation coefficient

A t-test can be used to test whether the correlation between two variables is significant. The test statistic is

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

When H0 is true, this test statistic follows a t-distribution with n - 2 degrees of freedom.

Guidelines: Using the t-test for the correlation coefficient

1. State H0 and H1.
2. Specify the significance level α.
3. Determine the degrees of freedom: d.f. = n - 2.
4. Find the critical value(s) from Table A-3 with n - 2 degrees of freedom and identify the rejection region(s).
5. Find the test statistic.

6. Make a decision to reject or fail to reject the null hypothesis. If the test statistic t is in the rejection region, reject H0. Otherwise, fail to reject H0.
7. Interpret the decision. If H0 is rejected, conclude that there is a significant linear correlation. If H0 is not rejected, conclude that there is not sufficient evidence of a linear correlation.

Example 3: In Example 2, we used the data to find r. Test the significance of this correlation coefficient. Use α = 0.05.

10-3 Regression

The main objective of this section is to describe the relationship between two variables by finding the graph and equation of the straight line that represents the relationship. This straight line is called the regression line (or line of best fit, or least-squares line), and its equation is called the regression equation.

Given a collection of paired sample data, the regression equation algebraically describes the relationship between the two variables.

Equation of the regression line:

$$\hat{y} = b_0 + b_1 x$$

Formulas:

slope: $$b_1 = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$

y-intercept: $$b_0 = \bar{y} - b_1 \bar{x}$$

Round b0 and b1 to three significant digits.

Example 4: Find the equation of the regression line for the income level and donating percent data used in Examples 1, 2, and 3.
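Examples 3 and 4 can be sketched in Python as follows. This is a hedged illustration, not the lecture's worked solution. It assumes a two-tailed test, and the critical value 2.776 (α = 0.05, 4 degrees of freedom) is an assumption read from a standard t table and hard-coded here. The slope and intercept use the standard least-squares formulas, and all variable names are hypothetical.

```python
from math import sqrt

# Data from Example 1: income level (in $1000s) and percent of income donated.
x = [42, 48, 50, 59, 65, 72]
y = [9, 10, 8, 5, 6, 3]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sxy = n * sum(a * b for a, b in zip(x, y)) - sum_x * sum_y  # -822
sxx = n * sum(a * a for a in x) - sum_x ** 2                # 3852
syy = n * sum(b * b for b in y) - sum_y ** 2                # 209

# Example 3: t-test for the correlation coefficient (two-tailed, alpha = 0.05).
r = sxy / sqrt(sxx * syy)
t = r / sqrt((1 - r ** 2) / (n - 2))
T_CRITICAL = 2.776  # assumed value from a t table, d.f. = n - 2 = 4
reject_h0 = abs(t) > T_CRITICAL

# Example 4: least-squares slope and y-intercept.
b1 = sxy / sxx
b0 = sum_y / n - b1 * (sum_x / n)

print(f"t = {t:.2f}, reject H0: {reject_h0}")
print(f"y-hat = {b0:.3g} + ({b1:.3g})x")
```

Under these assumptions the test statistic falls in the rejection region, so the sketch rejects H0, and the fitted line has a negative slope, consistent with the negative r.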

Applications of regression equations

After finding the equation of a regression line, you can use the equation to predict y-values over the range of the data.

Example 5: Use the equation in Example 4 to predict the expected donation percent for the following income levels (in $1000s).
a) 52
b) 69

Predicted values are meaningful only for x-values in (or close to) the range of the data.

Procedure for Predicting

1. Calculate the value of r and test the hypothesis that ρ = 0.
2. Is the hypothesis ρ = 0 rejected (so that there is a linear correlation)?
   i) If the answer is yes, use the regression equation to make predictions: substitute the given x-value into the regression equation.
   ii) If the answer is no, the best predicted value is the sample mean of the other variable. (If there is not a linear correlation, the best predicted y-value is the mean of the y-values.)
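The prediction procedure above can be sketched as one Python function. The name `best_prediction` and its parameters are hypothetical. The critical value is passed in by the caller, here the assumed two-tailed table value 2.776 for α = 0.05 and 4 degrees of freedom. The sketch does not handle a perfect correlation (r = ±1), where the test statistic is undefined.

```python
from math import sqrt


def best_prediction(x_new, xs, ys, t_critical):
    """Predict y at x_new following the two-step procedure (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    mean_x, mean_y = sum_x / n, sum_y / n
    sxy = n * sum(a * b for a, b in zip(xs, ys)) - sum_x * sum_y
    sxx = n * sum(a * a for a in xs) - sum_x ** 2
    syy = n * sum(b * b for b in ys) - sum_y ** 2

    # Step 1: compute r and the test statistic for H0: rho = 0.
    r = sxy / sqrt(sxx * syy)
    t = r / sqrt((1 - r ** 2) / (n - 2))

    # Step 2: use the regression equation only if H0 is rejected (two-tailed).
    if abs(t) > t_critical:
        b1 = sxy / sxx
        b0 = mean_y - b1 * mean_x
        return b0 + b1 * x_new
    return mean_y  # no significant linear correlation: predict the mean of y


# Data from Example 1; both x-values in Example 5 lie inside the data range.
x = [42, 48, 50, 59, 65, 72]
y = [9, 10, 8, 5, 6, 3]
pred_52 = best_prediction(52, x, y, t_critical=2.776)
pred_69 = best_prediction(69, x, y, t_critical=2.776)
print(round(pred_52, 2), round(pred_69, 2))
```

The sketch uses unrounded coefficients, so its predictions can differ slightly from values obtained by substituting into an equation whose coefficients were first rounded to three significant digits.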