CH10: Correlation and Regression. CH10: Correlation and Regression Santorico - Page 410


Section 10-1: Paired Data and Scatter Plots

We are often interested in determining whether a relationship exists between two variables. To do this, we collect data consisting of two measurements that are paired with each other. One variable is the independent (explanatory) variable, x, and the other is the dependent (response) variable, y.

Examples: height and weight of individuals; maximum speed limit of each state versus number of car crash deaths per capita.

Once we've collected all the pairs of data, listed as (x, y), we can draw a graph to represent the data. This graph is called a scatter plot.

A scatter plot is a graph of ordered pairs of data values that is used to determine if a relationship exists between the two variables.

Drawing a scatter plot:
Step 1: Draw and label the x and y axes.
Step 2: Plot the points for the pairs of data.

Example: Create a scatter plot for the following data set.

Height    Hand span
71        23.5
69        22.0
66        18.5
64        20.5
71        21.0
72        24.0
67        19.5
65        20.5

How would you describe the above relationship?

Analyzing the Scatter Plot

A positive linear relationship exists when the points fall approximately in an ascending straight line from left to right. As the values of the x variable increase, the values of the y variable increase.

A negative linear relationship exists when the points fall approximately in a descending straight line from left to right. As the values of the x variable increase, the values of the y variable decrease.

A nonlinear relationship exists when the points fall along a curved line. The relationship is then described by the nature of the curve (e.g., quadratic, cubic, exponential).

No relationship exists when there is no discernible pattern in the points.

How Can We Summarize Strength of Association?

When the data points roughly follow a straight-line trend, the variables are said to have an approximately linear relationship. If we have a linear relationship, we can use the correlation coefficient to help determine the strength of the association.

Correlation Coefficient

The correlation coefficient, computed from the sample data, measures the strength and direction of a linear relationship between two variables. The symbol for the sample correlation coefficient is r, and the symbol for the population correlation coefficient is ρ. The correlation coefficient takes on values between -1 and +1.

A positive value for r: indicates a positive linear relationship.
A negative value for r: indicates a negative linear relationship.
An r value close to +1 or -1 indicates a strong linear relationship.
An r value close to 0 indicates a weak linear relationship or none at all.


Calculating the correlation, r

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where n is the number of data pairs. You will not be required to compute r manually, but you will need to know how to calculate it using your calculator.

TI-83 and TI-84 Directions

To compute the correlation, the diagnostic setting must be turned on. Press 2nd, then 0; this takes you to the catalog. Scroll down to the DiagnosticOn entry and then press ENTER twice. You will only have to do this once!

Computing correlation:
Type your x variable into L1 and your y variable into L2.
Press STAT, highlight CALC, and select LinReg(ax+b) (or press number 4).
Type L1, L2 and then press ENTER.

Determine the correlation coefficient for the height and hand span data.

Height    Hand span
71        23.5
69        22.0
66        18.5
64        20.5
71        21.0
72        24.0
67        19.5
65        20.5
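For readers who want to check the calculator output, the formula for r can also be evaluated directly. The following is a minimal Python sketch and not part of the course materials; the function name `correlation` and the variable names are illustrative assumptions.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Height and hand span data from the example above
height = [71, 69, 66, 64, 71, 72, 67, 65]
hand_span = [23.5, 22.0, 18.5, 20.5, 21.0, 24.0, 19.5, 20.5]

r = correlation(height, hand_span)
print(round(r, 3))
```

The printed value should agree with the r reported by LinReg(ax+b) on the calculator, up to rounding.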

At what point is r high enough to conclude that there is a significant linear relationship between two variables, rather than a value of r that differs from zero only by chance? We can use a hypothesis test to determine the significance of r. See pages 536-539. We will not cover this hypothesis test. You should know that it is possible to test whether the relationship is statistically significant (i.e., whether r is far enough away from 0).

Correlation Does Not Imply Causation

A correlation between x and y means that a linear relationship exists between the two variables. (Note that this should be verified with a scatter plot, because the correlation coefficient can always be computed no matter what the relationship between x and y is.)

A correlation between x and y does not mean that x causes y.

Example: beer sales and ice cream sales

There is NO proof of causation WITHOUT manipulation (i.e., a randomized experiment).
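To illustrate why the scatter plot check matters, here is a small Python sketch with made-up data (an assumption for illustration, not from the course): y depends on x exactly through y = x^2, yet the correlation coefficient is zero because the relationship is not linear. The helper name `correlation` is hypothetical.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r from the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Illustrative data: a perfect quadratic relationship, symmetric about x = 0
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_quadratic = correlation(xs, ys)
print(r_quadratic)  # prints 0.0
```

A value of r near 0 therefore rules out a linear relationship only; a scatter plot of these points would show a clear curve.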

Lurking variable: a variable, usually unobserved, that influences the association between the variables of primary interest.

Example: What could the lurking variable be for the last example?

Confounding variable or confounder: A confounder is related to both the exposure of interest and the outcome, but is not on the causal pathway. (This is the more commonly used term for a lurking variable.)

X = exposure of interest
Y = outcome
Z = confounder

In the following, Z creates an association between X and Y; however, if Z were controlled for, this association would disappear.

[Figure: three diagrams relating X, Z, and Y]

Is organic food the real cause of the increase in autism?

Section 10-2: Regression

Once we have discovered that a linear relationship exists, we can determine the equation of the regression line, which is the data's line of best fit. The purpose of the regression line is to enable the researcher to make predictions based on the data.


Line of Best Fit

To find the line of best fit, we try to minimize the distance from each point to the regression line. We need a line of best fit so that we can predict values of y from values of x. Therefore, the closer the points are to the line, the better the fit and the better the prediction.

Determination of the Regression Line Equation

Recall from algebra that the equation of a line is usually given as y = mx + b, where m is the slope and b is the y-intercept. In statistics, the equation of the regression line is written as y = a + bx, where a is the y-intercept and b is the slope.

Formulas for the Regression Line

$$a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{n\sum x^2 - \left(\sum x\right)^2}$$

$$b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$$

where a is the y-intercept and b is the slope of the line.

We can use the calculator to find the regression line without using these formulas. We will use the same process we used to find the correlation coefficient r.

Note: Round a and b to 3 decimal places!

TI-83 and TI-84 Directions

To compute the regression line equation, the diagnostic setting must be turned on. Press 2nd, then 0; this takes you to the catalog. Scroll down to the DiagnosticOn entry and then press ENTER twice. You will only have to do this once!

Computing the regression line equation:
Type your x variable into L1 and your y variable into L2.
Press STAT, highlight CALC, and select LinReg(ax+b) (or press number 4).
Type L1, L2 and then press ENTER.

Notice that the output will be for y = ax + b, so the a reported by the calculator is the slope and b is the y-intercept. This is the reverse of the roles a and b play in the regression formulas.

Example: Age and sick days

Age, x     18   26   39   48   53   58
Days, y    16   12    9    5    6    2

Note: linear relationship confirmed by scatterplot.

[Figure: scatterplot of days, y, against age, x]

Find the equation of the regression line and the correlation coefficient.
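As a cross-check on the calculator, the regression formulas can be applied to the age and sick days data in a few lines of Python. This is a sketch under the notation y = a + bx (a the y-intercept, b the slope); the function name `regression_line` is a hypothetical helper, not something from the course.

```python
import math

def regression_line(xs, ys):
    """Return (a, b, r): y-intercept, slope, and correlation coefficient."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    x_spread = n * sum_x2 - sum_x ** 2      # shared denominator of a and b
    y_spread = n * sum_y2 - sum_y ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / x_spread
    b = (n * sum_xy - sum_x * sum_y) / x_spread
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(x_spread * y_spread)
    return a, b, r

# Age and sick days data from the example
age = [18, 26, 39, 48, 53, 58]
days = [16, 12, 9, 5, 6, 2]

a, b, r = regression_line(age, days)
print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.3f}")
```

Remember that the calculator labels the slope a and the intercept b, so its two values appear swapped relative to this output.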

Relationship between r and b

If r is positive, then b will be positive (and vice versa).
If r is negative, then b will be negative (and vice versa).
If r is zero, then b will be zero (and vice versa).

Predicting a Response Using the Regression Line

To predict the value of a new response for some value of the explanatory variable, we simply plug that value of the explanatory variable into our regression equation. The resulting value is the predicted value.

Find the number of sick days predicted for someone who is 30 years old.

Find the number of sick days predicted for someone who is 75 years old.
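The two predictions can be sketched in Python as follows. The helper names `fit_line` and `predict` are assumptions made for illustration, and the code uses unrounded coefficients, so results may differ slightly from hand calculations with a and b rounded to 3 decimal places.

```python
def fit_line(xs, ys):
    """Least-squares y-intercept a and slope b for y = a + bx."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b

def predict(x_new, a, b):
    """Plug a value of the explanatory variable into the regression equation."""
    return a + b * x_new

# Age and sick days data from the example
age = [18, 26, 39, 48, 53, 58]
days = [16, 12, 9, 5, 6, 2]
a, b = fit_line(age, days)

for x_new in (30, 75):
    # Flag values of x that fall outside the observed ages
    in_range = min(age) <= x_new <= max(age)
    label = "within observed range" if in_range else "outside observed range"
    print(x_new, round(predict(x_new, a, b), 2), label)
```

The age of 75 lies outside the observed ages of 18 to 58, and the predicted number of sick days there comes out negative, which is not a meaningful value.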

Extrapolation Is Dangerous!

Extrapolation: Using a regression line to predict y values for x values outside the observed range of the data.

Example: Collect a sample of heights and weights from male children aged 0 to 5. What would happen if we predicted the height or weight of an adult male?

Example: Suppose we give lab rats various levels of amphetamine and observe their caloric intake over the next hour. Let y = caloric intake and x = amphetamine dosage. Why is it a bad idea to extrapolate in this example?

Comical/sad example of extrapolation: If current trends continue, by 2606 the US diet will be 100 percent sugar.

And with that... we ARE DONE WITH COURSE MATERIAL!!!!

Important dates:
Fri, 4/26: Exam 3 study session. Email me if you are interested, and we'll find a time to fit the maximal number of schedules.
Wed, 5/1: Exam 3, covering Chapters 7-10. Dr. Cribari will proctor the exam.
Mon, 5/6, and Wed, 5/8: Project presentations (10 minutes allowed per group). Be sure to place your presentation in the Dropbox group.
Thu, 5/9: Final exam study session. Email me if you are interested, and we'll try to find a time to fit the maximal number of schedules.
Sat, 5/11, 9-12: Uniform Final Exam in MC-2 (Modular Classroom 2, located between the Tivoli and the Athletic Fields).