Chapter 7: Scatter Plots, Association, and Correlation

Scatterplots compare two quantitative variables in the same way that segmented bar charts compared two categorical variables. A scatterplot lets you see the relationship between the two variables and judge visually whether there is an association.

When we describe scatter plots, we describe their...
-Direction
-Form
-Strength
-Outliers

Direction:
-When the pattern of the data runs from the bottom left to the upper right, we say it has a POSITIVE direction
-When the pattern runs from the upper left to the bottom right, we say it has a NEGATIVE direction
-Relate this to the sign of the slope of a line

Form:
-When the pattern of the data roughly follows a straight path we say the data is LINEAR
-If the data curves it is NON-LINEAR
-If the data curves but maintains the same direction it is possible to transform the data to make it more linear
-If the data changes direction there is not much that can be done

Strength:
-If the data is clustered closely together and a pattern is easy to see, the association is stronger
-If the data is spread apart and a pattern is difficult to see, or (in the extreme case) the data looks like a cloud with no discernible pattern, the association is weaker

Outliers:
-Data that does not fit the overall pattern
-Unusual clusters or subgroups should also raise concerns

Converting the units of either variable in a scatter plot rescales the axes but does not change the direction, form, or strength of the pattern.

When describing which variables are on the horizontal axis and the vertical axis, we do NOT call them the "x" and "y" variables (or axes).
-The PREDICTOR or EXPLANATORY variable is placed on the horizontal axis
-The RESPONSE variable is placed on the vertical axis
You determine which is which by asking which variable is more likely to affect the other. Also use clues from the context of the description of the scenario (the W's).
Which is the predictor and which is the response variable?
Ex1: When comparing the price of a new product to the # of units sold
Ex2: When comparing the number of cars damaged on a street to the number of potholes on the street
Ex3: When comparing the fat content of a candy bar to its sugar content
*If the roles are unclear you should just select whichever you feel is most likely the predictor and response...
*We avoid the terms "x", "y", "dependent variable", and "independent variable" because we never want to give the impression that there is a CAUSE and EFFECT relationship rather than an association between them.

Correlation:
CORRELATION is a numerical measure of the strength and direction of the linear relationship between two quantitative variables. Before calculating such a value you must make sure that the following conditions are met.
1. The data must be quantitative
2. The scatter plot looks nearly linear
3. There are no outliers
**If there is an outlier, best practice is to calculate the value both with and without the outlier
The value we calculate is called the CORRELATION COEFFICIENT and is denoted with a lowercase "r"
-It is a value ranging anywhere from -1 to 1
-When r is closer to 0 you have a weaker linear association
-When r is closer to +/- 1 you have a stronger linear association
-The sign indicates direction
-It does not have any units (it is based on z-scores)
-Changing units will not change r
-You can calculate a correlation coefficient for any pair of variables, but if the relationship is not linear the value will be misleading
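
The outlier advice and the warning about non-linear data can be checked with a short Python sketch. It is an illustration only: the helper name pearson_r and all of the data values are made up for this example, and the helper uses the usual sums-of-deviations form of the correlation coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: five points on a straight line plus one outlier (6, 0).
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 0]

r_with = pearson_r(x, y)                 # outlier included
r_without = pearson_r(x[:-1], y[:-1])    # outlier left out

# Hypothetical data: a perfect but curved (non-linear) pattern.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

print(round(r_with, 3), round(r_without, 3), round(r_curve, 3))
```

In this made-up data a single outlier drops r from 1 to about 0.14, and the curved pattern gives r = 0 even though the two variables are perfectly related. That is why the scatter plot is checked before r is calculated.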

There IS a formula for calculating the correlation coefficient, but it would take a very long time to work through by hand, even with a small set of data. We will rely on technology to do the tedious work for us anytime we need to calculate this value. The formula works by standardizing every data value (converting it to a z-score).
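
For readers who want to see what the technology is doing, here is a minimal Python sketch of one common z-score form of the calculation: multiply each pair of z-scores, add the products, and divide by n - 1. The function names and the hours/scores data are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a list: subtract the mean, divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """r = (sum of z_x * z_y) / (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical data, not taken from the notes.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation(hours, scores)

# Changing units (hours to minutes) changes the data but not the z-scores,
# so r stays the same.
minutes = [60 * h for h in hours]
r_minutes = correlation(minutes, scores)

print(round(r, 3), round(r_minutes, 3))  # both round to 0.775
```

Because z-scores have no units, the same r comes out whether the predictor is recorded in hours or in minutes.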

Straightening Scatterplots
A section of this chapter talks about re-plotting the data to make a more linear graph. Generally speaking you do 'something' to all values of a certain variable, then graph... i.e. graph (x, y²) instead of (x, y). Chapter 10 is dedicated to this so, for now, just know that it can be done and we'll learn more later.
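
As a rough preview, the Python sketch below re-expresses a curved data set by squaring every y value, following the (x, y²) idea above. The data is hypothetical (it follows y = sqrt(x) exactly) and the helper pearson_r is written only for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical curved data: y is the square root of x.
x = [0, 1, 4, 9, 16, 25]
y = [sqrt(v) for v in x]

# Re-express: do 'something' (here, squaring) to every y value.
y_squared = [v ** 2 for v in y]

r_before = pearson_r(x, y)          # curved pattern
r_after = pearson_r(x, y_squared)   # straightened pattern

print(round(r_before, 2), round(r_after, 2))
```

With these values r moves from about 0.96 for the curved plot to 1 for the straightened one. Note that r was already large before re-expressing, so a big r alone does not show that a pattern is linear.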

Correlation vs Causation
Though tempting, even when you get a large r, you can never say that the predictor variable CAUSED the response variable to change.
Ex: In infants, as vocabulary increases, so does appetite
*Words make you hungry
Ex: Studies show that as ice cream sales rise, so do shark attacks
*Eating ice cream makes you tastier to sharks
**Posted comments to an online article
Beware of Lurking Variables
A LURKING VARIABLE is a third variable that affects both of the variables in your scatter plot.
-Recall: ice cream sales and shark attacks
-The temperature will cause people to eat more ice cream AND make people want to go to the beach
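
The ice cream and shark attack example can be imitated with invented numbers. In the Python sketch below every value is hypothetical: both variables are built from temperature plus a small wiggle, and neither is computed from the other, yet r between them is still large.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical lurking variable: daily high temperature.
temperature = [60, 65, 70, 75, 80, 85, 90]

# Hypothetical wiggles so the points do not fall exactly on a line.
wiggle_ice = [3, -3, 2, -4, 2, -3, 3]
wiggle_shark = [0, 1, -1, 0, 1, -1, 0]

# Each variable depends ONLY on temperature (plus its own wiggle).
ice_cream_sales = [2 * t - 100 + w for t, w in zip(temperature, wiggle_ice)]
shark_attacks = [t // 5 - 10 + w for t, w in zip(temperature, wiggle_shark)]

r_ice_sharks = pearson_r(ice_cream_sales, shark_attacks)
print(round(r_ice_sharks, 2))
```

Here r comes out at about 0.92 even though ice cream sales never enter the shark attack calculation. The strong association comes entirely from the shared dependence on temperature.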

Calculators
To enter data into your calculator (same as before):
-STAT button
-EDIT menu
-1:Edit option
-...then enter a set of data into a column (L1, L2...) (remember which is the predictor and which is the response)
To calculate the correlation coefficient:
-...enter data
-STAT button
-CALC menu
-8:LinReg(a+bx) option
-...back on the home screen input the parameters, i.e. L1, L2
*If r is not shown, you need to enable a feature on the calculator
-Go to the catalog
-Scroll to 'DiagnosticOn'
-Press enter (twice)
To make a scatter plot:
-STAT PLOT button (second function of the Y= button)
-ZOOM then 9 will graph the data in the "best" viewing window