Chapter 10. The relationship between TWO variables. Response and Explanatory Variables. Scatterplots. Example 1: Highway Signs 2/26/2009


 Bartholomew Stevenson
Chapter 10
Section 10-2: Correlation
Section 10-3: Regression
Section 10-4: Variation and Prediction Intervals

The relationship between TWO variables
So far we have dealt with data obtained from one variable (either categorical or quantitative). In this chapter we will explore the relationship between two quantitative variables.

Response and Explanatory Variables
In most studies involving two variables, each of the variables has a role. We distinguish between:
the response variable: the outcome of the study
the explanatory variable: the variable that claims to explain, predict or affect the response.

Scatterplots
In a scatterplot one axis is used to represent each of the variables, and the data are plotted as points on the graph. Typically, the explanatory or independent variable is plotted on the x axis and the response or dependent variable is plotted on the y axis.

Example 1: Highway Signs
A Pennsylvania research firm conducted a study in which 30 drivers (of ages 18 to 82 years old) were sampled, and for each one the maximum distance at which he/she could read a newly designed sign was determined. The goal of this study was to explore the relationship between driver's age and the maximum distance at which signs were legible, and then use the study's findings to improve safety for older drivers. Since the purpose of this study is to explore the effect of age on maximum legibility distance, the explanatory variable is Age, and the response variable is Distance.
Scatterplot: Example 2
Here we have two quantitative variables for each of 16 students:
How many beers they drank
Their blood alcohol level
We are interested in the relationship between the two variables: how is one affected by changes in the other one?
(Table columns: Student, Number of Beers, Blood Alcohol Level)

Scatterplot example
(Scatterplot of BAC against Beers: the response (dependent) variable is on the y axis and the explanatory (independent) variable is on the x axis.)
Some plots don't have clear explanatory and response variables. Do calories explain sodium amounts in hot dogs?

Describing a Scatterplot
Form: general shape (linear, clusters, nonlinear, no relationship)
Direction
A positive (or increasing) relationship means that an increase in one of the variables is associated with an increase in the other. A negative (or decreasing) relationship means that an increase in one of the variables is associated with a decrease in the other.

Strength
The strength of the relationship is determined by how closely the data follow the form of the relationship.

Deviation from the pattern
Outliers

Back to Example 1
Form: linear
Direction: negative
Strength: moderately strong
Outliers: there do not appear to be any outliers

Back to Example 2
Form: linear
Direction: positive
Strength: strong
Outliers: there do not appear to be any outliers

This is a weak relationship. For a particular state median household income, you can't predict the state per capita income very well.
This is a very strong relationship. The daily amount of gas consumed can be predicted quite accurately for a given temperature value.
How to scale a scatterplot
Same data for all four plots: using an inappropriate scale for a scatterplot can give an incorrect impression.
The straight-line pattern in the lower plot appears stronger because of the surrounding space.
Both variables should be given a similar amount of space:
Plot roughly square
Points should occupy all the plot space (no blank space)

Example 3
Form: linear
Direction: positive
Strength: weak
Outliers: 3

Example 4
Form: linear
Direction: negative
Strength: medium-strong
Outliers: no

Adding categorical variables to scatterplots
(Plot legend: + for northeastern states, another symbol for midwestern states)

The correlation coefficient, r
The correlation coefficient is a measure of the direction and strength of a linear relationship. It is calculated using the mean and the standard deviation of both the x and y variables. The formal name for r is the Pearson product moment correlation coefficient. It is named after the English statistician Karl Pearson ( )
Correlation
Back to Ex. 1

Calculation
r is calculated using the following formula:
$r = \frac{1}{n-1} \sum \left(\frac{x - \bar{x}}{s_x}\right)\left(\frac{y - \bar{y}}{s_y}\right) = \frac{1}{n-1} \sum z_x z_y$
It looks scary, I know, but here's the basic idea: convert x and y to standardized values (z-scores), and find their average product (well, almost: divide by n - 1 instead of n).

r ranges from -1 to +1
r quantifies the strength and direction of a linear relationship between two quantitative variables.
Strength: how closely the points follow a straight line.
Direction is positive when individuals with higher x values tend to have higher values of y.

Caution using correlation
Use correlation only for linear relationships.

Influential points
Correlations are calculated using means and standard deviations and thus are NOT resistant to outliers. Just moving one point away from the general trend here decreases the correlation from 0.91 to ___.

Properties of r
Correlation requires that both variables be quantitative
r has no units
r only measures the strength of a linear relationship
r ranges from -1 to 1
r is negative if the direction of the relationship is negative
r is positive if the direction of the relationship is positive
r is close to -1 or +1 when the linear relationship is strong
r is unchanged if you interchange x and y
r is unchanged if you make a linear change of scale (ex. from feet to inches)
The correlation is heavily influenced by outliers.
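The z-score recipe for r can be sketched in a few lines of Python. This is only an illustration: the function name pearson_r and the small data set are invented here and are not from the slides. It also checks two of the listed properties (swapping x and y, and changing feet to inches).

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation as the sum of z-score products divided by (n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data, invented for illustration only
feet = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(feet, score)
r_swapped = pearson_r(score, feet)                    # interchange x and y
r_inches = pearson_r([12 * f for f in feet], score)   # feet -> inches

print(round(r, 4), round(r_swapped, 4), round(r_inches, 4))
```

All three printed values should agree, since r has no units and treats x and y symmetrically.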
How to find r using the calculator
1st step: enter your two lists (explanatory and response variables)
STAT > EDIT > 1: Edit
L1: enter your values of the explanatory variable
L2: enter your values of the response variable
2nd step: find the correlation coefficient
STAT > CALC > 8: LinReg(a+bx)
LinReg(a+bx) L1,L2
r is the correlation coefficient

BUT association does not imply causation!
Even if two variables have a high correlation coefficient, it does not mean that the explanatory variable CAUSED the changes in the response variable.

Association does not imply causation! Example 1
During the months of March and April of a certain year, the weekly weight increases of a puppy in New York were collected. For the same time frame, the retail price increases of snowshoes in Alaska were collected. The data were examined and found to have a very strong linear correlation.
(Data columns: the weight of a growing puppy in New York, in pounds; the retail price of snowshoes in Alaska, in dollars)
So, this must mean that the weight increase of a puppy in New York is causing snowshoe prices in Alaska to increase, or the price increases of snowshoes are causing the puppy's weight to increase. Of course this is not true! The moral of this example is: be careful what you infer from your statistical analyses. Unfortunately, usually the situation is not as obvious as this one. Be sure your relationship makes sense. Also keep in mind that other factors may be involved in a potential cause and effect relationship.

Association does not imply causation! Example 2
In the early 1930s the relationship between the human population (response variable) of Oldenburg, Germany, and the number of storks nesting in the town (explanatory variable) was investigated. The correlation coefficient turned out to be ___. Does this mean that storks bring babies? Can you give a possible explanation for this strong association?
The thymus example (shocking)
The thymus, a gland in your neck, unlike other organs of the body, doesn't get larger as you grow: it actually gets smaller. Imagine the situation: many infants are dying of what seem to be respiratory obstructions, so doctors begin to do autopsies on infants who die with respiratory symptoms. They have done many autopsies in the past on adults who died of various causes, so they decide to rely on those autopsy results for comparison. What stands out most when they do autopsies on the infants is that they all have thymus glands that look too big in comparison to their body size. So they conclude that the respiratory problems are caused by an enlarged thymus. It became quite common in the early 1900s for surgeons to treat respiratory problems in children by removing the thymus. In particular, in 1912, Dr. Charles Mayo published an article recommending removal of the thymus. He made this recommendation even though a third of the children who were operated on died. What's the lurking variable in this shocking example?

What could be a lurking variable in these examples?
There is a strong positive correlation between the foot length of K-12 students and reading scores.
Students who use tutors have lower test scores than students who don't.
A survey shows a strong positive correlation between the percentage of a country's inhabitants that use cell phones and the life expectancy in that country.

Important: Association does not imply causation!
One of the most common mistakes people make is when they observe a high correlation between two variables and conclude that one must be causing the other. Scatterplots and correlation do NOT demonstrate causation.
It's hard to establish the nature and direction of causation, and there is always the risk of overlooking lurking variables.

Simpson's Paradox
A relationship between two variables that holds for each individual value of a third variable can be changed or even reversed when the data for all values of the third variable are combined. This is Simpson's paradox. Simpson's paradox is an example of the effect of lurking variables on an observed association.

Simpson's paradox
Simpson's paradox is a severe form of confounding in which there is a reversal in the direction of an association caused by a lurking variable.
Overall direction of association: positive.
But when we color different habitats in different colors, the data are separated by a lurking variable (different habitats) into a series of negative linear associations.
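The reversal can be checked numerically with the college admissions example that follows. The sketch below uses the accepted and applicant counts that appear in that example's proportions (for instance 18 / 120 for males in Business); the dictionary layout and function names are illustrative choices, not part of the slides.

```python
# (accepted, applicants) by major and gender, taken from the
# proportions quoted in the admissions example
admissions = {
    "Business": {"Male": (18, 120), "Female": (24, 120)},
    "Art": {"Male": (180, 240), "Female": (64, 80)},
}

def rate(accepted, applicants):
    """Proportion of applicants who were accepted."""
    return accepted / applicants

def pooled_rate(gender):
    """Acceptance rate for one gender with all majors combined."""
    accepted = sum(admissions[major][gender][0] for major in admissions)
    applicants = sum(admissions[major][gender][1] for major in admissions)
    return rate(accepted, applicants)

# Within each major, compare the two rates
for major, by_gender in admissions.items():
    male = rate(*by_gender["Male"])
    female = rate(*by_gender["Female"])
    print(f"{major}: male {male:.2f}, female {female:.2f}")

# Combined over majors, the comparison flips
print(f"Overall: male {pooled_rate('Male'):.2f}, "
      f"female {pooled_rate('Female'):.2f}")
```

Females have the higher rate inside each major, yet males have the higher rate overall, because most males applied to the major that is easier to get into.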
Simpson's Paradox Example
Is acceptance into a college (response variable) predicted by gender (explanatory variable)? Consider these data:

          Success  Failure  Total
Male        198      162     360
Female       88      112     200

Proportions accepted by gender:
Male success rate = 198 / 360 = 0.55
Female success rate = 88 / 200 = 0.44
Conclude: males were accepted at a higher rate than females.

Broken down according to the lurking variable "major":

Business
          Success  Failure  Total
Male         18      102     120
Female       24       96     120
Male proportion = 18 / 120 = 0.15
Female proportion = 24 / 120 = 0.20
Therefore: males were accepted at a lower rate than females.

Art
          Success  Failure  Total
Male        180       60     240
Female       64       16      80
Male proportion = 180 / 240 = 0.75
Female proportion = 64 / 80 = 0.80
Therefore: males were accepted at a lower rate than females.

Summary of causation
Association does not imply causation!
The issues of lurking variables and Simpson's paradox occur equally in both quantitative and categorical situations. So, in either case, be careful with your conclusion, and remember: association does not imply causation!

Explanatory variables
A researcher wants to know if taking increasing amounts of ginkgo biloba will result in increased memory ability for different students. He administers it to the students in doses of 250 milligrams, 500 milligrams, and 1000 milligrams. What is the explanatory variable in this study?
a) Amount of ginkgo biloba given to each student.
b) Change in memory ability.
c) Size of the student's brain.
d) Whether the student takes the ginkgo biloba.

Numeric bivariate data
The first step in analyzing numeric bivariate data is to
a) Measure strength of linear relationship.
b) Create a scatterplot.
c) Model linear relationship with regression line.

Scatterplots
Look at the following scatterplot. Choose which description BEST fits the plot.
a) Direction: positive, form: linear, strength: strong
b) Direction: negative, form: linear, strength: strong
c) Direction: positive, form: nonlinear, strength: weak
d) Direction: negative, form: nonlinear, strength: weak
e) No relationship
Scatterplots
Look at the following scatterplot. Choose which description BEST fits the plot.
a) Direction: positive, form: nonlinear, strength: strong
b) Direction: negative, form: linear, strength: strong
c) Direction: positive, form: linear, strength: weak
d) Direction: positive, form: nonlinear, strength: weak
e) No relationship

Scatterplots
Look at the following scatterplot. Choose which description BEST fits the plot.
a) Direction: positive, form: nonlinear, strength: strong
b) Direction: negative, form: linear, strength: strong
c) Direction: positive, form: linear, strength: weak
d) Direction: positive, form: nonlinear, strength: weak
e) No relationship

Scatterplots
Which of the following scatterplots displays the stronger linear relationship?
a) Plot A
b) Plot B
c) Same for both

Correlation
For which of the following situations would it be appropriate to calculate r, the correlation coefficient?
a) Time spent studying for statistics exam and score on the exam.
b) Income for county employees and their respective counties.
c) Eye color and hair color of selected participants.
d) Party affiliation of senators and their vote on presidential impeachment.

Correlation
What is a FALSE statement about r, the correlation coefficient?
a) It is a product of z-scores of X and Y.
b) It can range in value from -1 to 1.
c) It measures the strength and direction of the linear relationship between X and Y.
d) It is measured in units of the X variable.

Correlation
Which scatterplot would give a larger value for r?
a) Plot A
b) Plot B
c) It would be the same for both plots.
Correlation
True or False? Computing r as a measure of the strength of the relationship between X and Y is appropriate for the data in the following scatterplot:
a) True
b) False

Correlation tells us about strength (scatter) and direction of the linear relationship between two quantitative variables. In addition, we would like to have a numerical description of how both variables vary together. For instance, is one variable increasing faster than the other one? And we would like to make predictions based on that numerical description. But which line best describes our data?

A regression line
A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.
(Figures: Example 1 revisited; Example 1 again; Example 2 revisited)

Which line to use?
In most cases, no line will pass exactly through all the points in a scatterplot. Different people will draw different lines by eye. We need a way to draw a regression line that doesn't depend on our guess as to where the line should go. We will call this best line the least-squares regression line.
Least-squares Regression Line
For a set of data points (x, y) the least-squares regression line is the line for which the sum of squared errors is as small as possible.

Equation of the Least-squares Regression Line
Book's notation: $\hat{y} = b_0 + b_1 x$
Calculator's notation: $\hat{y} = a + bx$
Here $\hat{y}$ is the predicted value. All we need to do is calculate the intercept a and the slope b.

How to find a and b using the calculator
1st step: enter your two lists (explanatory and response variables)
STAT > EDIT > 1: Edit
L1: enter your values of the explanatory variable
L2: enter your values of the response variable
2nd step: find the intercept and the slope
STAT > CALC > 8: LinReg(a+bx)
LinReg(a+bx) L1,L2
a is the intercept, b is the slope

Other way to find a and b
First we calculate the slope of the line, b, from statistics we already know:
$b = r \frac{s_y}{s_x}$
r is the correlation
$s_y$ is the standard deviation of the response variable y
$s_x$ is the standard deviation of the explanatory variable x
Once we know b, the slope, we can calculate a, the y-intercept:
$a = \bar{y} - b\bar{x}$
where $\bar{x}$ and $\bar{y}$ are the sample means of the x and y variables.

Facts about least-squares regression
The distinction between explanatory and response variables is essential in regression.
The least-squares regression line always passes through the point $(\bar{x}, \bar{y})$.

Ex. 1 AGAIN
ŷ = a + bx
a = 576, b = -3
ŷ = 576 - 3x
Distance = 576 feet - 3 × Age
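The "other way" of finding a and b can be sketched as follows. This is a minimal illustration under stated assumptions: the function name least_squares_line and the (age, distance) pairs are made up for this sketch and are not the study's data.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x via b = r*s_y/s_x, a = y_bar - b*x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # correlation r, written with deviations from the means
    r = sum((x - x_bar) * (y - y_bar)
            for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Hypothetical (age, distance) pairs, invented for illustration only
ages = [20, 30, 40, 50, 60, 70, 80]
distances = [520, 480, 460, 430, 400, 360, 340]

a, b = least_squares_line(ages, distances)
print(f"y-hat = {a:.1f} + ({b:.2f}) x")

# The fitted line passes through the point of means (x-bar, y-bar)
print(abs((a + b * mean(ages)) - mean(distances)) < 1e-9)
```

Computing a from the means is what forces the line through the point of means, which is the fact stated on the slide.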
Prediction: Interpolation
The equation of the least-squares regression line allows you to predict y for any x within the range studied. This is called interpolating.

Prediction: Interpolation
Predict the maximum distance at which a sign is legible for a 60 year old.
Distance = 576 feet - 3 × Age
Predicted distance = 576 - 3 × 60 = 396 feet
396 feet is our best prediction for the maximum distance at which a sign is legible for a 60 year old.

Prediction Ex. 1
Predict the maximum distance at which a sign is legible for a 90 year old.
Distance = 576 feet - 3 × Age
Predicted distance = 576 - 3 × 90 = 306 feet
306 feet is our best prediction for the maximum distance at which a sign is legible for a 90 year old.
BUT this prediction is NOT RELIABLE. It is called EXTRAPOLATION.

Extrapolation
Extrapolation is the use of a regression line for predictions outside the range of x values used to obtain the line. This can be a very silly thing to do, as seen here.

Example 2 AGAIN
ŷ = ___ + ___ x
Nobody in the study drank 6.5 beers, but by finding the value of ŷ from the regression line for x = 6.5, we would expect a blood alcohol content of ___ mg/ml.
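The difference between interpolating and extrapolating can be made explicit in code. The sketch below assumes the fitted line for Example 1 is Distance = 576 - 3 × Age (the negative slope matches the negative direction noted for that example) and uses the sampled age range of 18 to 82 from the study description; the function names are hypothetical.

```python
AGE_MIN, AGE_MAX = 18, 82  # ages of the drivers sampled in Example 1

def predict_distance(age):
    """Predicted legibility distance in feet, assuming Distance = 576 - 3 * Age."""
    return 576 - 3 * age

def is_extrapolation(age):
    """True when the age lies outside the range of x values used to fit the line."""
    return not (AGE_MIN <= age <= AGE_MAX)

for age in (60, 90):
    label = "extrapolation" if is_extrapolation(age) else "interpolation"
    print(f"age {age}: {predict_distance(age)} feet ({label})")
```

The formula happily returns a number for any age, so the range check has to be done separately; the arithmetic alone gives no warning that a prediction is unreliable.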
Residuals
The vertical distances from each point to the least-squares regression line are called residuals. The sum of these residuals is always 0.
Points above the line have a positive residual.
Points below the line have a negative residual.

Ex. 1 AGAIN
ŷ = ___ = 480
(Figure labels: predicted ŷ, observed y, distance (y - ŷ) = residual)
residual = y - ŷ = ___

Sum of squared errors
Which least-squares regression line would have a smaller sum of squared errors?
a) The line in Plot A.
b) The line in Plot B.
c) It would be the same for both plots.

Slope
Look at the following scatterplot. What would be a correct interpretation of the slope?
a) As we increase our CO content by 1 mg, we increase the tar content by 1.01 mg.
b) As we increase our CO content by 0.66 mg, we increase the tar content by 1.01 mg.
c) As we increase our CO content by 0.66 mg, we increase the tar content by 0.66 mg.
d) As we increase our CO content by 1 mg, we increase the tar content by 0.66 mg.

Residuals
Look at the following least-squares regression line. Compare the residuals from the two Points A and B.
a) Point A's would be greater than Point B's.
b) Point A's would be less than Point B's.
c) Point A's would be equal to Point B's.
d) There is not enough information.

Residuals
Residual equals
a) b) c) d)
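Two facts about residuals from this part of the chapter, that they sum to 0 and that the least-squares line has the smallest sum of squared errors, can be illustrated with a short sketch. The beer and BAC numbers below are invented for illustration and are not the 16 students' data; fit_line and residuals are hypothetical helper names.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

def residuals(xs, ys, a, b):
    """Observed y minus predicted y-hat for every point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical (beers, BAC) pairs, invented for illustration only
beers = [1, 2, 3, 4, 5, 6]
bac = [0.02, 0.03, 0.07, 0.06, 0.10, 0.11]

a, b = fit_line(beers, bac)
res = residuals(beers, bac, a, b)
sse = sum(e ** 2 for e in res)

# Shift the line up slightly: the sum of squared errors gets larger
sse_shifted = sum(e ** 2 for e in residuals(beers, bac, a + 0.01, b))

print(abs(sum(res)) < 1e-9)   # residuals sum to 0 (up to rounding)
print(sse < sse_shifted)      # least-squares line has the smaller SSE
```

Points above the fitted line show up as positive entries in res and points below it as negative entries.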
Correlation or regression
Which of the following measures the direction and strength of the linear association between X and Y?
a) Correlation
b) Regression

Correlation or regression
Which of the following makes no distinction between explanatory and response variables?
a) Correlation
b) Regression

Correlation or regression
Which of the following is used for prediction?
a) Correlation
b) Regression

Regression line
A regression line always passes through the point
a) b) c) d)

Linear regression
The following graph shows the linear relationship between diamond size and price for diamonds of size 0.35 carats or less. Using this relationship to predict the price of a diamond that is 1 carat is considered
a) Extrapolation.
b) An influential observation.
c) Prediction

Don't forget, the first test is next Wednesday, 3/4. It will cover Chapters 1, 2, 3, and 10.
Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under
More informationSimple Linear Regression in SPSS STAT 314
Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,
More informationCHAPTER 10 REGRESSION AND CORRELATION
CHAPTER 10 REGRESSION AND CORRELATION LINEAR REGRESSION (SECTION 10.1 10.3 OF UNDERSTANDABLE STATISTICS) Important Note: Before beginning this chapter, press y[catalog] (above Ê) and scroll down to the
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationStat 412/512 CASE INFLUENCE STATISTICS. Charlotte Wickham. stat512.cwick.co.nz. Feb 2 2015
Stat 412/512 CASE INFLUENCE STATISTICS Feb 2 2015 Charlotte Wickham stat512.cwick.co.nz Regression in your field See website. You may complete this assignment in pairs. Find a journal article in your field
More informationInferential Statistics
Inferential Statistics Sampling and the normal distribution Zscores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are
More informationStatistics 2014 Scoring Guidelines
AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home
More informationThe importance of graphing the data: Anscombe s regression examples
The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 3031, 2008 B. Weaver, NHRC 2008 1 The Objective
More informationChapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3 Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
More informationDescribing Relationships between Two Variables
Describing Relationships between Two Variables Up until now, we have dealt, for the most part, with just one variable at a time. This variable, when measured on many different subjects or objects, took
More informationChapter 1 Linear Equations and Graphs
Chapter 1 Linear Equations and Graphs Section 1.1  Linear Equations and Inequalities Objectives: The student will be able to solve linear equations. The student will be able to solve linear inequalities.
More informationLecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables
Lecture 13/Chapter 10 Relationships between Measurement (Quantitative) Variables Scatterplot; Roles of Variables 3 Features of Relationship Correlation Regression Definition Scatterplot displays relationship
More informationSimple Regression and Correlation
Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas
More informationSTATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS
STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS Correct answers are in bold italics.. This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of coal miners
More informationCorrelation & Regression, II. Residual Plots. What we like to see: no pattern. Steps in regression analysis (so far)
Steps in regression analysis (so far) Correlation & Regression, II 9.07 4/6/2004 Plot a scatter plot Find the parameters of the best fit regression line, y =a+bx Plot the regression line on the scatter
More informationWEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6
WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent beforethefact, expected values. In particular, the beta coefficient used in
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table covariation least squares
More informationThe Big 50 Revision Guidelines for S1
The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand
More informationThe Correlation Coefficient
The Correlation Coefficient Lelys Bravo de Guenni April 22nd, 2015 Outline The Correlation coefficient Positive Correlation Negative Correlation Properties of the Correlation Coefficient Nonlinear association
More informationChapter 7 Scatterplots, Association, and Correlation
78 Part II Exploring Relationships Between Variables Chapter 7 Scatterplots, Association, and Correlation 1. Association. a) Either weight in grams or weight in ounces could be the explanatory or response
More informationCorrelation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers
Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables
More information2. Simple Linear Regression
Research methods  II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationSimple Regression Theory I 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY I 1 Simple Regression Theory I 2010 Samuel L. Baker Regression analysis lets you use data to explain and predict. A simple regression line drawn through data points In Assignment
More informationCourse Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.
SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationUsing Minitab for Regression Analysis: An extended example
Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to
More informationProject: Linear Correlation and Regression
Project: Linear Correlation and Regression You may very well have studied linear regression before; I know many instructors discuss it in their classes. If the word regression means nothing to you...great!
More informationChapter 5. Regression
Topics covered in this chapter: Chapter 5. Regression Adding a Regression Line to a Scatterplot Regression Lines and Influential Observations Finding the Least Squares Regression Model Adding a Regression
More information17. SIMPLE LINEAR REGRESSION II
17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.
More informatione = random error, assumed to be normally distributed with mean 0 and standard deviation σ
1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.
More informationAP Statistics Semester Exam Review Chapters 13
AP Statistics Semester Exam Review Chapters 13 1. Here are the IQ test scores of 10 randomly chosen fifthgrade students: 145 139 126 122 125 130 96 110 118 118 To make a stemplot of these scores, you
More informationII. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
More informationc. Construct a boxplot for the data. Write a one sentence interpretation of your graph.
MBA/MIB 5315 Sample Test Problems Page 1 of 1 1. An English survey of 3000 medical records showed that smokers are more inclined to get depressed than nonsmokers. Does this imply that smoking causes depression?
More informationLesson 3.2.1 Using Lines to Make Predictions
STATWAY INSTRUCTOR NOTES i INSTRUCTOR SPECIFIC MATERIAL IS INDENTED AND APPEARS IN GREY ESTIMATED TIME 50 minutes MATERIALS REQUIRED Overhead or electronic display of scatterplots in lesson BRIEF DESCRIPTION
More informationLAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE
LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Module 7 Test Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. You are given information about a straight line. Use two points to graph the equation.
More informationSimple Predictive Analytics Curtis Seare
Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use
More informationUCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates
UCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally
More informationAP Statistics 2001 Solutions and Scoring Guidelines
AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use
More informationAlgebra 1 Course Information
Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through
More informationCopyright 2013 by Laura Schultz. All rights reserved. Page 1 of 6
Using Your TINSpire Calculator: Linear Correlation and Regression Dr. Laura Schultz Statistics I This handout describes how to use your calculator for various linear correlation and regression applications.
More informationDescriptive statistics; Correlation and regression
Descriptive statistics; and regression Patrick Breheny September 16 Patrick Breheny STA 580: Biostatistics I 1/59 Tables and figures Descriptive statistics Histograms Numerical summaries Percentiles Human
More information. 58 58 60 62 64 66 68 70 72 74 76 78 Father s height (inches)
PEARSON S FATHERSON DATA The following scatter diagram shows the heights of 1,0 fathers and their fullgrown sons, in England, circa 1900 There is one dot for each fatherson pair Heights of fathers and
More informationIQR Rule for Outliers
1. Arrange data in order. IQR Rule for Outliers 2. Calculate first quartile (Q1), third quartile (Q3) and the interquartile range (IQR=Q3Q1). CO2 emissions example: Q1=0.9, Q3=6.05, IQR=5.15. 3. Compute
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationUNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test March 2014
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test March 2014 STAB22H3 Statistics I Duration: 1 hour and 45 minutes Last Name: First Name: Student number: Aids
More informationHints for Success on the AP Statistics Exam. (Compiled by Zack Bigner)
Hints for Success on the AP Statistics Exam. (Compiled by Zack Bigner) The Exam The AP Stat exam has 2 sections that take 90 minutes each. The first section is 40 multiple choice questions, and the second
More informationIntroduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More information