The importance of graphing the data: Anscombe's regression examples

1 The importance of graphing the data: Anscombe's regression examples. Bruce Weaver, Northern Health Research Conference, Nipissing University, North Bay, May 30-31, 2008. B. Weaver, NHRC

2 The Objective: To demonstrate that good graphs are an essential part of linear regression analysis.

3 Not this kind of regression analysis

4 This kind of regression analysis

5 A very brief primer on simple linear regression

6 Simple linear regression: A model in which X is used to predict Y. Y is a continuous variable with interval-scale properties. In the prototypical case, X is also a continuous variable with interval-scale properties. Example: Y = distance in a 6-minute walk test; X = FEV1.

7 Back to high school: Equation for a straight line. Y = bX + a, where b is the SLOPE and a is the INTERCEPT. b = slope of the line = the rise over the run. a = the value of Y when X = 0.

8 Example of a straight line: Gym membership. Annual fee = $100. Fee per visit = $2. Let X = the number of visits to the gym. Let Y = the total cost. Y = 2X + 100. Let X = 200 visits to the gym. Total cost = 2(200) + 100 = $500.
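The gym example maps directly onto the straight-line equation: the fee per visit plays the role of the slope b and the annual fee plays the role of the intercept a. A minimal sketch, assuming those two roles (the function and parameter names are illustrative, not from the talk):

```python
def total_cost(visits, fee_per_visit=2.0, annual_fee=100.0):
    """Return Y = bX + a, with b = fee per visit and a = annual fee."""
    return fee_per_visit * visits + annual_fee


print(total_cost(200))  # 2(200) + 100 = 500.0
print(total_cost(0))    # the intercept: cost with zero visits = 100.0
```

Setting visits to 0 returns the intercept, which is the sense in which a is "the value of Y when X = 0".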

9 What if the relationship is imperfect? Straight line for a perfect relationship: Y = bX + a. Straight line for an imperfect relationship: Ŷ = bX + a or Y′ = bX + a. Ŷ and Y′ are two different symbols for the predicted value of Y.

10 R-squared: R-squared = the proportion of variability in Y that is accounted for by the explanatory variables in the model. For a simple linear regression model (i.e., one predictor variable), R-squared = the proportion of the variability in Y that can be accounted for by the linear relationship between X and Y. The adjusted R-squared corrects for upward bias in R-squared.
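In symbols, one common way to write these two quantities is sketched below. The notation is standard textbook notation and does not appear on the slide: SS_res is the residual sum of squares, SS_tot is the total sum of squares, n is the number of observations, and p is the number of predictors.

```latex
R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}},
\qquad
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```

Because (n - 1)/(n - p - 1) is greater than 1 whenever p is at least 1, the adjusted value is never larger than R-squared itself.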

11 Anscombe's examples (1973): Frank Anscombe devised 4 sets of X-Y pairs. He performed simple linear regression for each data set. Here are the results.

12 Means & Standard Deviations. Table columns: Data Set, N, X (Mean, SD), Y (Mean, SD). The means and SDs for the 4 data sets are identical to two decimals.
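The claim about the means and SDs can be checked directly. The sketch below assumes the values published in Anscombe (1973), keyed by his original labels I to IV; the data-set numbering used in the talk may differ, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Anscombe's (1973) four data sets, keyed by his original labels (assumed values).
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# One row per data set: N, mean and sample SD of X, mean and sample SD of Y.
summary = {
    name: (len(x), round(mean(x), 2), round(stdev(x), 2),
           round(mean(y), 2), round(stdev(y), 2))
    for name, (x, y) in ANSCOMBE.items()
}
for name, row in summary.items():
    print(name, row)
```

Every row should print the same five numbers, which is the point of the slide.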

13 Correlations between X and Y. Table columns: Data Set, Pearson r, R-squared, Adj. R-sq, SE. Correlations, R-squared, adjusted R-squared, and standard errors are all identical to two decimals.

14 ANOVA Summary Tables. Table columns: Data Set, Source (Regression, Residual, Total), SS, df, MS, F, p.

15 The Regression Coefficients. Table columns: Data Set, Term (Constant, X), B, SE, t, p, 95% CI (Lower, Upper). For all 4 models, Y = 0.5(X) + 3.
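The common equation can be reproduced with a few lines of arithmetic. This is a sketch, assuming the values published in Anscombe (1973) under his original labels I to IV (the talk's numbering may differ); `fit_line` is an illustrative helper, not a function from any statistics package.

```python
from math import sqrt

# Anscombe's (1973) four data sets, keyed by his original labels (assumed values).
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def fit_line(x, y):
    """Return least-squares slope b, intercept a, and Pearson r for Y = bX + a."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / sqrt(sxx * syy)
    return b, a, r


fits = {name: fit_line(x, y) for name, (x, y) in ANSCOMBE.items()}
for name, (b, a, r) in fits.items():
    print(f"{name}: Y = {b:.2f}(X) + {a:.2f}, r = {r:.2f}, R-squared = {r * r:.2f}")
```

To two decimals the four fitted lines agree, even though the underlying scatter-plots look nothing alike.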

16 Which Model is Best? Judging by everything we've just seen, it appears that the models are all equally good. But if that were true, I wouldn't be doing this talk! It is well known that good graphs are an essential part of data analysis (Tukey, 1977; Tufte, 1997). Let's look at some graphs that show the relationship between X and Y.

17 Scatter-plot for Data Set 1. Annotations: 10 data points; influential point. Not a good model.

18 Scatter-plot for Data Set 2. Perfect linear relationship except for one outlier. Better model than for Data Set 1, but still not great.

19 Scatter-plot for Data Set 3. Wrong model! The relationship between X and Y is curvilinear, not linear! The model should include both X and X² as predictors.

20 Scatter-plot for Data Set 4. This is a good-looking plot. No influential points; a straight line provides a good fit.

21 Summary: The usual summary statistics for the 4 regression models were virtually identical. Scatter-plots revealed that only one of the 4 data sets gave us a good model. Appropriate graphs are an essential part of data analysis.

22 What about multivariable models? Scatter-plots are useful for simple linear regression models (i.e., only one predictor variable). But often we have multiple, or multivariable, regression models (i.e., 2 or more predictor variables). In that case, it is more common to assess the fit of the model by looking at residual plots.

23 What is a residual? In linear regression, a residual is an error in prediction. Residual = (Y − Ŷ) = (actual score − predicted score).
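As a small worked example, the sketch below computes residuals from the fitted line Y = 0.5(X) + 3 reported earlier in the talk. The three (X, Y) pairs are assumed to be the first three observations of Anscombe's first published data set, and the function names are illustrative.

```python
def predict(x, b=0.5, a=3.0):
    """Predicted score from the fitted line Y = 0.5(X) + 3."""
    return b * x + a


def residual(actual, predicted):
    """Residual = actual score - predicted score."""
    return actual - predicted


# Three (X, Y) pairs, assumed to come from Anscombe's first data set.
points = [(10, 8.04), (8, 6.95), (13, 7.58)]
residuals = [residual(y, predict(x)) for x, y in points]
for (x, y), e in zip(points, residuals):
    print(f"X = {x}, Y = {y}, predicted = {predict(x):.2f}, residual = {e:+.2f}")
```

A positive residual means the point lies above the line; a negative residual means it lies below.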

24 Set 1: Scatter-plot vs. Residual Plot. Scatter-plot axes: X (horizontal), Y (vertical). Residual plot axes: predicted value of Y (horizontal), residual (vertical).

25 Set 2: Scatter-plot vs. Residual Plot. Residual plot axes: predicted value of Y (horizontal), residual (vertical).

26 Set 3: Scatter-plot vs. Residual Plot. Residual plot axes: predicted value of Y (horizontal), residual (vertical). Annotation: runs of same-sign residuals.
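Runs of same-sign residuals can also be counted without a plot. The sketch below assumes the curvilinear data set is the one Anscombe (1973) published as his second set (the talk numbers its data sets differently); `fit_line` and `run_lengths` are illustrative helpers.

```python
def fit_line(x, y):
    """Return least-squares slope b and intercept a for Y = bX + a."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return b, mean_y - b * mean_x


def run_lengths(values):
    """Return the lengths of consecutive runs of same-sign values."""
    runs = []
    previous_sign = None
    for v in values:
        sign = v >= 0
        if sign == previous_sign:
            runs[-1] += 1
        else:
            runs.append(1)
            previous_sign = sign
    return runs


# Assumed values: Anscombe's curvilinear data set.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

b, a = fit_line(x, y)
# Order the points by predicted value, as on the horizontal axis of a residual plot.
ordered = sorted(zip(x, y), key=lambda point: b * point[0] + a)
residuals = [yi - (b * xi + a) for xi, yi in ordered]
runs = run_lengths(residuals)
print(runs)  # [2, 7, 2]: negative, then a long positive run, then negative
```

Only three runs across eleven points, with one run of seven, is the numerical signature of the arch shape seen in the residual plot.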

27 Set 4: Scatter-plot vs. Residual Plot. Residual plot axes: predicted value of Y (horizontal), residual (vertical).

28 Summary: The usual summary statistics for the 4 regression models were virtually identical. Scatter-plots revealed that only one of the 4 data sets gave us a good model. Residual plots reveal the same thing, and have the advantage of being applicable to multivariable regression models. Appropriate graphs are an essential part of data analysis.

29 Questions? I think you should be more explicit here in step 2.

30 References
Anscombe FJ. (1973). Graphs in statistical analysis. The American Statistician, 27.
Tufte ER. (1997). Visual Explanations: Images and Quantities, Evidence and Narrative (3rd Ed.). Graphics Press: Cheshire.
Tukey JW. (1977). Exploratory data analysis. Addison-Wesley: Reading, Mass.

31 Extra Slides

32 Just as one would expect! The experimentalist comes running excitedly into the theorist's office, waving a graph taken off his latest experiment. "Hmmm," says the theorist, "That's exactly where you'd expect to see that peak. Here's the reason (long logical explanation follows)." In the middle of it, the experimentalist says "Wait a minute", studies the chart for a second, and says, "Oops, this is upside down." He fixes it. "Hmmm," says the theorist, "you'd expect to see a dip in exactly that position. Here's the reason..."

33 Best-fitting line: Least squares criterion. Many lines could be placed on the scatter-plot, but only one of them is considered the best-fitting line. The most common criterion for best-fitting is that the sum of the squared errors in prediction is minimized. This is called the least-squares criterion.

34 Illustration of Least Squares. Annotation: error in prediction.

35 Illustration of Least Squares. Annotations: squared error in prediction; error = 0 for this point, so no square.

36 Illustration of Least Squares. Sum of squared errors = the sum of the areas of all these squares. For any other regression line, the sum of the squared errors would be greater.
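Written out, the criterion and the line that satisfies it look like this. The notation is a standard textbook sketch rather than something shown on the slides: n is the number of points, and \bar{X} and \bar{Y} are the sample means of X and Y.

```latex
SSE(a, b) = \sum_{i=1}^{n} \bigl(Y_i - (bX_i + a)\bigr)^2,
\qquad
b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
a = \bar{Y} - b\,\bar{X}
```

Each term of the sum is the area of one of the squares in the illustration, and the slope b and intercept a given here are the values that make the total area as small as possible.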

37 What is a residual plot? A scatter-plot with: X = the fitted (or predicted) value of Y; Y = the residual (i.e., the error in prediction). Residuals should be independent of the fitted value of Y. There should be no serial correlation in the residuals (e.g., long runs of same-sign residuals). Violations of both of these conditions (plus some other problems) can be detected via residual plots. Advantage of residual plots: they can be used in multivariable (i.e., multi-predictor) regression models.

38 Examples of residual plots. Panels: curvilinear relationship; outlier; heteroscedasticity. Axes: predicted Y (horizontal), residual (vertical).

39 Example of a good residual plot

40 Example of a zig-zag pattern. You do not want to see this kind of zig-zag pattern in the residual plot.

41 Simple linear regression & correlation. Pearson r = the correlation. It measures the direction and strength of the linear association between X and Y. It ranges from -1 to +1.

42 Direction of the linear relationship. Positive relationship: as X increases, Y increases. Negative relationship: as X increases, Y decreases.

43 Perfect vs. Imperfect Relationship. Panels: perfect relationship; imperfect relationship.

44 r-squared. The square of Pearson r is a measure of how well the regression model fits the observed data. It gives the proportion of variability in Y that is accounted for by the linear relationship between X and Y. E.g., let r = 0.6 (or -0.6); then r² = 0.36. So 36% of the variability in the Y-scores is accounted for by the linear relationship between X and Y.


More information

San Jose State University Engineering 10 1

San Jose State University Engineering 10 1 KY San Jose State University Engineering 10 1 Select Insert from the main menu Plotting in Excel Select All Chart Types San Jose State University Engineering 10 2 Definition: A chart that consists of multiple

More information

Section 1.5 Linear Models

Section 1.5 Linear Models Section 1.5 Linear Models Some real-life problems can be modeled using linear equations. Now that we know how to find the slope of a line, the equation of a line, and the point of intersection of two lines,

More information

Chapter 11: Two Variable Regression Analysis

Chapter 11: Two Variable Regression Analysis Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

More information

ST 311 Evening Problem Session Solutions Week 11

ST 311 Evening Problem Session Solutions Week 11 1. p. 175, Question 32 (Modules 10.1-10.4) [Learning Objectives J1, J3, J9, J11-14, J17] Since 1980, average mortgage rates have fluctuated from a low of under 6% to a high of over 14%. Is there a relationship

More information

psyc3010 lecture 8 standard and hierarchical multiple regression last week: correlation and regression Next week: moderated regression

psyc3010 lecture 8 standard and hierarchical multiple regression last week: correlation and regression Next week: moderated regression psyc3010 lecture 8 standard and hierarchical multiple regression last week: correlation and regression Next week: moderated regression 1 last week this week last week we revised correlation & regression

More information

ID X Y

ID X Y Dale Berger SPSS Step-by-Step Regression Introduction: MRC01 This step-by-step example shows how to enter data into SPSS and conduct a simple regression analysis to develop an equation to predict from.

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Lesson Lesson Outline Outline

Lesson Lesson Outline Outline Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating l the slope Calculating the Intercept Residuals and

More information

What is correlational research?

What is correlational research? Key Ideas Purpose and use of correlational designs How correlational research developed Types of correlational designs Key characteristics of correlational designs Procedures used in correlational studies

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

More information

Inference for Regression

Inference for Regression Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.7 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Linear Regression Other Regression Models References Introduction Introduction Numerical prediction is

More information

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p. Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

Interpreting Multiple Regression

Interpreting Multiple Regression Fall Semester, 2001 Statistics 621 Lecture 5 Robert Stine 1 Preliminaries Interpreting Multiple Regression Project and assignments Hope to have some further information on project soon. Due date for Assignment

More information

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996) MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

More information

AP * Statistics Review. Linear Regression

AP * Statistics Review. Linear Regression AP * Statistics Review Linear Regression Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

More information

RNR / ENTO Assumptions for Simple Linear Regression

RNR / ENTO Assumptions for Simple Linear Regression 74 RNR / ENTO 63 --Assumptions for Simple Linear Regression Statistical statements (hypothesis tests and CI estimation) with least squares estimates depends on 4 assumptions:. Linearity of the mean responses

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

The scatterplot indicates a positive linear relationship between waist size and body fat percentage:

The scatterplot indicates a positive linear relationship between waist size and body fat percentage: STAT E-150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the

More information

Section I: Multiple Choice Select the best answer for each question.

Section I: Multiple Choice Select the best answer for each question. Chapter 15 (Regression Inference) AP Statistics Practice Test (TPS- 4 p796) Section I: Multiple Choice Select the best answer for each question. 1. Which of the following is not one of the conditions that

More information

Bivariate Data Cleaning

Bivariate Data Cleaning Bivariate Data Cleaning Bivariate Outliers In Simple Correlation/Regression Analyses Imagine we are interested in the correlation between two variables. Being schooled about outliers we examine the distribution

More information

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

More information

A. Karpinski

A. Karpinski Chapter 3 Multiple Linear Regression Page 1. Overview of multiple regression 3-2 2. Considering relationships among variables 3-3 3. Extending the simple regression model to multiple predictors 3-4 4.

More information

Multiple Regression: What Is It?

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

More information

TIME SERIES ANALYSIS & FORECASTING

TIME SERIES ANALYSIS & FORECASTING CHAPTER 19 TIME SERIES ANALYSIS & FORECASTING Basic Concepts 1. Time Series Analysis BASIC CONCEPTS AND FORMULA The term Time Series means a set of observations concurring any activity against different

More information

Simple Linear Regression, Scatterplots, and Bivariate Correlation

Simple Linear Regression, Scatterplots, and Bivariate Correlation 1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

More information

0.1 Multiple Regression Models

0.1 Multiple Regression Models 0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different

More information