Introduction to Linear Regression and Correlation Analysis

Size: px
Start display at page:

Download "Introduction to Linear Regression and Correlation Analysis"

Transcription

1 Introduction to Linear Regression and Correlation Analsis

2 Goals After this, ou should be able to: Calculate and interpret the simple correlation between two variables Determine whether the correlation is significant Calculate and interpret the simple linear regression equation for a set of data Understand the assumptions behind regression analsis Determine whether a regression model is significant

3 Goals After this, ou should be able to: Calculate and interpret confidence intervals for the regression coefficients Recognize regression analsis applications for purposes of prediction and description Recognize some potential problems if regression analsis is used incorrectl Recognize nonlinear relationships between two variables

4 Scatter Plots and Correlation A scatter plot (or scatter diagram) is used to show the relationship between two variables Correlation analsis is used to measure strength of the association (linear relationship) between two variables Onl concerned with strength of the relationship No causal effect is implied

5 Scatter Plot Eamples Linear relationships Curvilinear relationships

6 Scatter Plot Eamples Strong relationships Weak relationships

7 Scatter Plot Eamples No relationship

8 Correlation Coefficient The population correlation coefficient ρ (rho) measures the strength of the association between the variables The sample correlation coefficient r is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations

9 Features of ρ and r Unit free Range between -1 and 1 The closer to -1, the stronger the negative linear relationship The closer to 1, the stronger the positive linear relationship The closer to 0, the weaker the linear relationship

10 Eamples of Approimate r Values r = -1 r = -.6 r = 0 r = +.3 r = +1

11 Calculating the Correlation Coefficient Sample correlation coefficient: r = [ ( ( )( ) ) ][ ( ) ] where: or the algebraic equivalent: r = [n( r = Sample correlation coefficient n = Sample size = Value of the independent variable = Value of the dependent variable n ) ( ) ][n( ) ( ) ]

12 Calculation Eample Trunk Diameter Tree Height Σ=713 Σ=14111 Σ=314 Σ=73 Σ=

13 Calculation Eample Tree Height, 70 r = n [n( ) ( ) ][n( ) ( ) ] = 8(314) (73)(31) [8(713) (73) ][8(14111) (31) ] 30 0 = Trunk Diameter, r = relativel strong positive linear association between and

14 Significance Test for Correlation Hpotheses H 0 : ρ = 0 H A : ρ 0 (no correlation) (correlation eists) Test statistic t = r (with n degrees of freedom) 1 r n

15 Eample: Produce Stores Is there evidence of a linear relationship between tree height and trunk diameter at the.05 level of significance? H 0 : ρ = 0 (No correlation) H 1 : ρ 0 (correlation eists) α =.05, df = 8 - = 6 t = r 1 r n = = 4.68

16 Eample: Test Solution r t = 1 r n d.f. = 8- = 6 α/=.05 = = 4.68 α/=.05 Decision: Reject H 0 Conclusion: There is evidence of a linear relationship at the 5% level of significance Reject H 0 Reject H t 0 α/ Do not reject H -t 0 α/

17 Introduction to Regression Analsis Regression analsis is used to: Predict the value of a dependent variable based on the value of at least one independent variable Eplain the impact of changes in an independent variable on the dependent variable Dependent variable: the variable we wish to eplain Independent variable: the variable used to eplain the dependent variable

18 Simple Linear Regression Model Onl one independent variable, Relationship between and is described b a linear function Changes in are assumed to be caused b changes in

19 Tpes of Regression Models Positive Linear Relationship Relationship NOT Linear Negative Linear Relationship No Relationship

20 Population Linear Regression The population regression model: Dependent Variable Population intercept Population Slope Coefficient Independent Variable = β + β ε Random Error term, or residual Linear component Random Error component

21 Linear Regression Assumptions Error values (ε) are statisticall independent Error values are normall distributed for an given value of The probabilit distribution of the errors is normal The probabilit distribution of the errors has constant variance The underling relationship between the variable and the variable is linear

22 Population Linear Regression (continued) Observed Value of for i = β + β ε Predicted Value of for i ε i Random Error for this value Slope = β 1 Intercept = β 0 i

23 Estimated Regression Model The sample regression line provides an estimate of the population regression line Estimated (or predicted) value Estimate of the regression intercept Estimate of the regression slope ŷ = b + b i 0 1 Independent variable The individual random error terms e i have a mean of zero

24 Least Squares Criterion b 0 and b 1 are obtained b finding the values of b 0 and b 1 that minimize the sum of the squared residuals e = ( ŷ) = ( (b 0 + b 1 ))

25 The Least Squares Equation The formulas for b 1 and b 0 are: b 1 ( = ( )( ) ) algebraic equivalent: b = n 1 ( ) n and b0 = b1

26 Interpretation of the Slope and the Intercept b 0 is the estimated average value of when the value of is zero b 1 is the estimated change in the average value of as a result of a oneunit change in

27 Finding the Least Squares Equation The coefficients b 0 and b 1 will usuall be found using computer software Other regression measures will also be computed as part of computer-based regression analsis

28 Simple Linear Regression Eample A real estate agent wishes to eamine the relationship between the selling price of a home and its size (measured in square feet) A random sample of 10 houses is selected Dependent variable () = house price in $1000s Independent variable () = square feet

29 Sample Data for House Price Model House Price in $1000s () Square Feet ()

30 Graphical Presentation House price model: scatter plot and regression line Intercept = House Price ($1000s) Square Feet Slope = house price = (square feet)

31 Interpretation of the Intercept, b 0 house price = (square feet) b 0 is the estimated average value of Y when the value of X is zero (if = 0 is in the range of observed values) Here, no houses had 0 square feet, so b 0 = just indicates that, for houses within the range of sizes observed, $98,48.33 is the portion of the house price not eplained b square feet

32 Interpretation of the Slope Coefficient, b 1 house price = (square feet) b 1 measures the estimated change in the average value of Y as a result of a one-unit change in X Here, b 1 = tells us that the average value of a house increases b.10977($1000) = $109.77, on average, for each additional one square foot of size

33 Least Squares Regression Properties The sum of the residuals from the least squares regression line is 0 ( ) The sum of the squared residuals is a minimum (minimized ( ˆ) ) The simple regression line alwas passes through the mean of the variable and the mean of the variable The least squares coefficients are unbiased estimates of β 0 and β 1 ( ˆ) = 0

34 Eplained and Uneplained Variation Total variation is made up of two parts: SST = SSE + SSR Total sum of Squares Sum of Squares Error Sum of Squares Regression = SST ( ) SSE = ( ŷ) SSR = (ŷ ) where: = Average value of the dependent variable = Observed values of the dependent variable ŷ = Estimated value of for the given value

35 Eplained and Uneplained Variation SST = total sum of squares Measures the variation of the i values around their mean SSE = error sum of squares Variation attributable to factors other than the relationship between and SSR = regression sum of squares Eplained variation attributable to the relationship between and

36 Eplained and Uneplained Variation i SST = ( i - ) SSE = ( i - i ) _ SSR = ( i - ) _ X i

37 Coefficient of Determination, R The coefficient of determination is the portion of the total variation in the dependent variable that is eplained b variation in the independent variable The coefficient of determination is also called R-squared and is denoted as R R = SSR where 0 R 1 SST

38 Coefficient of Determination, R Coefficient of determination SSR R = = SST sum of squares eplained b regression total sum of squares Note: In the single independent variable case, the coefficient of determination is R = r where: R = Coefficient of determination r = Simple correlation coefficient

39 Eamples of Approimate R Values R = 1 R = 1 Perfect linear relationship between and : 100% of the variation in is eplained b variation in R = +1

40 Eamples of Approimate R Values 0 < R < 1 Weaker linear relationship between and : Some but not all of the variation in is eplained b variation in

41 Eamples of Approimate R Values R = 0 No linear relationship between and : R = 0 The value of Y does not depend on. (None of the variation in is eplained b variation in )

42 Calculating R Values There are several equivalent was to calculate R in matri notation.

43 Estimate of Regression Error Variance The residuals from the OLS regression are the estimates of the population errors. It follows that the estimate of the population error variance σ would be the estimated residual variance, s. Where SSE = Sum of squared errors n = Sample size k = number of independent variables in the model

44 Standard Error of Estimate The standard deviation of the variation of observations around the regression line is estimated b s ε = n SSE k 1 Where SSE = Sum of squared errors n = Sample size k = number of independent variables in the model

45 The Standard Deviation of the Regression Slope The standard error of the regression coefficient estimates (b) is estimated b where: s b1 = Estimate of the standard error of the least squares slope s ε = SSE n = Sample standard error of the estimate

46 Comparing Standard Errors Variation of observed values from the regression line Variation in the slope of regression lines from different possible samples small s ε small s b1 large s ε large s b1

47 Inference on the OLS estimates t test for a population slope Is there a linear relationship between and? Null and alternative hpotheses H 0 : β 1 = 0 (no linear relationship) H 1 : β 1 0 (linear relationship does eist) Test statistic t = d.f. b1 β s = b 1 n 1 where: b 1 = Sample regression slope coefficient β 1 = Hpothesized slope s b1 = Estimator of the standard error of the slope

48 Inference about the Slope: t Test House Price in $1000s () Square Feet () Estimated Regression Equation: house price = (sq.ft.) The slope of this model is Does square footage of the house affect its sales price?

49 Inferences about the Slope: t Test Eample H 0 : β 1 = 0 H A : β 1 0 Test Statistic: t = 3.39 From Ecel output: Coefficients Intercept b 1 Standard Error sb 1 t Stat t P-value Square Feet d.f. = 10- = 8 α/=.05 α/=.05 Reject H 0 Reject H t 0 α/ Do not reject H -t 0 α/ Decision: Reject H 0 Conclusion: There is sufficient evidence that square footage affects house price

50 Calculating a Confidence Interval Confidence Interval Estimate of the Slope: b ± t s 1 α/ b 1 d.f. = n - Ecel Printout for House Prices: Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept Square Feet At 95% level of confidence, the confidence interval for the slope is (0.0337, )

51 Regression Analsis for Description Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Intercept Square Feet Since the units of the house price variable is $1000s, we are 95% confident that the average impact on sales price is between $33.70 and $ per square foot of house size This 95% confidence interval does not include 0. Conclusion: There is a significant relationship between house price and square feet at the.05 level of significance

52 Confidence Interval for the Average, Given Confidence interval estimate for the mean of given a particular p Size of interval varies according to distance awa from mean, ŷ 1 (p ) ± t α /sε + n ( )

53 Confidence Interval for an Individual, Given Confidence interval estimate for an Individual value of given a particular p ŷ 1 ( ) ± t s 1+ + p α / ε n ( ) This etra term adds to the interval width to reflect the added uncertaint for an individual case

54 Interval Estimates for Different Values of Prediction Interval for an individual, given p = b 0 + b 1 Confidence Interval for the mean of, given p p

55 Eample: House Prices House Price in $1000s () Square Feet () Estimated Regression Equation: house price = (sq.ft.) Predict the price for a house with 000 square feet

56 Eample: House Prices Predict the price for a house with 000 square feet: house price = (sq.ft.) = (000) = The predicted price for a house with 000 square feet is ($1,000s) = $317,850

57 Confidence Interval for Prediction of Average Confidence Interval Estimate for E() p Find the 95% confidence interval for the average price of,000 square-foot houses Predicted Price Y i = ($1,000s) 1 (p ) tα/ sε + = ± n ( ) ŷ ± 37.1 The confidence interval endpoints are , or from $80, $354,900

58 Confidence Interval for Prediction of Specific Prediction Interval Estimate for p Find the 95% confidence interval for an individual house with,000 square feet Predicted Price Y i = ($1,000s) 1 (p ) tα/ sε 1+ + = ± 10.8 n ( ) ŷ ± The prediction interval endpoints are , or from $15, $40,070

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Module 7 Test Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. You are given information about a straight line. Use two points to graph the equation.

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

The Big Picture. Correlation. Scatter Plots. Data

The Big Picture. Correlation. Scatter Plots. Data The Big Picture Correlation Bret Hanlon and Bret Larget Department of Statistics Universit of Wisconsin Madison December 6, We have just completed a length series of lectures on ANOVA where we considered

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1.

So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1. Joint probabilit is the probabilit that the RVs & Y take values &. like the PDF of the two events, and. We will denote a joint probabilit function as P,Y (,) = P(= Y=) Marginal probabilit of is the probabilit

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II 17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Section 1: Simple Linear Regression

Section 1: Simple Linear Regression Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Simple Methods and Procedures Used in Forecasting

Simple Methods and Procedures Used in Forecasting Simple Methods and Procedures Used in Forecasting The project prepared by : Sven Gingelmaier Michael Richter Under direction of the Maria Jadamus-Hacura What Is Forecasting? Prediction of future events

More information

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480 1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

More information

Solving Quadratic Equations by Graphing. Consider an equation of the form. y ax 2 bx c a 0. In an equation of the form

Solving Quadratic Equations by Graphing. Consider an equation of the form. y ax 2 bx c a 0. In an equation of the form SECTION 11.3 Solving Quadratic Equations b Graphing 11.3 OBJECTIVES 1. Find an ais of smmetr 2. Find a verte 3. Graph a parabola 4. Solve quadratic equations b graphing 5. Solve an application involving

More information

Pearson s Correlation Coefficient

Pearson s Correlation Coefficient Pearson s Correlation Coefficient In this lesson, we will find a quantitative measure to describe the strength of a linear relationship (instead of using the terms strong or weak). A quantitative measure

More information

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

1 Simple Linear Regression I Least Squares Estimation

1 Simple Linear Regression I Least Squares Estimation Simple Linear Regression I Least Squares Estimation Textbook Sections: 8. 8.3 Previously, we have worked with a random variable x that comes from a population that is normally distributed with mean µ and

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

Chapter 9 Descriptive Statistics for Bivariate Data

Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to eplore the distribution of the values of a single variable)

More information

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

More information

Regression and Correlation

Regression and Correlation Regression and Correlation Topics Covered: Dependent and independent variables. Scatter diagram. Correlation coefficient. Linear Regression line. by Dr.I.Namestnikova 1 Introduction Regression analysis

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1. Relationships between two numerical variables Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

CHAPTER TEN. Key Concepts

CHAPTER TEN. Key Concepts CHAPTER TEN Ke Concepts linear regression: slope intercept residual error sum of squares or residual sum of squares sum of squares due to regression mean squares error mean squares regression (coefficient)

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Ch. Correlation and Regression. Correlation Interpret Scatter Plots and Correlations MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4

4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4 4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression

More information

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares Linear Regression Chapter 5 Regression Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). We can then predict the average response for all subjects

More information

Zeros of Polynomial Functions. The Fundamental Theorem of Algebra. The Fundamental Theorem of Algebra. zero in the complex number system.

Zeros of Polynomial Functions. The Fundamental Theorem of Algebra. The Fundamental Theorem of Algebra. zero in the complex number system. _.qd /7/ 9:6 AM Page 69 Section. Zeros of Polnomial Functions 69. Zeros of Polnomial Functions What ou should learn Use the Fundamental Theorem of Algebra to determine the number of zeros of polnomial

More information

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2 Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

Notes on Applied Linear Regression

Notes on Applied Linear Regression Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

More information

Using Excel for Statistical Analysis

Using Excel for Statistical Analysis Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

More information

Correlation key concepts:

Correlation key concepts: CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

Factors affecting online sales

Factors affecting online sales Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

More information

2. Linear regression with multiple regressors

2. Linear regression with multiple regressors 2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

1. a. standard form of a parabola with. 2 b 1 2 horizontal axis of symmetry 2. x 2 y 2 r 2 o. standard form of an ellipse centered

1. a. standard form of a parabola with. 2 b 1 2 horizontal axis of symmetry 2. x 2 y 2 r 2 o. standard form of an ellipse centered Conic Sections. Distance Formula and Circles. More on the Parabola. The Ellipse and Hperbola. Nonlinear Sstems of Equations in Two Variables. Nonlinear Inequalities and Sstems of Inequalities In Chapter,

More information

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

More information

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6

WEB APPENDIX. Calculating Beta Coefficients. b Beta Rise Run Y 7.1 1 8.92 X 10.0 0.0 16.0 10.0 1.6 WEB APPENDIX 8A Calculating Beta Coefficients The CAPM is an ex ante model, which means that all of the variables represent before-thefact, expected values. In particular, the beta coefficient used in

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

More information

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS. SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

table to see that the probability is 0.8413. (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: 60 38 = 1.

table to see that the probability is 0.8413. (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: 60 38 = 1. Review Problems for Exam 3 Math 1040 1 1. Find the probability that a standard normal random variable is less than 2.37. Looking up 2.37 on the normal table, we see that the probability is 0.9911. 2. Find

More information

Causal Forecasting Models

Causal Forecasting Models CTL.SC1x -Supply Chain & Logistics Fundamentals Causal Forecasting Models MIT Center for Transportation & Logistics Causal Models Used when demand is correlated with some known and measurable environmental

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

The Distance Formula and the Circle

The Distance Formula and the Circle 10.2 The Distance Formula and the Circle 10.2 OBJECTIVES 1. Given a center and radius, find the equation of a circle 2. Given an equation for a circle, find the center and radius 3. Given an equation,

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

More information

SIMPLE LINEAR REGRESSION

SIMPLE LINEAR REGRESSION CHAPTER 2 SIMPLE LINEAR REGRESSION 2.1 I NTRO DU CTlO N We start with the simple case of studying the relationship between a response variable Y and a predictor variable XI. Since we have only one predictor

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

More information

Introduction to Linear Regression

Introduction to Linear Regression 14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression

More information

APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING

APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING APPLICATION OF LINEAR REGRESSION MODEL FOR POISSON DISTRIBUTION IN FORECASTING Sulaimon Mutiu O. Department of Statistics & Mathematics Moshood Abiola Polytechnic, Abeokuta, Ogun State, Nigeria. Abstract

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

1.6. Piecewise Functions. LEARN ABOUT the Math. Representing the problem using a graphical model

1.6. Piecewise Functions. LEARN ABOUT the Math. Representing the problem using a graphical model . Piecewise Functions YOU WILL NEED graph paper graphing calculator GOAL Understand, interpret, and graph situations that are described b piecewise functions. LEARN ABOUT the Math A cit parking lot uses

More information

INVESTIGATIONS AND FUNCTIONS 1.1.1 1.1.4. Example 1

INVESTIGATIONS AND FUNCTIONS 1.1.1 1.1.4. Example 1 Chapter 1 INVESTIGATIONS AND FUNCTIONS 1.1.1 1.1.4 This opening section introduces the students to man of the big ideas of Algebra 2, as well as different was of thinking and various problem solving strategies.

More information

ax 2 by 2 cxy dx ey f 0 The Distance Formula The distance d between two points (x 1, y 1 ) and (x 2, y 2 ) is given by d (x 2 x 1 )

ax 2 by 2 cxy dx ey f 0 The Distance Formula The distance d between two points (x 1, y 1 ) and (x 2, y 2 ) is given by d (x 2 x 1 ) SECTION 1. The Circle 1. OBJECTIVES The second conic section we look at is the circle. The circle can be described b using the standard form for a conic section, 1. Identif the graph of an equation as

More information

5.1. A Formula for Slope. Investigation: Points and Slope CONDENSED

5.1. A Formula for Slope. Investigation: Points and Slope CONDENSED CONDENSED L E S S O N 5.1 A Formula for Slope In this lesson ou will learn how to calculate the slope of a line given two points on the line determine whether a point lies on the same line as two given

More information

A Primer on Forecasting Business Performance

A Primer on Forecasting Business Performance A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.

More information

Chapter 6 Quadratic Functions

Chapter 6 Quadratic Functions Chapter 6 Quadratic Functions Determine the characteristics of quadratic functions Sketch Quadratics Solve problems modelled b Quadratics 6.1Quadratic Functions A quadratic function is of the form where

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Interaction between quantitative predictors

Interaction between quantitative predictors Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

More information

Worksheet A5: Slope Intercept Form

Worksheet A5: Slope Intercept Form Name Date Worksheet A5: Slope Intercept Form Find the Slope of each line below 1 3 Y - - - - - - - - - - Graph the lines containing the point below, then find their slopes from counting on the graph!.

More information

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical

More information

GRADES 7, 8, AND 9 BIG IDEAS

GRADES 7, 8, AND 9 BIG IDEAS Table 1: Strand A: BIG IDEAS: MATH: NUMBER Introduce perfect squares, square roots, and all applications Introduce rational numbers (positive and negative) Introduce the meaning of negative exponents for

More information