1 Lesson 15 Linear Regression

2 Lesson 15 Outline
Review correlation analysis
Dependent and independent variables
Least squares regression line
Calculating the slope
Calculating the intercept
Residuals and residual plots
Identifying a significant relationship: t-test of the slope
$R^2$: coefficient of determination
Using the regression line for prediction of Y from X
Relationship between the correlation coefficient and linear regression

3 Linear Regression and Correlation
Both linear regression and correlation analysis can be used to explore the linear relationship between two continuous (quantitative) random variables.
Correlation analysis is used when the interest is in identifying whether a relationship exists and quantifying its strength.
Regression analysis is used to identify a relationship AND to predict the value of one variable given a value of the other variable(s).

4 Review: Correlation Analysis
1. Plot the data using a scatter plot to get a visual idea of the relationship.
2. Calculate the correlation coefficient. Use Pearson's correlation coefficient if both variables are continuous. Use the Spearman rank correlation coefficient if both variables are ordinal, or if one is ordinal and the other continuous.

5 Review: Scatter Plots and Association
Plot the 2 variables in a scatter plot (Excel).
The pattern of the dots in the plot indicates the statistical relationship between the variables (its strength and direction).
Positive relationship: the pattern goes from lower left to upper right.
Negative relationship: the pattern goes from upper left to lower right.
The more the dots cluster around a straight line with a positive or negative direction, the stronger the linear relationship.

6 Review: Correlation Coefficient
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$
The statistic r is called the correlation coefficient.
r estimates the population correlation coefficient $\rho$ (the Greek letter rho).
The correlation coefficient provides a measure of the linear association between two variables.
r is always between -1 and 1.
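As a rough check on the formula, the following Python sketch computes r term by term. The eight (weight, plasma volume) pairs are an assumption: the lesson's data table did not survive transcription, and these values were chosen because they reproduce the figures the slides report (slope 0.0436, $r^2$ = 0.576). The function name `pearson_r` is illustrative.

```python
import math

# ASSUMED data: the original table was lost; these pairs reproduce the
# slope (0.0436) and r^2 (0.576) quoted in the lesson.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # liters


def pearson_r(x, y):
    """Pearson correlation coefficient, computed directly from the formula."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(weight, plasma)
print(f"r = {r:.3f}")
```

Under these assumed values, r should come out near 0.76, a fairly strong positive linear association.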

7 Review: Correlation Coefficient in Excel
Use the CORREL function to find the correlation coefficient.
If data for one variable are in cells A1:A12 and data for the other variable are in cells B1:B12, =CORREL(A1:A12,B1:B12) will return the Pearson correlation coefficient.
Correlation coefficients closer to -1 or 1 indicate a stronger linear relationship.
Correlation coefficients close to 0 indicate a weak linear relationship. However, there could still be a nonlinear relationship when the correlation coefficient is close to 0.

8 Simple Linear Regression
Like correlation analysis, linear regression analysis is a technique used to explore the relationship between two continuous random variables that have a linear relationship.
Regression analysis allows us to investigate the change in one variable that corresponds to a given change in the other variable.
If only ONE variable is used to predict the value of the other variable, the analysis is called simple linear regression.
When two or more variables are used to predict the value of the other variable, the analysis is called multiple linear regression (not covered in this course).

9 Linear Regression: Background
"Regression" is from a Latin root meaning "going back".
Linear regression as a statistical method was first described by Sir Francis Galton in his paper "Regression Towards Mediocrity in Hereditary Stature", published in The Journal of the Anthropological Institute, 1886.
Galton described the relationship between mid-parent height (the average of the two parents' heights) and the height of their offspring.
Parents with a taller mid-parent height had children with heights closer to the average height.
Parents with a shorter mid-parent height also had children with heights closer to the average height.
Galton called this phenomenon "regression towards mediocrity".

10 Sir Francis Galton: Regression
When mid-parents are taller than mediocrity, their children tend to be shorter than they, and when mid-parents are shorter than mediocrity, their children tend to be taller than they.

11 Variables in Simple Linear Regression Analysis
Dependent or response variable: the variable to be predicted from, or explained by, the other variable.
The response variable is typically labeled Y.
Y is a continuous variable in simple linear regression.
Independent or explanatory variable: the variable used to predict the dependent variable.
This variable is typically labeled X.
X can also be called the predictor variable or the regressor variable.
For simple linear regression, X is a continuous variable.
For multiple linear regression, X can be continuous or categorical.

12 Identifying Independent and Dependent Variables
In regression analysis, it's important to correctly identify the dependent (Y) and independent (X) variables.
The study description should tell you which is the dependent variable and which is the independent variable.
If the study description states that the goal is to predict variable 1 from variable 2, then variable 1 is the dependent variable (Y) and variable 2 is the independent variable (X).
Typically, if the variables are separated in time, the variable collected first is the independent variable (X) and the variable collected later is the dependent variable (Y).
In Galton's regression analysis, the mid-parent height was the independent variable and the offspring height was the dependent variable.

13 Linear Regression Overview
Look at a scatter plot of the data. Plot Y on the y-axis and X on the x-axis. Does the relationship appear to be linear?
Estimate the regression line equation. Find the slope and intercept of the regression line. Check residuals.
Is the relationship statistically significant? Use a t-test of the slope to determine significance.
How well does the estimated regression line equation fit the data? Calculate $R^2$, the coefficient of determination.
Use the estimated regression line equation to predict values of the dependent variable (Y) for specified values of the independent variable (X).

14 Simple Linear Regression: An Example
Is there a linear relationship between body weight and plasma volume that can be used to predict plasma volume from weight?
Plasma volume is the dependent variable Y, since we are interested in predicting it from body weight, the independent variable X.
[Table: Subject, Body Weight (kg), Plasma Volume (L)]

15 Scatter Plot of the Data
There is a positive relationship between plasma volume and body weight.
With this small number of data points it is difficult to see the linear relationship, but there is a general linear trend to the data.
We want to identify a line that has a good fit to the data. This isn't a deterministic relationship, so the points won't fall perfectly on the line.
[Figure: scatter plot of Plasma Volume (liters) against Body Weight (kg)]

16 Estimate the Regression Line Equation
A few of the many possible lines through the data points are illustrated in the plot. How do we decide which line best fits the data?
[Figure: Plasma Volume (liters) against Body Weight (kg), with several candidate lines]

17 Least Squares Regression Line
The linear regression line is the line that gets closest to all of the points. This is called the least squares regression line.
The least squares regression line minimizes the sum of the squares of the vertical distances between each observed data point ($y_i$) and the line:
$$\text{minimize } \sum_{i=1}^{n} \left(y_i - \text{point on line}_i\right)^2$$
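The minimizing property can be illustrated numerically. The sketch below compares the sum of squared vertical distances for the least squares line with a handful of nearby candidate lines. The data values are an assumption (the lesson's table was lost in transcription; these pairs reproduce the slope and $r^2$ the slides report), and the perturbation sizes are arbitrary.

```python
# ASSUMED data: original table lost; values reproduce the lesson's reported results.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # liters


def sse(a, b):
    """Sum of squared vertical distances between each observed y and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(weight, plasma))


# Least squares estimates from the standard closed-form formulas.
x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)
b_ls = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
        / sum((x - x_bar) ** 2 for x in weight))
a_ls = y_bar - b_ls * x_bar

best = sse(a_ls, b_ls)

# Nearby candidate lines (arbitrary small changes to intercept and slope).
others = [sse(a_ls + da, b_ls + db)
          for da in (-0.05, 0.0, 0.05)
          for db in (-0.001, 0.0, 0.001)
          if (da, db) != (0.0, 0.0)]

print(f"least squares SSE: {best:.4f}; best of the other lines: {min(others):.4f}")
```

Every perturbed line gives a larger sum of squares, which is what "least squares" means.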

18 Vertical distances between each observed Y ($y_i$) and the line are shown in red. The sum of these squared distances is minimized by the least squares regression line.
[Figure: Plasma Volume (L) against Body Weight (kg), with vertical distances to the line]

19 Least Squares Regression Line Equation
The equation for a line requires a slope and an intercept.
In regression analysis, we estimate the population regression line with the least squares regression line calculated from sample data: the sample regression line.
The notation for the slope and intercept in the population regression line uses Greek letters: $\beta_0$ for the intercept and $\beta_1$ for the slope.
The notation for the slope and intercept in the sample regression line uses Roman letters: a for the intercept and b for the slope.

20 The Population Regression Line
$$Y = \beta_0 + \beta_1 X + \varepsilon$$
$\beta_0$ is the y-intercept of the line.
$\beta_1$ is the slope of the regression line.
$\varepsilon$ is the error term: the difference between the observed Y and the regression line.

21 Sample Regression Line
$\beta_0$ and $\beta_1$ are population parameters.
Sample estimates for the regression parameters are: a is the estimate for $\beta_0$, and b is the estimate for $\beta_1$.
$\hat{Y} = a + bX$ is the regression line calculated from sample data.
$\hat{Y}$ is the predicted value of Y.

22 Least Squares Regression Line
a and b are estimates of the regression coefficients $\beta_0$ and $\beta_1$.
The regression coefficients are estimated from the sample data by the least squares method.
The intercept a is the estimated expected value of Y when X = 0.
The slope b is the estimated expected change in Y corresponding to a 1-unit increase in X.
$\hat{Y}$ is the expected (or predicted) value of y, the point on the line. It is called the fitted value of y.
The following slide illustrates the least squares regression line.

23 The Equation of a Regression Line
[Figure: the line $\hat{Y} = a + bX$, showing the intercept a at X = 0 and the slope b as the change in Y for a one-unit change in X]

24 Interpretation of Predicted Values of Y
The predicted value of y is the expected y-value.
Since not all observed data points are exactly on the regression line, there is a range of possible y-values (a distribution) for each x-value.
In regression analysis the distribution of y-values for each x-value is assumed to be a normal distribution.
The predicted values of y represent the mean values of the distributions of y for each specified value of x.
The following slide illustrates this for 3 values of X: notice that the mean of each distribution is on the regression line (the predicted value of y) and that the distributions of y-values are normal distributions.

25 Simple Linear Regression Model Illustrated
[Figure: normal distributions of Y at three values of X, each centered on the regression line]

26 Assumptions for Regression Analysis
There are several assumptions that should be met for regression analysis:
For each value of X, the Y variable is assumed to have a normal distribution. The mean of the normal distribution is the predicted value, $\hat{Y}$.
The normal distributions are assumed to have equal variance across the entire range of X values. This assumption is called homogeneity of variance, or homoscedasticity.
The predicted values of Y fall on the regression line representing the linear relationship between X and Y.
The Y observations are assumed to be independent.
The observations are from a random sample.

27 Interpretation of the Slope of the Regression Line
The slope b is the expected change in Y corresponding to a 1-unit increase in X.
b = 0: there is no linear association between Y and X.
b > 0: there is a positive linear association between Y and X (as X increases, the expected value of Y increases).
b < 0: there is a negative linear association between Y and X (as X increases, the expected value of Y decreases).
The following slide illustrates a positive, a negative and a zero slope.

28 Illustration of Negative, Positive and Zero Slopes
[Figure: three lines on x-y axes, labeled b > 0, b = 0 and b < 0]

29 Calculating the Slope of the Regression Line
The formula to calculate the slope of the least squares regression line is:
$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Notice that the numerator is the same as the numerator in the formula for the correlation coefficient.

30 b for plasma (Y) and body weight (X) example
[Table: columns X, Y, $(X - \bar{X})$, $(Y - \bar{Y})$, $(X - \bar{X})(Y - \bar{Y})$ and $(X - \bar{X})^2$, with Mean and SUM rows]

31 Slope of Regression Line
From the previous slide, b is the sum of $(X - \bar{X})(Y - \bar{Y})$ divided by the sum of $(X - \bar{X})^2$, which gives b = 0.0436 (rounded to 4 decimal places).
Interpretation of the slope: for every one-unit increase in X, the expected increase in Y is 0.0436 units.
Plasma volume increases 0.0436 liters for every one kg increase in body weight.
The slope is positive, indicating that as body weight (X) increases, plasma volume (Y) also increases.
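The worked table can be rebuilt in a few lines of Python. This is a sketch under one loud assumption: the slide's data values were lost in transcription, so the eight pairs below are assumed values, chosen because they round to the slope of 0.0436 quoted above.

```python
# ASSUMED data: original table lost; values reproduce the reported slope of 0.0436.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)

# One row per subject: x-deviation, y-deviation, their product, squared x-deviation.
rows = [(x - x_bar, y - y_bar, (x - x_bar) * (y - y_bar), (x - x_bar) ** 2)
        for x, y in zip(weight, plasma)]

s_xy = sum(row[2] for row in rows)  # numerator of the slope formula
s_xx = sum(row[3] for row in rows)  # denominator of the slope formula
b = s_xy / s_xx

print(f"sum of (X - Xbar)(Y - Ybar) = {s_xy:.4f}")
print(f"sum of (X - Xbar)^2         = {s_xx:.4f}")
print(f"b = {b:.4f}")
```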

32 Calculating the Intercept of the Regression Line
The intercept a of the regression line is the estimated value of Y when X = 0.
a is calculated from the average value of Y, the average value of X and the estimated slope b by the following formula:
$$a = \bar{Y} - b\bar{X}$$

33 Intercept for Plasma Volume Example
The intercept is calculated as $a = \bar{Y} - b\bar{X}$ from the sample means and the slope b.
The intercept is the estimated expected value of Y when X = 0. Intercepts do not always have realistic interpretations.
In this example, a is the plasma volume (in liters) predicted at a body weight of 0 kg, which is not a possibility.
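A minimal sketch of the intercept calculation follows. Both the data and the resulting intercept are assumptions: the slide's table and its intercept value did not survive transcription, so the number printed here comes only from the assumed data (which does reproduce the lesson's slope of 0.0436).

```python
# ASSUMED data: original table lost; the intercept below is derived from these
# assumed values, not confirmed by the slide.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)

# Slope from the least squares formula.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
     / sum((x - x_bar) ** 2 for x in weight))

# Intercept: a = Ybar - b * Xbar.
a = y_bar - b * x_bar

print(f"Xbar = {x_bar:.3f} kg, Ybar = {y_bar:.4f} L")
print(f"a = {a:.4f} L")
```

A consequence of the formula is that the fitted line always passes through the point $(\bar{X}, \bar{Y})$.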

34 Regression Line Equation
Once the slope and the intercept have been calculated, the regression equation can be constructed:
$$\hat{Y} = a + bX$$
With the example's estimates (b = 0.0436), this is the equation that will be used to predict plasma volume (L) from body weight (kg).
The regression equation calculated from sample data is an estimate of the true population regression equation.

35 Regression Line Equation and Interpretation of the Slope
A 1-unit increase in X for this data is 1 kg, so the interpretation of the slope in this regression line equation is: for each 1 kg increase in body weight, the expected increase in plasma volume is 0.0436 liters.
What is the expected plasma volume increase for a 10 kg increase in body weight?
For a 10 kg increase in body weight, the expected increase in plasma volume = 10 × 0.0436 = 0.436 liters.

36 What if the slope of the regression line is negative?
If the slope of the regression line is negative, we would expect a decrease in Y with each unit increase in X.
The slope is a measure of the expected change in Y for each 1-unit increase in X.
If the slope is positive, the expected change in Y is an increase.
If the slope is negative, the expected change in Y is a decrease.

37 Regression Coefficients in Excel
Excel has functions to calculate the slope and the intercept of the least squares regression line:
The SLOPE function returns b, the slope: =SLOPE(y-range, x-range)
The INTERCEPT function returns a, the intercept: =INTERCEPT(y-range, x-range)
For both of these functions, enter the y-range of the data first and then the x-range of the data.

38 Plasma Volume Example in Excel
The Lesson 15 Excel Module works through the plasma volume / body weight regression example:
Create a scatterplot of the data.
Work through the calculations of the slope and intercept of the regression line.
Use the Excel SLOPE and INTERCEPT functions.
After you've worked through the calculations once, use the Excel functions to find the slope and intercept for future regression problems.

39 Residuals
The residual is the difference between the observed value (Y) and the expected value ($\hat{Y}$) of Y:
$$\text{Residual} = Y - \hat{Y}$$
Y is the observed Y for any X.
$\hat{Y}$ is the Y-value on the regression line for that value of X.
The residual is the component of Y that is not predicted by X.
The least squares regression line is the line that minimizes the sum of the squared residuals.

40 Residuals for Plasma Volume Example
[Table: columns X, Y, Y' (the predicted value $\hat{Y}$) and Residual]
Calculate $\hat{Y}$, the expected value of Y, using the regression line equation. The residual is the difference between Y and $\hat{Y}$.
Which point is closest to the regression line? (74, 3.37) has the smallest residual.
Which point is furthest from the regression line? (70.5, 3.49) has the largest residual.
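The residual table can be sketched in Python as follows. The data are an assumption (the slide's table was lost in transcription); the assumed values do include the two points named above, and they reproduce the lesson's slope of 0.0436.

```python
# ASSUMED data: original table lost; values include the two points the slide names.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(plasma) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
     / sum((x - x_bar) ** 2 for x in weight))
a = y_bar - b * x_bar

fitted = [a + b * x for x in weight]                 # Y-hat for each X
residuals = [y - f for y, f in zip(plasma, fitted)]  # Y - Y-hat

for x, y, f, e in zip(weight, plasma, fitted, residuals):
    print(f"{x:5.1f}  {y:4.2f}  {f:6.4f}  {e:+7.4f}")

# Points with the smallest and largest residual in absolute value.
i_min = min(range(n), key=lambda i: abs(residuals[i]))
i_max = max(range(n), key=lambda i: abs(residuals[i]))
closest = (weight[i_min], plasma[i_min])
furthest = (weight[i_max], plasma[i_max])
print("closest to the line:", closest)
print("furthest from the line:", furthest)
```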

41 Regression Line and Residuals
[Figure: Plasma Volume (L) against Body Weight (kg) with the regression line; the points with the largest and smallest residuals are labeled]

42 Analysis of Residuals
A residual plot is a plot of the residual values on the Y-axis against the x-values on the X-axis.
If there is a linear relationship between X and Y, the residual plot will be a random scatter of points around 0 with no evident pattern. The correlation between X and the least squares residuals is always 0, so look for curvature or changing spread rather than a slope.
A nonlinear relationship between X and Y will be more evident in the residual plot of the (X, residual) data than in the scatterplot of the original (X, Y) data.
The Excel Regression analysis tool has an option for selecting the residual plot. The residual plot for the plasma volume example is on the following slide.
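Two algebraic properties of least squares residuals sit behind the residual plot: the residuals sum to zero, and they have zero linear correlation with X. The sketch below checks both numerically on assumed data (the lesson's table was lost in transcription; these values reproduce its reported slope). Because these properties hold for any least squares fit, the residual plot is read for curvature or uneven spread, not for a slope.

```python
# ASSUMED data: original table lost; values reproduce the reported slope of 0.0436.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(plasma) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
     / sum((x - x_bar) ** 2 for x in weight))
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(weight, plasma)]

# Property 1: residuals sum to zero (up to floating-point rounding).
residual_sum = sum(residuals)

# Property 2: the cross-product with the x-deviations is zero,
# so the correlation between X and the residuals is zero.
cross_product = sum((x - x_bar) * e for x, e in zip(weight, residuals))

print(f"sum of residuals         = {residual_sum:.2e}")
print(f"sum of (x - xbar) * e    = {cross_product:.2e}")
```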

43 Residual Plot for Plasma Volume / Body Weight Data
[Figure: residual plot, Residuals against body weight (kg)]
No evidence of nonlinearity. The points are equally distributed around the value 0 with no evident positive or negative slope.

44 (X, Y) Scatterplot for a Nonlinear (or Curvilinear) Relationship
When there is a curvilinear relationship between X and Y, the least squares regression line does not represent the relationship.
[Figure: scatterplot of a curvilinear relationship]

45 Residual Plot for Curvilinear Relationship
[Figure: residual plot, Residuals against X]
This is the residual plot for the relationship on the previous slide. It illustrates that the relationship is not linear: the residual plot points aren't evenly distributed around the value 0.

46 Regression Analysis for Curvilinear Relationships
Simple linear regression analysis should not be used when X and Y have a curvilinear relationship.
There are several strategies for dealing with a curvilinear relationship between X and Y:
One option is to try a logarithmic transformation of the data to see if this improves the linear relationship.
Another option is to use piecewise regression: fit one regression line to the increasing portion of the curve and a second regression line to the decreasing portion of the curve.
A third option is to include $X^2$ or $X^3$ in the regression equation (covered in PubH 6415 with multiple regression models).
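The first strategy can be sketched with made-up data. In the example below, y is generated to grow exponentially with x (an entirely synthetic, hypothetical data set, not from the lesson), so a straight line fits y poorly but fits log(y) exactly. The helper `fit_line` is an illustrative name.

```python
import math


def fit_line(x, y):
    """Return (intercept, slope) of the least squares line for y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


# SYNTHETIC data: y = 2 * exp(0.3 * x), a curvilinear relationship.
x = [1, 2, 3, 4, 5, 6]
y = [2.0 * math.exp(0.3 * xi) for xi in x]

# Regress log(y) on x instead of y on x.
log_y = [math.log(yi) for yi in y]
a_log, b_log = fit_line(x, log_y)

print(f"log(y) = {a_log:.4f} + {b_log:.4f} * x")
```

On the log scale the relationship is exactly linear here, with slope 0.3 and intercept log(2). Real data will not be this clean, so the residual plot should still be checked after transforming.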

47 Linear Regression Procedure
Look at a scatter plot of the data. Plot Y on the y-axis and X on the x-axis. Add the trend line to the plot.
Estimate the regression line equation. Find the slope and intercept of the regression line. Check residuals.
Is the relationship between X and Y statistically significant? Use a t-test of the slope to determine significance.
How well does the estimated regression line equation fit the data? Calculate $R^2$, the coefficient of determination.
Use the estimated regression line equation to predict values of the dependent variable (Y) for specified values of the independent variable (X).

48 Is the relationship between X and Y significant?
If the slope of the regression line = 0, this indicates there is no linear relationship between the variables. If there is no linear relationship, the variables are considered to be independent.
A t-test of the slope estimate can be done to test for independence between the X and Y variables.
Null hypothesis: slope = 0. The null hypothesis states that the variables are independent.
Alternative hypothesis: slope ≠ 0. The alternative hypothesis is that there is a significant relationship between the variables.
If the result of the t-test of the slope is significant (p-value < $\alpha$, the significance level), reject the null hypothesis and conclude that there is a statistically significant relationship between the two variables.

49 Notation for Population Slope and Intercept
As in any hypothesis test, the null and alternative hypotheses are stated about the population parameters, not about the estimates.
The population parameters for the slope and intercept of the regression line are written with the Greek letters $\beta_1$ and $\beta_0$.
$\beta_1$ is the population parameter for the slope.
$\beta_0$ is the population parameter for the intercept.
The statistic for the t-test of the slope will use the estimated value of the slope (b) that is calculated from the data.

50 t-test of the Slope
1. State the hypotheses. Null hypothesis: $\beta_1 = 0$. Alternative hypothesis: $\beta_1 \neq 0$.
2. A t-test will be used to test the hypothesis.
3. Significance level: $\alpha$ = 0.05.
4. The degrees of freedom for a t-test of the slope are n - 2, where n = sample size. The critical values of the t-test are found using TINV(0.05, df). For the plasma volume example, n = 8, so the critical values = TINV(0.05, 6) = 2.447 and -2.447.

51 t-test of the Slope
5. Calculate the test statistic: the slope estimate divided by the standard error of the slope.
$$t = \frac{b_1}{SE(b_1)}$$
Here $b_1$ is the estimated slope (b). The formula for the SE of the slope is complicated, so we will use the Excel Data Analysis Tool to do this t-test. The Data Analysis Tool provides the t-statistic and the p-value of the t-test of the slope.
6. State the conclusion. If the test statistic is more extreme than the critical values, reject the null hypothesis and conclude that there is a significant relationship between the variables.
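For readers who want to see what the tool computes, here is a hedged Python sketch of the test statistic. It uses the standard textbook formula $SE(b) = \sqrt{\frac{SSE/(n-2)}{\sum (x - \bar{x})^2}}$, which the slides do not show, and assumed data (the lesson's table was lost in transcription; these values reproduce its reported slope). The critical value 2.447 is the two-sided 5% value for 6 degrees of freedom quoted via TINV(0.05, 6).

```python
import math

# ASSUMED data: original table lost; values reproduce the reported slope of 0.0436.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(plasma) / n
s_xx = sum((x - x_bar) ** 2 for x in weight)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma)) / s_xx
a = y_bar - b * x_bar

# Residual sum of squares and the standard error of the slope.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(weight, plasma))
se_b = math.sqrt((sse / (n - 2)) / s_xx)

t_stat = b / se_b
T_CRIT = 2.447  # two-sided critical value, alpha = 0.05, df = n - 2 = 6
reject_null = abs(t_stat) > T_CRIT

print(f"SE(b) = {se_b:.4f}, t = {t_stat:.3f}, reject H0: {reject_null}")
```

With these assumed values the statistic exceeds the critical value, which agrees with the lesson's conclusion that the slope is significant.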

52 T-test of the Slope in Excel
Data Analysis Tool output for the weight / plasma volume example. The t-statistic and p-value for the t-test of the slope are highlighted on the slide.
SUMMARY OUTPUT
Regression Statistics: Multiple R, R Square, Adjusted R Square, Standard Error, Observations (8)
ANOVA: rows Regression, Residual, Total; columns df, SS, MS, F, Significance F
Coefficient table: rows Intercept, Body weight; columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%
The p-value for the t-test of the slope is below 0.05, so reject the null hypothesis and conclude that there is a significant relationship between weight and plasma volume.

53 Regression Analysis in Excel
In Excel Module 15, use the Data Analysis Tool to obtain the regression analysis results:
Select Regression under the Data Analysis Tool.
Enter the plasma volume data for the Y-range and the weight data for the X-range.
Check Labels if you highlight the column headers.
Also check Residuals and Residual Plot.
Identify the t-statistic and the p-value for the t-test of the slope.
Also identify the slope and the intercept on the output table. These are under the Coefficients column.
95% confidence intervals for the coefficients are also provided if the Confidence Level box is checked.

54 T-test of the Intercept
The Data Analysis Tool also provides results of a t-test of the intercept.
The null hypothesis of this test is that the intercept = 0: $\beta_0 = 0$.
The alternative hypothesis of this test is that the intercept ≠ 0: $\beta_0 \neq 0$.
Usually there is not much interest in the t-test of the intercept, because testing whether the intercept = 0 does not provide information about the relationship between the two variables.
From the regression table, you can see that the null hypothesis that the intercept = 0 is not rejected, because its p-value is larger than the significance level. This result does not affect the significant result of the t-test of the slope.

55 Linear Regression Procedure
Look at a scatter plot of the data. Plot Y on the y-axis and X on the x-axis. Add the trend line to the plot.
Estimate the regression line equation. Find the slope and intercept of the regression line.
Is the relationship statistically significant? Use a t-test of the slope to determine significance.
How well does the estimated regression line equation fit the data? Calculate $R^2$, the coefficient of determination.
Use the estimated regression line equation to predict values of the dependent variable (Y) for specified values of the independent variable (X).

56 How well does the regression line equation fit the data?
$r^2$ is the notation for the coefficient of determination.
$r^2$ is equal to the correlation coefficient (r) squared. It can range from 0 to 1.
Interpretation of $r^2$: $r^2$ is the proportion of variation in the dependent variable (Y) that is explained by the estimated least squares regression equation.
Larger values of $r^2$ indicate a better fit of the regression line to the data, which indicates a more useful predictive model.

57 Calculating $r^2$
In Excel, you can use the CORREL function to find the correlation coefficient and square this value to find the coefficient of determination.
For the plasma / weight data, $r^2$ = 0.576, so r is about 0.759.
Or you can find $r^2$ on the Data Analysis Tool output, under Regression Statistics (Multiple R, R Square, Adjusted R Square, Standard Error, Observations = 8):
Multiple R = the correlation coefficient
R Square = the coefficient of determination ($r^2$)

58 Interpretation of $r^2$
For the plasma volume example, $r^2$ = 0.576.
Interpretation: 57.6% of the variation in plasma volume is explained by the regression line equation with weight as the explanatory variable.
Since only 57.6% of the variation in plasma volume is explained by body weight, there are likely other variables that explain some of the variation in plasma volume.
Multiple regression analysis uses more than one explanatory variable to predict the dependent variable. This is covered in PubH 6415.
If there are other explanatory variables significantly related to plasma volume in a multiple regression model, $r^2$ will increase.
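The "proportion of variation explained" reading can be checked directly: $r^2$ equals 1 minus the residual sum of squares divided by the total sum of squares of Y. The sketch below does this on assumed data (the lesson's table was lost in transcription; these values reproduce its reported $r^2$ of 0.576).

```python
# ASSUMED data: original table lost; values reproduce the reported r^2 of 0.576.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(plasma) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
     / sum((x - x_bar) ** 2 for x in weight))
a = y_bar - b * x_bar

# Total variation in Y, and the part left unexplained by the regression line.
sst = sum((y - y_bar) ** 2 for y in plasma)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(weight, plasma))

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.3f} ({100 * r_squared:.1f}% of variation explained)")
```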

59 Linear Regression Procedure
Look at a scatter plot of the data (we have done this). Plot Y on the y-axis and X on the x-axis. Does the relationship appear to be linear?
Estimate the regression line equation (we have done this). Find the slope and intercept of the regression line.
Is the relationship statistically significant? Use a t-test of the slope to determine significance.
How well does the estimated regression line equation fit the data? (We have done this.) Calculate $R^2$, the coefficient of determination.
Use the estimated regression line equation to predict values of the dependent variable (Y) for specified values of the independent variable (X).

60 Using the Regression Line Equation for Prediction
The regression line equation for the weight and plasma volume data is $\hat{Y} = a + 0.0436X$, where a is the intercept calculated earlier.
For a given value of weight (X), the plasma volume (Y) can be predicted.
What is the expected plasma volume for an individual who weighs 60 kg?
Insert 60 in the equation in place of X and solve for $\hat{Y}$: $\hat{Y} = a + 0.0436 \times 60 \approx 2.7$ liters.

61 Predicting plasma volume for weight = 60 kg
[Figure: Plasma Volume (liters) against Body Weight (kg) with the regression line]
The predicted plasma volume for weight = 60 kg is the point on the regression line corresponding to x = 60. This point is 2.7 liters.
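The prediction can be reproduced with a short sketch. The data, and therefore the intercept, are assumptions (the lesson's table and intercept value were lost in transcription); the assumed values do reproduce the slope of 0.0436 and the 2.7-liter prediction reported above.

```python
# ASSUMED data: original table lost; values reproduce the reported slope and prediction.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
     / sum((x - x_bar) ** 2 for x in weight))
a = y_bar - b * x_bar

# Predicted plasma volume at a body weight of 60 kg: Y-hat = a + b * 60.
predicted_at_60 = a + b * 60
print(f"predicted plasma volume at 60 kg: {predicted_at_60:.2f} liters")
```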

62 Appropriate Applications of the Regression Line Equation
Predictions using regression line equations are only valid within the range of x-values in the collected data.
For the example data, predictions should stay within the range of body weights observed in the sample. It would not be appropriate to use this regression line equation to predict plasma volume for an individual weighing 100 kg or an individual weighing 25 kg.
There may be a different relationship between weight and plasma volume beyond the values of the collected data, so the relationship identified by the regression line equation should not be extrapolated much beyond the range of the X values.
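One way to make this caution concrete in code is a prediction function that refuses to extrapolate. The sketch below is a deliberately strict design choice, not something the lesson prescribes: it rejects any weight outside the observed range, whereas the slide only warns against going much beyond it. The data are assumed (the original table was lost in transcription), and `predict_plasma` is a hypothetical helper name.

```python
# ASSUMED data: original table lost; values reproduce the lesson's reported results.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
     / sum((x - x_bar) ** 2 for x in weight))
a = y_bar - b * x_bar


def predict_plasma(w):
    """Predict plasma volume (liters) for weight w (kg), within the observed range only."""
    lo, hi = min(weight), max(weight)
    if not lo <= w <= hi:
        raise ValueError(f"weight {w} kg is outside the observed range {lo}-{hi} kg")
    return a + b * w


print(f"{predict_plasma(60):.2f} liters at 60 kg")
for w in (25, 100):
    try:
        predict_plasma(w)
    except ValueError as err:
        print("refused:", err)
```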

63 More Cautions about Application of Regression Line Predictions
Predictions using regression line equations are only valid for the population represented by the sample data.
For example, if data for a regression analysis are collected for girls age 10-18, predictions using the equation are not necessarily valid for boys, adults or girls younger than 10.
You can't assume that the relationship between two variables in one population is the same in other populations.
Read the study description carefully to identify the population that was sampled. Regression analysis inferences are valid for this population but not necessarily for other populations.

64 What if there isn't a significant relationship between the variables?
If regression analysis reveals that there is NOT a significant relationship between the two variables (that is, if the p-value for the t-test of the slope > $\alpha$, the significance level), the regression equation is not useful for predicting values of the dependent variable from the independent variable.
If the t-test of the slope is NOT significant, end the regression analysis procedure and do not use the regression line equation for prediction.
Prediction using the regression line equation is only useful if the null hypothesis of independence between the variables is rejected.

65 Relationship between Correlation and Regression
The correlation coefficient and the slope of the regression line are related. For a given set of data:
They will both have the same sign, indicating the direction of the relationship (positive or negative).
There is a mathematical relationship between the slope and the correlation coefficient: the slope of the regression line is equal to the correlation coefficient times the standard deviation of y divided by the standard deviation of x:
$$b_1 = r\,\frac{s_y}{s_x}$$
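The identity can be verified numerically. The sketch below computes the slope two ways, from the least squares formula and from $r \cdot s_y / s_x$, using the sample standard deviations. The data are assumed (the lesson's table was lost in transcription; these values reproduce its reported slope and $r^2$).

```python
import math
from statistics import stdev

# ASSUMED data: original table lost; values reproduce the lesson's reported results.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
s_xx = sum((x - x_bar) ** 2 for x in weight)
s_yy = sum((y - y_bar) ** 2 for y in plasma)

b = s_xy / s_xx                    # slope from the least squares formula
r = s_xy / math.sqrt(s_xx * s_yy)  # Pearson correlation coefficient

# Slope recovered from r and the sample standard deviations.
b_from_r = r * stdev(plasma) / stdev(weight)

print(f"b = {b:.6f}, r * s_y / s_x = {b_from_r:.6f}")
```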

66 Hypothesis Test of the Population Correlation Coefficient
We can set up a hypothesis test of independence for the population correlation:
Null hypothesis: $\rho = 0$ (no significant linear association between the variables).
Alternative hypothesis: $\rho \neq 0$ (significant linear association between the variables).
The test statistic is a t-statistic with n - 2 df:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
After finding the t-statistic, you can use Excel to find the p-value = TDIST(t, n-2, 2).

67 T-test of the Correlation Coefficient
For a given set of sample data, the t-test for $\rho$ and the t-test for the slope, $\beta_1$, will have the same t-statistic and p-value.
For the plasma volume data, the t-statistic for the test of the population correlation coefficient is the same as the t-statistic for the slope of the regression line. You can work through the equation in Excel to confirm this, using TDIST(t, 6, 2) for the p-value.
The same conclusion is reached from either hypothesis test: there is a significant relationship between the two variables.
The p-value < 0.05, so the null hypothesis of independence is rejected at significance level 0.05.
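The equality of the two t-statistics can be confirmed with a short sketch. It computes t from the correlation coefficient, using $t = r\sqrt{n-2}/\sqrt{1-r^2}$, and t from the slope and its standard error (the standard textbook formula, which the slides do not show). The data are assumed, since the lesson's table was lost in transcription.

```python
import math

# ASSUMED data: original table lost; values reproduce the lesson's reported results.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]  # X, kg
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]  # Y, liters

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(plasma) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, plasma))
s_xx = sum((x - x_bar) ** 2 for x in weight)
s_yy = sum((y - y_bar) ** 2 for y in plasma)

# t-statistic from the correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# t-statistic from the slope and its standard error.
b = s_xy / s_xx
a = y_bar - b * x_bar
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(weight, plasma))
se_b = math.sqrt((sse / (n - 2)) / s_xx)
t_from_slope = b / se_b

print(f"t from r = {t_from_r:.4f}, t from slope = {t_from_slope:.4f}")
```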

68 Linear Regression and Correlation: Which to Use?
Both linear regression and correlation analysis can be used to explore the linear relationship between two continuous (quantitative) random variables.
Use correlation analysis when the interest is primarily in identifying whether a relationship exists. Use the t-test of the correlation coefficient to determine if the relationship is significant.
Use regression analysis to identify a relationship AND to predict the value of one variable given a value of the other variable. Use the t-test of the slope to determine if the relationship is significant.
Regression analysis is most useful when there is an identified interest in predicting one variable from the other(s). If prediction doesn't make sense, use correlation analysis.

69 Readings and Assignments
Reading: Chapter 8 pgs ..., 194, ...
Complete the Lesson 15 Practice Exercises.
Lesson 15 Excel Modules:
Excel Module 15: Plasma Volume works through the example in this lesson.
Excel Module 15: BMI works through the example in the text (pages ..., 206, ..., 209).
Complete OPTIONAL Homework 11: use the Data Analysis Tool for the linear regression problems.


Chapter 27. Inferences for Regression. Copyright 2012, 2008, 2005 Pearson Education, Inc. Chapter 27 Inferences for Regression Copyright 2012, 2008, 2005 Pearson Education, Inc. An Example: Body Fat and Waist Size Our chapter example revolves around the relationship between % body fat and waist

More information

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION In this lab you will learn how to use Excel to display the relationship between two quantitative variables, measure the strength and direction of the

More information

Chapters 2 and 10: Least Squares Regression

Chapters 2 and 10: Least Squares Regression Chapters 2 and 0: Least Squares Regression Learning goals for this chapter: Describe the form, direction, and strength of a scatterplot. Use SPSS output to find the following: least-squares regression

More information

R egression is perhaps the most widely used data

R egression is perhaps the most widely used data Using Statistical Data to Make Decisions Module 4: Introduction to Regression Tom Ilvento, University of Delaware, College of Agriculture and Natural Resources, Food and Resource Economics R egression

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

More information

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Understand linear regression with a single predictor Understand how we assess the fit of a regression model Total Sum of Squares

More information

Chapter 13 Correlational Statistics: Pearson s r. All of the statistical tools presented to this point have been designed to compare two or

Chapter 13 Correlational Statistics: Pearson s r. All of the statistical tools presented to this point have been designed to compare two or Chapter 13 Correlational Statistics: Pearson s r All of the statistical tools presented to this point have been designed to compare two or more groups in an effort to determine if differences exist between

More information

The scatterplot indicates a positive linear relationship between waist size and body fat percentage:

The scatterplot indicates a positive linear relationship between waist size and body fat percentage: STAT E-150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the

More information

Chapter 10 Correlation and Regression

Chapter 10 Correlation and Regression Weight Chapter 10 Correlation and Regression Section 10.1 Correlation 1. Introduction Independent variable (x) - also called an explanatory variable or a predictor variable, which is used to predict the

More information

CHAPTER 2 AND 10: Least Squares Regression

CHAPTER 2 AND 10: Least Squares Regression CHAPTER 2 AND 0: Least Squares Regression In chapter 2 and 0 we will be looking at the relationship between two quantitative variables measured on the same individual. General Procedure:. Make a scatterplot

More information

Chapter 8. Linear Regression. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Chapter 8. Linear Regression. Copyright 2012, 2008, 2005 Pearson Education, Inc. Chapter 8 Linear Regression Copyright 2012, 2008, 2005 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression Simple linear regression models the relationship between an independent variable (x) and a dependent variable (y) using an equation that expresses y as a linear function of x,

More information

Section 10-3 REGRESSION EQUATION 2/3/2017 REGRESSION EQUATION AND REGRESSION LINE. Regression

Section 10-3 REGRESSION EQUATION 2/3/2017 REGRESSION EQUATION AND REGRESSION LINE. Regression Section 10-3 Regression REGRESSION EQUATION The regression equation expresses a relationship between (called the independent variable, predictor variable, or explanatory variable) and (called the dependent

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p. Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

More information

Soc Final Exam Correlation and Regression (Practice)

Soc Final Exam Correlation and Regression (Practice) Soc 102 - Final Exam Correlation and Regression (Practice) Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A Pearson correlation of r = 0.85 indicates

More information

Simple Regression and Correlation

Simple Regression and Correlation Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas

More information

Spring 2014 Math 263 Deb Hughes Hallett. Class 23: Regression and Hypothesis Testing (Text: Sections 10.1)

Spring 2014 Math 263 Deb Hughes Hallett. Class 23: Regression and Hypothesis Testing (Text: Sections 10.1) Class 23: Regression and Hypothesis Testing (Text: Sections 10.1) Review of Regression (from Chapter 2) We fit a line to data to make projections. The Tower of Pisa is leaning more each year. The measurements

More information

Chapter 4 Describing the Relation between Two variables. How can we explore the association between two quantitative variables?

Chapter 4 Describing the Relation between Two variables. How can we explore the association between two quantitative variables? Chapter 4 Describing the Relation between Two variables How can we explore the association between two quantitative variables? An association exists between two variables if a particular value of one variable

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Chapter 3: Describing Relationships

Chapter 3: Describing Relationships Chapter 3: Describing Relationships The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Chapter 3 2 Describing Relationships 3.1 Scatterplots and Correlation 3.2 Learning Targets After

More information

Simple Linear Regression Models

Simple Linear Regression Models Simple Linear Regression Models 14-1 Overview 1. Definition of a Good Model 2. Estimation of Model parameters 3. Allocation of Variation 4. Standard deviation of Errors 5. Confidence Intervals for Regression

More information

The statistical procedures used depend upon the kind of variables (categorical or quantitative):

The statistical procedures used depend upon the kind of variables (categorical or quantitative): Math 143 Correlation and Regression 1 Review: We are looking at methods to investigate two or more variables at once. bivariate: multivariate: The statistical procedures used depend upon the kind of variables

More information

In last class, we learned statistical inference for population mean. Meaning. The population mean. The sample mean

In last class, we learned statistical inference for population mean. Meaning. The population mean. The sample mean RECALL: In last class, we learned statistical inference for population mean. Problem. Notation Populati on Notation X σ Meaning The population mean The sample mean The population standard deviation s The

More information

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ 1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Simple Linear Regression

Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression Statistical model for linear regression Estimating

More information

USING MINITAB: A SHORT GUIDE VIA EXAMPLES

USING MINITAB: A SHORT GUIDE VIA EXAMPLES USING MINITAB: A SHORT GUIDE VIA EXAMPLES The goal of this document is to provide you, the student in Math 112, with a guide to some of the tools of the statistical software package MINITAB as they directly

More information

Chapter 13 In this chapter, you learn: Multiple Regression Model with k Independent Variables:

Chapter 13 In this chapter, you learn: Multiple Regression Model with k Independent Variables: Chapter 4 4- Business Statistics: A First Course Fifth Edition Chapter 3 Multiple Regression Business Statistics: A First Course, 5e 9 Prentice-Hall, Inc. Chap 3- Learning Objectives In this chapter, you

More information

Introduction to Linear Regression and Correlation Analysis

Introduction to Linear Regression and Correlation Analysis Introduction to Linear Regression and Correlation Analsis Goals After this, ou should be able to: Calculate and interpret the simple correlation between two variables Determine whether the correlation

More information

Unit 6 - Simple linear regression

Unit 6 - Simple linear regression Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage

More information

Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

More information

Correlations & Linear Regressions. Block 3

Correlations & Linear Regressions. Block 3 Correlations & Linear Regressions Block 3 Question You can ask other questions besides Are two conditions different? What relationship or association exists between two or more variables? Positively related:

More information

Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1. Relationships between two numerical variables Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE EXAMINATION MODULE 4

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE EXAMINATION MODULE 4 THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE EXAMINATION NEW MODULAR SCHEME introduced from the examinations in 007 MODULE 4 SPECIMEN PAPER A AND SOLUTIONS The time for the examination is 1½ hours.

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

AMS7: WEEK 8. CLASS 1. Correlation Monday May 18th, 2015

AMS7: WEEK 8. CLASS 1. Correlation Monday May 18th, 2015 AMS7: WEEK 8. CLASS 1 Correlation Monday May 18th, 2015 Type of Data and objectives of the analysis Paired sample data (Bivariate data) Determine whether there is an association between two variables This

More information

Relationship of two variables

Relationship of two variables Relationship of two variables A correlation exists between two variables when the values of one are somehow associated with the values of the other in some way. Scatter Plot (or Scatter Diagram) A plot

More information

SECTION 5 REGRESSION AND CORRELATION

SECTION 5 REGRESSION AND CORRELATION SECTION 5 REGRESSION AND CORRELATION 5.1 INTRODUCTION In this section we are concerned with relationships between variables. For example: How do the sales of a product depend on the price charged? How

More information

Association between Two or More Variables

Association between Two or More Variables Chapter 4 Association between Two or More Variables Very frequently social scientists want to determine the strength of the association of two or more variables. For example, one might want to know if

More information

Homework 11. Part 1. Name: Score: / null

Homework 11. Part 1. Name: Score: / null Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is

More information

Mind on Statistics. Chapter Which expression is a regression equation for a simple linear relationship in a population?

Mind on Statistics. Chapter Which expression is a regression equation for a simple linear relationship in a population? Mind on Statistics Chapter 14 Sections 14.1-14.3 1. Which expression is a regression equation for a simple linear relationship in a population? A. ŷ = b 0 + b 1 x B. ŷ = 44 + 0.60 x C. ( Y) x D. E 0 1

More information

Chapter 2. Looking at Data: Relationships. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 2. Looking at Data: Relationships. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 2 Looking at Data: Relationships Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 2 Looking at Data: Relationships 2.1 Scatterplots

More information

The F distribution

The F distribution 10-5.1 The F distribution 11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis is a statistical technique

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

EXPERIMENT 6: HERITABILITY AND REGRESSION

EXPERIMENT 6: HERITABILITY AND REGRESSION BIO 184 Laboratory Manual Page 74 EXPERIMENT 6: HERITABILITY AND REGRESSION DAY ONE: INTRODUCTION TO HERITABILITY AND REGRESSION OBJECTIVES: Today you will be learning about some of the basic ideas and

More information

7. Tests of association and Linear Regression

7. Tests of association and Linear Regression 7. Tests of association and Linear Regression In this chapter we consider 1. Tests of Association for 2 qualitative variables. 2. Measures of the strength of linear association between 2 quantitative variables.

More information

Inference for Regression

Inference for Regression Inference for Regression IPS Chapter 10 10.1: Simple Linear Regression 10.: More Detail about Simple Linear Regression 01 W.H. Freeman and Company Inference for Regression 10.1 Simple Linear Regression

More information

Statistics Advanced Placement G/T Essential Curriculum

Statistics Advanced Placement G/T Essential Curriculum Statistics Advanced Placement G/T Essential Curriculum UNIT I: Exploring Data employing graphical and numerical techniques to study patterns and departures from patterns. The student will interpret and

More information

Correlation and Regression 07/10/09

Correlation and Regression 07/10/09 Correlation and Regression Eleisa Heron 07/10/09 Introduction Correlation and regression for quantitative variables - Correlation: assessing the association between quantitative variables - Simple linear

More information

Correlation and regression

Correlation and regression Applied Biostatistics Correlation and regression Martin Bland Professor of Health Statistics University of York http://www-users.york.ac.uk/~mb55/msc/ Correlation Example: Muscle strength and height in

More information

Ismor Fischer, 5/29/ POPULATION Random Variables X, Y: numerical Definition: Population Linear Correlation Coefficient of X, Y

Ismor Fischer, 5/29/ POPULATION Random Variables X, Y: numerical Definition: Population Linear Correlation Coefficient of X, Y Ismor Fischer, 5/29/2012 7.2-1 7.2 Linear Correlation and Regression POPULATION Random Variables X, Y: numerical Definition: Population Linear Correlation Coefficient of X, Y ρ = σ XY σ X σ Y FACT: 1 ρ

More information

Lecture 18 Linear Regression

Lecture 18 Linear Regression Lecture 18 Statistics Unit Andrew Nunekpeku / Charles Jackson Fall 2011 Outline 1 1 Situation - used to model quantitative dependent variable using linear function of quantitative predictor(s). Situation

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

Lab 11: Simple Linear Regression

Lab 11: Simple Linear Regression Lab 11: Simple Linear Regression Objective: In this lab, you will examine relationships between two quantitative variables using a graphical tool called a scatterplot. You will interpret scatterplots in

More information

Guide to the Summary Statistics Output in Excel

Guide to the Summary Statistics Output in Excel How to read the Descriptive Statistics results in Excel PIZZA BAKERY SHOES GIFTS PETS Mean 83.00 92.09 72.30 87.00 51.63 Standard Error 9.47 11.73 9.92 11.35 6.77 Median 80.00 87.00 70.00 97.50 49.00 Mode

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Statistiek II. John Nerbonne. March 24, 2010. Information Science, Groningen Slides improved a lot by Harmut Fitz, Groningen!

Statistiek II. John Nerbonne. March 24, 2010. Information Science, Groningen Slides improved a lot by Harmut Fitz, Groningen! Information Science, Groningen j.nerbonne@rug.nl Slides improved a lot by Harmut Fitz, Groningen! March 24, 2010 Correlation and regression We often wish to compare two different variables Examples: compare

More information

Section 2.3: Regression

Section 2.3: Regression Section 2.3: Regression Idea: If there is a known linear relationship between two variables x and y (given by the correlation, r), we want to predict what y might be if we know x. The stronger the correlation,

More information

A correlation exists between two quantitative variables when there is a statistical relationship between them.

A correlation exists between two quantitative variables when there is a statistical relationship between them. Number of Faculty Last Name First Name Class Time Chapter 12-1 CHAPTER 12: LINEAR REGRESSION AND CORRELATION These notes are intended to supplement the lecture and textbook, not to replace the chapter

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Regressions. Economics 250 Regression notes

Regressions. Economics 250 Regression notes Economics 250 Regression notes Regressions Often the questions we care about in economics isn t the mean of variables, but rather the relationship between variables. For example: How much does an extra

More information

Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression 1 Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1

More information

Ordinary Least Squares Regression Vartanian: SW 540

Ordinary Least Squares Regression Vartanian: SW 540 Ordinary Least Squares Regression Vartanian: SW 540 When to Use Ordinary Least Squares Regression Analysis A. Variable types 1. When you have an interval/ratio scale dependent variable. 2. When your independent

More information

Introduction to Regression. Dr. Tom Pierce Radford University

Introduction to Regression. Dr. Tom Pierce Radford University Introduction to Regression Dr. Tom Pierce Radford University In the chapter on correlational techniques we focused on the Pearson R as a tool for learning about the relationship between two variables.

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 08/11/2016 Structure This Week What is a linear model? How

More information

Chapter 11: Two Variable Regression Analysis

Chapter 11: Two Variable Regression Analysis Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

More information

Lesson 21: Multiple Linear Regression Analysis

Lesson 21: Multiple Linear Regression Analysis Lesson 21: Multiple Linear Regression Analysis Motivation and Objective: We ve spent a lot of time discussing simple linear regression, but simple linear regression is, well, simple in the sense that there

More information

Correlation. Scatterplots of Paired Data:

Correlation. Scatterplots of Paired Data: 10.2 - Correlation Objectives: 1. Determine if there is a linear correlation 2. Conduct a hypothesis test to determine correlation 3. Identify correlation errors Overview: In Chapter 9 we presented methods

More information

CHAPTER 11: Multiple Regression

CHAPTER 11: Multiple Regression CHAPTER : Multiple Regression With multiple linear regression, more than one explanatory variable is used to explain or predict a single response variable. Introducing several explanatory variables leads

More information

Simple Linear Regression

Simple Linear Regression 1 Excel Manual Simple Linear Regression Chapter 13 This chapter discusses statistics involving the linear regression. Excel has numerous features that work well for comparing quantitative variables both

More information

Introduction to Statistical Quality Control, 6 th Edition by Douglas C. Montgomery. Copyright (c) 2009 John Wiley & Sons, Inc.

Introduction to Statistical Quality Control, 6 th Edition by Douglas C. Montgomery. Copyright (c) 2009 John Wiley & Sons, Inc. 1 2 Learning Objectives Chapter 4 3 4.1 Statistics and Sampling Distributions Statistical inference is concerned with drawing conclusions about populations (or processes) based on sample data from that

More information

STA Module 5 Regression and Correlation

STA Module 5 Regression and Correlation STA 2023 Module 5 Regression and Correlation Learning Objectives Upon completing this module, you should be able to: 1. Define and apply the concepts related to linear equations with one independent variable.

More information

Correlation and Regression

Correlation and Regression Correlation and Regression Cal State Northridge Ψ47 Ainsworth Major Points - Correlation Questions answered by correlation Scatterplots An example The correlation coefficient Other kinds of correlations

More information

Chapter 4 Describing the Relation between Two Variables

Chapter 4 Describing the Relation between Two Variables Chapter 4 Describing the Relation between Two Variables 4.1 Scatter Diagrams and Correlation The response variable is the variable whose value can be explained by the value of the explanatory or predictor

More information

Soc Final Exam (Practice)

Soc Final Exam (Practice) Class: Date: Soc 102 - Final Exam (Practice) Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The null hypothesis for an ANOVA states that. a. there are

More information

Lecture 10: Chapter 10

Lecture 10: Chapter 10 Lecture 10: Chapter 10 C C Moxley UAB Mathematics 31 October 16 10.1 Pairing Data In Chapter 9, we talked about pairing data in a natural way. In this Chapter, we will essentially be discussing whether

More information

Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

More information

(a) Write a paragraph summarizing the key characteristics of this study. In particular, identify:

(a) Write a paragraph summarizing the key characteristics of this study. In particular, identify: Data Collection: (a) Write a paragraph summarizing the key characteristics of this study. In particular, identify: Observational units and sample size The explanatory variable and response variable (as

More information

0.1 Multiple Regression Models

0.1 Multiple Regression Models 0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different

More information

SELF-TEST: SIMPLE REGRESSION

SELF-TEST: SIMPLE REGRESSION ECO 22000 McRAE SELF-TEST: SIMPLE REGRESSION Note: Those questions indicated with an (N) are unlikely to appear in this form on an in-class examination, but you should be able to describe the procedures

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Statistical Functions in Excel

Statistical Functions in Excel Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

More information