Section 14 Simple Linear Regression: Introduction to Least Squares Regression


 Jack Washington
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression
There are several different measures of statistical association for understanding the quantitative relationship between two variables. If the researcher is working with numeric measures and supposes a linear relationship between the two variables, the appropriate measure of association is correlation. Additionally, if a particular set of assumptions is met, we can predict one of the two variables (an outcome) from the other variable (a predictor); this is called simple linear regression. Further, a researcher may wish to understand the relationships among more than two variables. This can be done with an extension of simple linear regression called multiple linear regression. Recall that any statistical hypothesis test is a method for quantifying how much evidence is enough to declare a significant outcome in a research study. The hypothesis tested by a correlation, and also by simple linear regression, is whether two variables have a significant linear association with each other.
Slide 2 Linear Regression: Examples
We learned that when we have measures of two continuous variables, we can describe their relationship visually with a scatterplot. In addition, if that relationship appears to be linear, we can measure the strength and direction of the linear association. Finally, if certain assumptions are met, we may be able to predict the value of one measure from the other. For example:
Is higher wine consumption associated with lower rates of heart disease? What is the nature of this relationship? Is the relationship linear?
What is the relationship between the number of people living on farms and the passing of time from 1935 to 1990? In other words, how fast did the number of people living on farms in the US decrease from 1935 to 1990?
What is the relationship between plasma volume in the blood and body weight? Do these two measures have a linear relationship? Can we predict plasma volume in the blood from a person's body weight? How well?
Does the estriol level of a mother have a linear relationship with the birthweight of her baby? Can we predict the birthweight of a baby from the mother's estriol level? If so, can we anticipate a low birthweight baby from estriol levels?
Does the age at which a child first begins talking predict a score of mental ability later in childhood?
Is there a linear relationship between systolic blood pressure and age?
In all of these examples, we are investigating the relationship between two quantitative variables. We may begin this investigation with a scatterplot followed by a correlation analysis. We will now take our investigation further by introducing simple linear regression.
Slide 3 Simple Linear Regression
Simple linear regression (SLR) analysis is used to quantify the linear relationship between two quantitative variables. In this way, it is like correlation, but regression goes further:
It allows us to draw the line that best describes the linear relationship between X and Y.
It allows us to predict the value of the outcome Y for a specified value of X.
It allows us to quantify how much of a change in the value of Y is seen with a specified change in the value of X.
In other studies the goal is to assess the relationships among a set of variables.
Slide 4 Variable (X) and Variable (Y)
We can describe the relationship or association between two quantitative variables using a scatterplot, correlation, or simple linear regression.
Recall that we usually identify one variable as the outcome of interest, what we have mostly been thinking of as a disease variable so far. This is often called the response, or dependent, variable. The other variable is the predictor of interest, what we have mostly been thinking of as an exposure variable so far. This is often called the explanatory, or independent, variable. When each unit (person) has two measures, we usually call one x and one y. If one variable can help predict the value of the other, we call that variable x; it is also called the predictor, explanatory, or independent variable. The other variable, y, is called the outcome, response, or dependent variable. Sometimes we cannot tell which is the predictor and which is the outcome. Simple linear regression requires that we pick one variable as the outcome.
Slide 5 Wine Consumption and Heart Disease
Is higher wine consumption associated with lower rates of heart disease? What is the nature of this relationship? Is the relationship linear?
Source: Moore and McCabe, Introduction to the Practice of Statistics, 4th Edition, W. H. Freeman & Co., New York.
Here are some data on wine consumption and heart disease deaths. Do these data suggest a linear relationship between the two variables?
Slide 6 Wine Consumption and Heart Disease
The data suggest a negative trend. Can we estimate how much lower heart disease rates are for each extra liter per person per year? How would we draw a line through these data to help us with this estimate? What can we say about the precision of this regression line? How much of the variability in heart disease deaths is explained by the regression line? Do you think these data come from a random sample? What assumptions are we making when using linear regression to make predictions? What confounders must we consider? These are all concepts we will investigate with linear regression.
Slide 7 Population Living on Farms
What is the relationship between the number of people living on farms and the passing of time from 1935 to 1990? How fast did the number of people living on farms in the US decrease? Do these data suggest a linear relationship between the two variables?
Slide 8 Population Living on Farms
How fast did the number of people living on farms in the US decrease?
We can see a strong negative trend that appears fairly linear. How might we draw a line through these data? Is there a best way to draw this line?
Slide 9 Plasma Volume and Body Weight
What is the relationship between plasma volume in the blood and body weight? Do these two measures have a linear relationship?
Data table columns: Subject, Body Weight (kg), Plasma Volume (l).
Consider the association between body weight in kilograms and plasma volume in the blood in liters for eight randomly selected people. Do heavier people have more plasma? If so, how much more? Is this relationship linear?
Slide 10 Simple Linear Regression
Scatterplot: X, body weight (kg); Y, plasma volume (liters). Pearson's correlation = 0.759.
When we plot the data, we can see a positive relationship between body weight and plasma volume. The data do not fall perfectly on a line. The calculated correlation is 0.759, which helps us understand the strength of the linear relationship. We may want to draw a line through these data, giving us a mathematical model to estimate plasma volume from weight, but which is the best line? The white line, the green line, or the purple line? The technique of least squares regression will help us pick the line of best fit.
Slide 11 How Do We Choose the Best Line?
The least squares regression line is the line that gets closest to all of the points. How do we measure closeness to more than one point?
$\text{minimize} \sum_{i=1}^{n} (y_i - \text{point\_on\_line}_i)^2$
The line of best fit, the regression line, is the line that gets "closest" to all the data points. "Closeness" is measured as the vertical distance from the line to each data point. Specifically, the regression line is the one that minimizes the sum of all the squared vertical distances; hence estimation of this line is called least squares, and the line is called the least squares regression line.
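As a minimal sketch of the "closeness" criterion, the Python below computes the sum of squared vertical distances for three candidate lines and picks the smallest. The data values, the three candidate lines, and the function name `sum_squared_errors` are made up for illustration; they are not the plasma volume data from the slides.

```python
def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustrative data (assumption: not from the lesson's datasets).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

# Three hypothetical candidate lines, given as (intercept, slope).
candidates = {
    "line A": (0.0, 2.0),
    "line B": (1.0, 1.5),
    "line C": (0.09, 1.97),
}

# Score every candidate and report the one with the smallest total.
scores = {name: sum_squared_errors(xs, ys, a, b)
          for name, (a, b) in candidates.items()}
for name, score in scores.items():
    print(name, round(score, 3))

best = min(scores, key=scores.get)
print("closest of the three:", best)
```

Comparing candidates by hand like this only ranks the lines we happened to try; the formulas for the slope and intercept give the minimizing line directly.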
Slide 12 Simple Linear Regression
Visually, we find the line that minimizes the sum of the squared vertical distances. For this line, the positive deviations (points above the line) and the negative deviations (points below the line) sum to zero. This could be very difficult to achieve by trial and error. We have mathematical formulas that help us determine this exact line.
Slide 13 Equation of a Line
Before we move further with linear regression, let's review the equation of a line. That is, how do we represent a line with a mathematical function? A line is defined by the intercept a (where the line crosses the vertical axis, the value of y when x = 0) and the slope b ("rise over run," how much y changes for each 1-unit change in x). We write this as $y = a + bx$.
Slide 14 Equation of a Line
We can see the line crosses the vertical axis at the value a, when x = 0. We also see that for every one-unit increase in x, y will change by the amount b.
Slide 15 Equation of a Line: Statistical Notation
$b_0$ = intercept, $b_1$ = slope, $\hat{y} = b_0 + b_1 x$
In statistics, the symbol for the intercept is b naught ($b_0$) and the symbol for the slope is b sub one ($b_1$). We then write the line as y hat equals $b_0 + b_1 x$. We use y hat instead of y to distinguish the real data value y from our predicted value y hat for a given value of x.
Slide 16 Equation of a Line: Statistical Notation
Figure: the line $\hat{y} = b_0 + b_1 x$, with the intercept $b_0$ and the slope $b_1$ marked.
Using statistical notation, we have the same picture as before. Here the line crosses the vertical axis at the value b naught, when x = 0. We also see that for every one-unit increase in x, y hat will change by the amount b sub one.
Slide 17 Estimating Intercept and Slope
$b_0 = \bar{y} - b_1 \bar{x}$
$b_1 = r \, \frac{s_y}{s_x}$
$\hat{y} = b_0 + b_1 x$
The least squares line minimizes the sum of squared vertical distances. This translates into: b naught equals y bar minus the slope times x bar. The slope is the correlation times the ratio of the standard deviation of the observed y values to the standard deviation of the observed x values. In this way, we see that the slope and the correlation are related to one another. The correlation depends on both the slope and the precision. The equations are obtained using mathematics beyond this course. It is enough to understand that these are the equations that determine the least squares regression line, y hat = b naught plus b sub one times x.
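The two formulas can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions: the data are made up (they are not the lesson's data), and the helper names `pearson_r` and `least_squares_line` are hypothetical.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's correlation between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((len(xs) - 1) * stdev(xs) * stdev(ys))

def least_squares_line(xs, ys):
    """Return (b0, b1) using b1 = r * s_y / s_x and b0 = ybar - b1 * xbar."""
    b1 = pearson_r(xs, ys) * stdev(ys) / stdev(xs)  # slope is computed first
    b0 = mean(ys) - b1 * mean(xs)                   # intercept needs the slope
    return b0, b1

# Made-up illustrative data (assumption: not from the lesson's datasets).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = least_squares_line(xs, ys)
print(f"yhat = {b0:.3f} + {b1:.3f} x")
```

In practice, statistical software reports the same two estimates directly; the sketch only makes the order of the calculation visible.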
Slide 18 Slope and Correlation
Figure panels: $b_1 > 0$, $b_1 = 0$, $b_1 < 0$.
Notice that if the slope is positive, then the correlation is positive. If the slope is zero, then the correlation is zero. If the slope is negative, then the correlation is negative.
Slide 19 Simple Linear Regression
Scatterplot: X, body weight (kg); Y, plasma volume (liters). Pearson's correlation = 0.759.
The data points are represented as the dots in our scatterplot, but they do not fall exactly on the line. How do we compute (and write) the least squares line for these data? Once we have the line, for any x value within the range of values in our dataset, y hat is the point that falls exactly on the least squares line; it is not the data value for y. Thus every x value can be plugged into this equation to calculate a predicted y value, which we denote y hat.
Slide 20 Estimating Intercept and Slope
$b_1 = r \, \frac{s_y}{s_x}$, then $b_0 = \bar{y} - b_1 \bar{x}$ with $\bar{x} = 66.875$, giving $\hat{y} = b_0 + b_1 x$.
Using the equations for the slope and intercept of the least squares regression line, we obtain estimates of the intercept and the slope. We must calculate the slope first because the equation for the intercept uses the estimate of the slope. Generally, we do not do these calculations by hand. We use software to compute these values.
Slide 21 Plasma Volume and Weight
Using R, we plot the least squares regression line. The slope tells us how much plasma volume, in liters, increases on average for every one-kilogram increase in body weight. The intercept is the estimated plasma volume for a person who weighs zero kilograms. This estimate does not make biological sense. The intercept for this model is therefore used only to help us determine the line, not to make a prediction at x = 0. The only meaningful estimates are within the range of our x values, that is, weights from about 55 to 75 kilograms.
Slide 22 Plasma Volume and Weight
Measurement of plasma volume is very time consuming. Body weight is easy to measure: use the equation and body weight to estimate plasma volume.
We may want to estimate the plasma volume of a person outside this study based on the person's weight. For example, what plasma volume, in liters, would you expect on average for a 60-kilogram man? We put 60 kilograms in for x and calculate the estimated value to be 2.7 liters. That is, y hat equals the intercept plus the slope times 60. Be very careful to make estimates only within the range of the data that was used to estimate the regression line. Also, be aware that the measurement unit is meaningful. We would not want to insert values in pounds when the regression line is based on kilograms.
Slide 23 RSQUARE
The square of the correlation ($r^2$ = RSQUARE) is the fraction of the variation in the values of y that is explained by the least squares regression of y on x.
$r^2 = \frac{\text{variance of predicted values } \hat{y}}{\text{variance of observed values of } y} = \frac{SSM}{SST}$
Recall Pearson's correlation: it measures the strength of the linear relationship between two quantitative variables. There is another measure called the coefficient of determination. Its value is Pearson's correlation squared. For this reason, it is often denoted RSQUARE. With least squares regression, the coefficient of determination is typically used to understand how much of the total variation is explained by the regression of y on x. In fact, RSQUARE = SSM/SST. This is the sum of squares of the model divided by the total sum of squares. Those values come from the ANOVA table in the linear regression output from the software. We will discuss the ANOVA table at length in a later lesson.
Slide 24 Plasma Volume and Weight
$r^2 = (0.759)^2 = 0.576$
Recall that the correlation between plasma volume and weight is 0.759. If we square this value, we have the coefficient of determination. The value is 0.576. This means 57.6% of the variation in plasma volume is explained by the least squares regression line of plasma volume on body weight. When RSQUARE is close to 1, the regression line (the y hat values) represents the original data (the y values) well. When RSQUARE is close to 0, the regression line does not represent the original data well.
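The arithmetic above, and the SSM/SST definition, can be checked with a short Python sketch. The value 0.759 comes from the slides; the small dataset and the helper names `fit_line` and `r_squared` are made-up assumptions for illustration only.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least squares intercept and slope for y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def r_squared(xs, ys):
    """Fraction of the variation in y explained by the line: SSM / SST."""
    b0, b1 = fit_line(xs, ys)
    y_bar = mean(ys)
    ssm = sum((b0 + b1 * x - y_bar) ** 2 for x in xs)  # model sum of squares
    sst = sum((y - y_bar) ** 2 for y in ys)            # total sum of squares
    return ssm / sst

# The slide's arithmetic: the squared correlation for plasma volume and weight.
r = 0.759
print(round(r ** 2, 3))

# Made-up illustrative data (assumption: not from the lesson's datasets).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
print(round(r_squared(xs, ys), 4))
```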
Slide 25 Simple Linear Regression: Residuals
When we draw the least squares regression line, the line of best fit, the line does not pass directly through all the data points. That is, the y hat values differ from the actual y values in the data. We call these vertical distances residuals.
Slide 26 Residuals
Model: $\hat{y} = b_0 + b_1 x$
$\varepsilon_i = y_i - \hat{y}_i$
$\varepsilon_i$ = difference between the observed and predicted value of the response for each value of x, called the residual.
For each data point, y minus y hat is the residual for that point. This value is often denoted epsilon sub i. We can calculate the residual at any x in our dataset by taking the observed y value minus the predicted value, y hat, from the model. If the residual is positive, the data value is above the line. If the residual is negative, the data value is below the line. We will use residuals and residual plots in our next lesson to investigate how well the linear model fits the observed data.
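A small Python sketch of the residual calculation follows, assuming made-up data rather than the lesson's datasets; `fit_line` is a hypothetical helper. It also illustrates the earlier remark that, for the least squares line, the positive and negative deviations sum to zero.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least squares intercept and slope for y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Made-up illustrative data (assumption: not from the lesson's datasets).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_line(xs, ys)

# Residual = observed y minus predicted yhat, one per data point.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    side = "above" if e > 0 else "below" if e < 0 else "on"
    print(f"x = {x}: residual = {e:+.3f} (point is {side} the line)")

# Up to floating-point rounding, the residuals of the fitted line sum to zero.
print("sum of residuals:", round(sum(residuals), 10))
```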
Slide 27 Estriol and Infant Birthweight
Let's do another example. Obstetricians sometimes order tests for estriol levels from 24-hour urine specimens taken from pregnant women who are near term. The level of estriol (mg/24 hours) has been found to be positively related to the birthweight (grams/100) of the infant. Thus, the test can provide indirect evidence of an abnormally small fetus. [Bernard Rosner, Fundamentals of Biostatistics, page 425]
Slide 28 Estriol and Infant Birthweight
Here is the scatterplot of birthweight and estriol for 31 women and babies. We can see that there is a positive relationship between estriol level and birthweight. The relationship is not perfect, but linear regression may still help with predictions. Pearson's correlation, r, is reported with the scatterplot. Notice that birthweight is in g/100. We will want to know this unit later for our calculations.
Slide 29 Estriol and Infant Birthweight
$\hat{y} = 21.52 + 0.608x$
The values of the slope and intercept can be calculated using software, or by using the equations given in earlier slides. The prediction line shown on the scatterplot is y hat = 21.52 + 0.608x. This means that for every one-unit increase in estriol level, the birthweight of the infant is on average 0.608 g/100 higher, about 60 grams.
Slide 30 Estriol and Infant Birthweight
Using estriol level to predict infant birthweight when the estriol level is 10 mg:
$\hat{y} = 21.52 + 0.608(10) = 27.6$ grams/100
Suppose we want to estimate the birthweight of a baby whose mother has an estriol level of 10 mg. Before we begin, we verify that 10 mg is in the range of the original data. We can do this by looking at the scatterplot of the data. We can then put 10 mg in the least squares regression equation for x and calculate an estimated weight of 27.6 g/100. This is 2,760 grams.
Slide 31 Estriol and Infant Birthweight
Using estriol level to predict infant birthweight when the estriol level is 30 mg:
Suppose we want to estimate the birthweight of a baby whose mother has an estriol level of 30 mg. Before we begin, we check whether 30 mg is in the range of the original data. We can do this by looking at the scatterplot of the data. We see that 30 mg is NOT in the range of the x data for our study. We should not use the regression line to estimate infant birthweight!
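One way to make the range check routine is to build it into the prediction step. The Python below is a sketch, not the course's method. The slope 0.608 and the prediction 27.6 at 10 mg come from the slides, and the intercept is backed out from those two numbers. The range limits `X_MIN` and `X_MAX` are placeholder assumptions; the real limits must be read from the data or the scatterplot.

```python
def predict_within_range(x, intercept, slope, x_min, x_max):
    """Predict yhat = intercept + slope * x, refusing to extrapolate."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the observed range [{x_min}, {x_max}]"
        )
    return intercept + slope * x

B1 = 0.608              # slope from the slides (g/100 per mg of estriol)
B0 = 27.6 - B1 * 10     # intercept implied by the slides' prediction at 10 mg

# Placeholder range of observed estriol levels (assumption, not from the slides).
X_MIN, X_MAX = 7.0, 27.0

print(round(predict_within_range(10, B0, B1, X_MIN, X_MAX), 1))

try:
    predict_within_range(30, B0, B1, X_MIN, X_MAX)
except ValueError as err:
    print("no prediction:", err)
```

Raising an error rather than returning a number keeps an out-of-range estimate from slipping silently into later calculations.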
Slide 32 Estriol and Infant Birthweight
Now let's go in the reverse direction. Low birthweight may be defined as infant birthweight less than 2500 grams. For what estriol level is the predicted infant birthweight equal to 2500 grams? First we must convert to the correct units: 2500 grams = 25 grams/100.
$25 = 21.52 + 0.608x$
$3.48 = 0.608x$
$x = 3.48 / 0.608 = 5.72$
If you set 25 = 21.52 + 0.608x and then solve for x, you will find the estriol level that predicts a low birthweight baby. The value of x is 5.72 mg.
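The same reverse calculation can be written as a short Python sketch. The slope 0.608, the prediction 27.6 at 10 mg, and the 2500-gram cutoff come from the slides; the intercept is derived from the first two, and the function name `x_for_prediction` is hypothetical.

```python
def x_for_prediction(target_y, intercept, slope):
    """Solve target_y = intercept + slope * x for x."""
    return (target_y - intercept) / slope

B1 = 0.608              # slope from the slides (g/100 per mg of estriol)
B0 = 27.6 - B1 * 10     # intercept implied by the slides' prediction at 10 mg

# Convert the cutoff to the model's units: 2500 grams = 25 grams/100.
cutoff_grams = 2500
cutoff_model_units = cutoff_grams / 100

estriol_at_cutoff = x_for_prediction(cutoff_model_units, B0, B1)
print(round(estriol_at_cutoff, 2))
```

As with forward prediction, the answer is only meaningful if it falls inside the range of estriol levels observed in the study.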
Slide 33 Assumptions
Linear regression requires that we make some assumptions. Conveniently, these assumptions follow the acronym LINE:
L = linear relationship between y and x.
I = independence between values of y. (The value of one y does not affect the value of another y.)
N = normality of error around each value of y.
E = equality of variance around y for each value of x.
Our next lesson will explore techniques to evaluate each of these assumptions.
Slide 34 Cautions
In addition to evaluating linear regression assumptions, we must take caution with the interpretation of our results.
Predicted values should only be computed for X values that fall within the range of X values in the original data.
Just like a correlation, a regression line only summarizes the linear relationship between X and Y. If the relationship is truly nonlinear, then using the regression line can be misleading.
Seeing a relationship (an association) between X and Y does not imply causation, that is, that changes in X will cause changes in Y.
Slide 35 Cautions
In the regression context, a lurking variable is a third variable that may influence the relationship between X and Y.
Outliers and skewed data can impact the regression line, just as they can impact the correlation. Either X or Y or both could have outliers or skewness.
If including a particular data point changes the regression line compared to when it is not included, the data point is called influential.
Does that seem like many "cautions"? It is: as we learn methods that are more complicated, there will often be more limits on their use and interpretation.
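To see what "influential" means in practice, the sketch below fits the least squares slope with and without one extreme point. All values are made up for illustration, and the helper name `slope` is hypothetical; this is one simple way to check influence, not a formal diagnostic.

```python
from statistics import mean

def slope(xs, ys):
    """Least squares slope for y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up illustrative data (assumption: not from the lesson's datasets).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

# The same data plus one hypothetical point far from the rest in both x and y.
xs_with = xs + [10.0]
ys_with = ys + [5.0]

slope_without = slope(xs, ys)
slope_with = slope(xs_with, ys_with)

print("slope without the extra point:", round(slope_without, 3))
print("slope with the extra point:   ", round(slope_with, 3))
```

A large change in the slope after adding or removing a single point suggests that the point is influential and deserves a closer look.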
More informationChapter 11: Two Variable Regression Analysis
Department of Mathematics Izmir University of Economics Week 1415 20142015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions
More informationMind on Statistics. Chapter 3
Mind on Statistics Chapter 3 Section 3.1 1. Which one of the following is not appropriate for studying the relationship between two quantitative variables? A. Scatterplot B. Bar chart C. Correlation D.
More informationStatistiek II. John Nerbonne. March 24, 2010. Information Science, Groningen Slides improved a lot by Harmut Fitz, Groningen!
Information Science, Groningen j.nerbonne@rug.nl Slides improved a lot by Harmut Fitz, Groningen! March 24, 2010 Correlation and regression We often wish to compare two different variables Examples: compare
More informationPractice 3 SPSS. Partially based on Notes from the University of Reading:
Practice 3 SPSS Partially based on Notes from the University of Reading: http://www.reading.ac.uk Simple Linear Regression A simple linear regression model is fitted when you want to investigate whether
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationAlgebra I: Lesson 54 (5074) SAS Curriculum Pathways
TwoVariable Quantitative Data: Lesson Summary with Examples Bivariate data involves two quantitative variables and deals with relationships between those variables. By plotting bivariate data as ordered
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationIntroduction to Regression. Dr. Tom Pierce Radford University
Introduction to Regression Dr. Tom Pierce Radford University In the chapter on correlational techniques we focused on the Pearson R as a tool for learning about the relationship between two variables.
More informationLecture 18 Linear Regression
Lecture 18 Statistics Unit Andrew Nunekpeku / Charles Jackson Fall 2011 Outline 1 1 Situation  used to model quantitative dependent variable using linear function of quantitative predictor(s). Situation
More informationSPSS Guide: Regression Analysis
SPSS Guide: Regression Analysis I put this together to give you a stepbystep guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 15 scale to 0100 scores When you look at your report, you will notice that the scores are reported on a 0100 scale, even though respondents
More informationDescribing Relationships between Two Variables
Describing Relationships between Two Variables Up until now, we have dealt, for the most part, with just one variable at a time. This variable, when measured on many different subjects or objects, took
More informationUNDERSTANDING MULTIPLE REGRESSION
UNDERSTANDING Multiple regression analysis (MRA) is any of several related statistical methods for evaluating the effects of more than one independent (or predictor) variable on a dependent (or outcome)
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, TTESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, TTESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationHomework 8 Solutions
Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (ad), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationThe importance of graphing the data: Anscombe s regression examples
The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 3031, 2008 B. Weaver, NHRC 2008 1 The Objective
More informatione = random error, assumed to be normally distributed with mean 0 and standard deviation σ
1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.
More informationCorrelation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers
Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables
More information17.0 Linear Regression
17.0 Linear Regression 1 Answer Questions Lines Correlation Regression 17.1 Lines The algebraic equation for a line is Y = β 0 + β 1 X 2 The use of coordinate axes to show functional relationships was
More informationOutline. Correlation & Regression, III. Review. Relationship between r and regression
Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation
More informationAP STATISTICS 2006 SCORING GUIDELINES. Question 2
2006 SCING GUIDELINES Question 2 Intent of Question The primary goal of this question is to assess a student s ability to identify the estimated regression line and to identify and interpret important
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More information1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material