Notes 5: More on regression and residuals ECO 231W - Undergraduate Econometrics


Prof. Carolina Caetano

1 Regression Method

Let's review the method to calculate the regression line:

1. Find the point of averages (x̄_1, ȳ).
2. Go one SD_x1 horizontally, and r_{y.x1} · SD_y vertically. This is the second point.
3. Trace the line that connects both points.

The regression line crosses two points:

(x̄_1, ȳ)   and   (x̄_1 + SD_x1, ȳ + r · SD_y).

Let's get the formula of the regression line by substituting these two points in a generic line equation y = a + b · x_1:

b = r · SD_y / SD_x1,   a = ȳ − b · x̄_1.

The inclination of the line is given by b. Let's understand it better: b says that when x_1 increases by one SD_x1, the predicted y increases by r · SD_y.
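The three-step recipe can be sketched in code. Below is a minimal Python sketch; the attendance and grade numbers are made up for illustration, and the SDs divide by n, matching the course formulas:

```python
import math

# Hypothetical data: classes attended (x_1) and final grade (y)
x = [2, 5, 7, 9, 10, 12]
y = [20, 35, 45, 60, 55, 80]

n = len(x)
x_bar = sum(x) / n                      # point of averages, x-coordinate
y_bar = sum(y) / n                      # point of averages, y-coordinate
sd_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / n)
sd_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / n)
r = sum((xi - x_bar) * (yi - y_bar)
        for xi, yi in zip(x, y)) / (n * sd_x * sd_y)

b = r * sd_y / sd_x                     # slope: r*SD_y of rise per SD_x of run
a = y_bar - b * x_bar                   # the line passes through (x_bar, y_bar)

# Step 2 of the recipe: one SD_x to the right, the line rises by r*SD_y
second_point_y = a + b * (x_bar + sd_x)
```

The two assertions implicit in the recipe are that the line passes through the point of averages and through the second point one SD_x over and r · SD_y up.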

2 The meaning of the regression line

Let's write down the regression line, substituting the a and b calculated above:

ŷ = (ȳ − r · (SD_y / SD_x1) · x̄_1) + r · (SD_y / SD_x1) · x_1

Predictor: If somebody has x_1i = 5, then the regression line predicts that their outcome will be ŷ_i = a + 5b. It doesn't mean that y_i is exactly that, but if the relation between x_1 and y is fairly linear, it should be a good prediction.

Notice that predictors are not necessarily good at predicting what would happen if we intervened and changed someone's x_1i. There is a difference between guessing the most likely final grade (y_i) of somebody who went to 5 classes (x_1i = 5) and guessing the final grade of someone whom we forced to attend 5 classes. In the first case, people chose to go to as many classes as they wanted. In the second case, we intervened. The second case is about the causal effect.

In essence, suppose that the regression prediction is that someone who went to 5 classes will have a final grade of 35. Does that mean that if I take a random student in the course and force him to come to 5 classes, we can expect him to get a final grade of 35? No. The reason is choice. Students who chose to come to 5 classes on their own are probably not great students. The predicted grade of 35 is not only because students who attended 5 classes got enough knowledge in class to earn a 35 as the final grade. It is also because students who attended 5 classes probably didn't study very much, didn't apply themselves in the homework, didn't get TA help, and so on. The regression prediction takes all of this into account together.

A causal question is different. What would happen if I forced a student to go to 5 classes? If the student was already planning to go to 5 classes, probably nothing will change. However, if the student was planning to go to 10 classes, he is probably a better student, one who would study more, participate in class, and work harder on the homework. The fact that I forced him to go to 5 won't change who he is. Thus he is probably going to get a better grade than 35. In other words, 5 classes don't cause a grade of 35.

To find the causal effect of 5 classes on the final grade, we have to control for the confounders. We have to fix how much the student studies, how much they participate, and how well they do the homework. The next class is about that.

3 Regression Residuals

Regression Residual: the difference between the actual value of y and the value predicted by the regression.

The regression line prediction (the expected y for x_1i) is

ŷ_i = (ȳ − r · (SD_y / SD_x1) · x̄_1) + r · (SD_y / SD_x1) · x_1i

The residual (error of prediction) is

e_i = y_i − ŷ_i

3.1 Plotting the residuals

I said above that the residuals would cancel each other out. This is actually a fact, they do! If we average the residuals, the average is zero, every time. Checking that this is true is an excellent exercise to test your skills with summation signs. (If you would like to try, begin with the average residual (1/n) Σ_{i=1}^n (y_i − ŷ_i) and substitute the equation for ŷ_i.)

It is always a good idea to plot the residuals. The plot should look like this:

[figure: residuals scattered evenly around zero, with no visible pattern]
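The zero-average property of the residuals can be checked numerically. Here is a minimal Python sketch with made-up data; the property holds for any data set:

```python
# Hypothetical data set
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
# Equivalent form of the regression slope r * SD_y / SD_x1: cov(x, y) / var(x)
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / n      # zero, up to floating-point rounding
```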

The pattern above is a sign that the regression method is a good description of the data. This pattern is often a sign of homoskedastic errors. We will study more about this later in the course.

Other patterns often seen in the data are:

[figure: residual plots whose vertical spread changes as x_1 changes]

What do these plots have in common? The patterns above are often a sign of heteroskedastic errors. In this case, the regression method is not the best way to describe the data, but it is not a bad way either. We will also discuss this more later in the course.

A yet different pattern is

[figure: residual plots with a curved, systematic pattern]

What do these plots have in common? The patterns above are often a sign that the relationship between x_1 and y is not very linear. In this case, the regression line should probably not be used to describe the data.

The method of looking at residuals seems a bit silly now. It is an indirect way of judging the relationship between y and x_1. Why look at residuals when one can look directly at scatter plots and graphs of averages instead? Indeed, at this point it makes no sense. However, plotting residuals becomes very useful once we start studying multivariate regression. Hang in there.

3.2 Root Mean Square (RMS) error

How wrong is the regression line in predicting the outcomes? For one observation, the answer to this question is the residual error. We would like to get a general sense of the error: how wrong is it on average? Simply averaging the residuals is not a good way to look at it, because there are positive and negative residuals. Hence, the residuals could be very large, but on average they would cancel each other out. To solve this, we square the residuals, so that they are all positive, and then we average them. The problem is that the result would be the average squared residual, in squared units. In order to return to the right unit, we take the square root:

RMS error = √( (1/n) Σ_{i=1}^n (y_i − ŷ_i)² )
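The square-then-average-then-root recipe can be written out directly. A minimal Python sketch, again with made-up data:

```python
import math

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.5, 5.5, 8.5, 9.5]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mse = sum(e ** 2 for e in residuals) / n  # average squared residual (squared units)
rms_error = math.sqrt(mse)                # back in the units of y
```

Note that squaring is what keeps the errors from cancelling: the residuals themselves sum to zero, but their squares do not.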

You can compute the RMS error in the sample directly from the residuals. It is quite easy, but there is also an alternative trick:

RMS error = √(1 − r²) · SD_y

4 The method of least squares

The regression method is often called Ordinary Least Squares (OLS). In fact, I like to use the term OLS, so you will see it frequently. The reason for this name is that the regression line is in fact the line that minimizes the RMS error. To say this again using our notation: suppose there is a generic line y = a + b·x_1. The question is: what are the values of a and b such that this line is the one with the smallest possible RMS error? In English: what is the line such that the prediction error is as small as possible? The mathematical problem to solve is:

min_{a,b} √( (1/n) Σ_{i=1}^n (y_i − a − b·x_1i)² )    (1)

It turns out that the answer to that question is

b = r · SD_y / SD_x1,   a = ȳ − b · x̄_1,

exactly the regression line from Section 1. So, among all the linear predictors, the regression line is the one with the smallest RMS error. In practical terms, it means that it is the most precise among the linear predictors, the one with the smallest errors overall.

NOTE: To come up with the answer to the problem in (1) on your own is a pretty challenging exercise in dealing with summation notation. You are not required to know this in this course. However, if you feel like proving that you are awesome, and at the same time practicing your knowledge of calculus and summation notation, here are some things to remember:
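The direct computation and the trick should agree. A minimal Python check on made-up data, with the SDs dividing by n as in the course formulas:

```python
import math

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 6.0]
y = [1.5, 3.0, 4.8, 5.9, 9.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sd_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / n)
sd_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / n)
r = sum((xi - x_bar) * (yi - y_bar)
        for xi, yi in zip(x, y)) / (n * sd_x * sd_y)

b = r * sd_y / sd_x
a = y_bar - b * x_bar

# Direct computation: square the residuals, average, take the root
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
rms_direct = math.sqrt(sum(e ** 2 for e in residuals) / n)

# The trick
rms_trick = math.sqrt(1 - r ** 2) * sd_y
```

The two numbers match because, for the regression line, the average squared residual equals (1 − r²) · SD_y².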

1. Minimizing the square root of something is the same as minimizing what's inside, so you can get rid of the square root in equation (1).

2. To minimize a convex function (and (1/n) Σ_{i=1}^n (y_i − a − b·x_1i)² is a convex function of a and b) you must take the first order conditions. In other words, differentiate it with respect to a and set it equal to zero, then differentiate it with respect to b and set it equal to zero. Now you have a system of equations with 2 unknowns (a and b) and 2 equations (the first order conditions). Solve it.

3. The derivative of a sum is the sum of the derivatives.
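The first order conditions from step 2 can be checked numerically: at the OLS solution both partial derivatives are zero (they reduce to Σ e_i = 0 and Σ x_1i·e_i = 0), and any nearby line does strictly worse. A Python sketch on made-up data:

```python
# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.2, 3.8, 6.1, 7.9, 10.2]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

def msq(a_, b_):
    """Mean squared error of the line y = a_ + b_ * x_1 (the inside of (1))."""
    return sum((yi - a_ - b_ * xi) ** 2 for xi, yi in zip(x, y)) / n

# First order conditions: d/da = 0 gives sum of residuals = 0,
# d/db = 0 gives sum of x_i * residual_i = 0
residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
foc_a = sum(residuals)
foc_b = sum(xi * e for xi, e in zip(x, residuals))

# Any perturbed line has a strictly larger mean squared error
worse = all(msq(a + da, b + db) > msq(a, b)
            for da in (-0.3, 0.3) for db in (-0.3, 0.3))
```

Since the objective is strictly convex (for non-constant x), the point where both first order conditions hold is the unique global minimum.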