Linear Regression with One Regressor


Linear Regression with One Regressor Michael Ash Lecture 8

Linear Regression with One Regressor (a.k.a. Bivariate Regression) An important distinction: 1. The Model. The effect of a one-unit change in X on the mean of Y; the role of randomness and idiosyncratic variation. The usual caveats about means apply. 2. Estimating the Model. Computing sample coefficients with Ordinary Least Squares; hypothesis testing.

The Linear Regression Model The slope, or response: β_ClassSize = (change in TestScore) / (change in ClassSize) = ΔTestScore / ΔClassSize. The Greek capital letter delta (Δ) stands for "change in." The Greek lower-case letter beta (β) is the symbol for how Y (TestScore) responds to a change in X (ClassSize). In this case, β is measured in test points per student. Other examples: how murder rates (Y) respond to poverty (X); how highway deaths (Y) respond to drunk-driving penalties (X); how earnings (Y) respond to years of schooling (X). Consider the β in each case.

Using a known β Suppose we know that β = −0.6 test points per student. (Adding one student to the class reduces the class test score by 0.6 points.) What is the effect of reducing class size by two students? Rearranging the definition of β and plugging in the particular example: ΔTestScore = β_ClassSize × ΔClassSize = (−0.6) × (−2) = +1.2. In words: reducing class size by two students will raise test scores by 1.2 points.
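This back-of-the-envelope calculation is easy to check in code; the slope −0.6 is the hypothetical value from the slide.

```python
# Hypothetical slope from the slide: -0.6 test points per student.
beta = -0.6
delta_class_size = -2  # reducing class size by two students
delta_test_score = beta * delta_class_size
print(delta_test_score)  # prints 1.2
```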

Building a Model TestScore = β₀ + β_ClassSize × ClassSize is a statement about a relationship that holds on average across the population of districts. TestScore = β₀ + β_ClassSize × ClassSize + other factors is a statement that is true for any district. β₀ + β_ClassSize × ClassSize is the average effect of class size; "other factors" includes teacher quality, textbook quality, community income or wealth, native English speakers, testing variation, and luck.

Linear Regression Model with a Single Regressor Yᵢ = β₀ + β₁Xᵢ + uᵢ. Y is the dependent variable, or outcome variable, or left-hand variable. (No one says "regressand" with a straight face.) X is the independent variable, or regressor, or explanatory variable, or right-hand variable. β₀ + β₁X is the population regression line, or the expected value (mean) of Y given X: E(Y | X) = β₀ + β₁X.

Linear Regression Model with a Single Regressor Yᵢ = β₀ + β₁Xᵢ + uᵢ. β₁ and β₀ are the coefficients, or parameters, of the regression line. β₁ is the slope, the change in Y associated with a unit change in X. β₀ is the intercept, the expected value of Y when X = 0. (Sometimes X = 0 doesn't make any sense; β₀ then simply raises or lowers the regression line.) uᵢ is the error term or residual, which includes all of the unique, or idiosyncratic, features of observation i, including randomness, measurement error, and luck, that affect its outcome Yᵢ.

Determinism and Randomness Appreciate determinism: E(Yᵢ | Xᵢ) = β₀ + β₁Xᵢ. Appreciate randomness: uᵢ. (Figure 4.1: better or worse than predicted; determinism and randomness.)

Estimating the Coefficients of the Linear Regression Model Draw a best line through a scatterplot (Figure 4.2). Choosing β̂₀ and β̂₁ defines a line. What is the best line? Recall that the sample mean Ȳ minimizes Σᵢ₌₁ⁿ (Yᵢ − m)².
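A quick numerical sketch (with made-up data) confirms that the sample mean minimizes the sum of squared deviations:

```python
import numpy as np

# Made-up sample; any data would do for this check.
rng = np.random.default_rng(0)
Y = rng.normal(loc=5.0, scale=2.0, size=50)

def sum_sq_dev(m):
    """Sum of squared deviations of Y around a candidate value m."""
    return np.sum((Y - m) ** 2)

# Search a fine grid of candidate values for the minimizer.
grid = np.linspace(Y.min(), Y.max(), 100001)
best_m = grid[np.argmin([sum_sq_dev(m) for m in grid])]

print(best_m, Y.mean())  # the grid minimizer is (essentially) the sample mean
```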

Estimating the Coefficients of the Linear Regression Model Now, instead of choosing m, we choose b₀ and b₁ to minimize Σᵢ₌₁ⁿ (Yᵢ − b₀ − b₁Xᵢ)². Let's focus on the key term: (Yᵢ − b₀ − b₁Xᵢ) = (Yᵢ − (b₀ + b₁Xᵢ)) = (Yᵢ − E(Yᵢ | Xᵢ)).

Estimating the Coefficients of the Linear Regression Model The key term (Yᵢ − (b₀ + b₁Xᵢ)) expresses how far the actual value of Yᵢ is from the expected value of Yᵢ given Xᵢ according to the proposed line. We want to choose b₀ and b₁ to keep these gaps small. The values of b₀ and b₁ that make Σᵢ₌₁ⁿ (Yᵢ − b₀ − b₁Xᵢ)² as low as possible are called the ordinary least squares (OLS) estimators of β₀ and β₁. The estimators are named β̂₀ and β̂₁.

The Ordinary Least Squares Estimators From Figure 4.2 to Figure 4.3: choose b₀ and b₁ to minimize Σᵢ₌₁ⁿ (Yᵢ − b₀ − b₁Xᵢ)². 1. We could do this by trial and error: choose many alternative pairs b₀ and b₁ and see which gives the smallest sum of squared errors. 2. Calculus yields simple formulas: β̂₁ = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / Σᵢ₌₁ⁿ (Xᵢ − X̄)² = s_XY / s_X², and β̂₀ = Ȳ − β̂₁X̄. 3. These are an average concept (again!) with all the good properties of sample averages.
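The formulas above can be applied directly. A minimal sketch in Python on simulated data (the true coefficients 700 and −2.0 and the noise level are invented for illustration), with numpy's least-squares line fit as a cross-check:

```python
import numpy as np

# Simulate data from Y_i = beta0 + beta1 * X_i + u_i.
rng = np.random.default_rng(42)
n = 200
X = rng.uniform(10, 30, size=n)   # e.g. class sizes
u = rng.normal(0, 5, size=n)      # idiosyncratic error term
Y = 700.0 - 2.0 * X + u           # true beta0 = 700, beta1 = -2.0

# OLS estimators from the formulas on the slide.
beta1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
beta0_hat = Y.mean() - beta1_hat * X.mean()

# Cross-check against numpy's built-in least-squares line fit.
slope, intercept = np.polyfit(X, Y, deg=1)

print(beta1_hat, slope)      # the two slope estimates agree
print(beta0_hat, intercept)  # and so do the intercepts
```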

Some OLS terminology OLS regression line: Ŷ = β̂₀ + β̂₁X. Predicted value of Yᵢ given Xᵢ: Ŷᵢ = β̂₀ + β̂₁Xᵢ. Predicted residual of the i-th observation: ûᵢ = Yᵢ − Ŷᵢ.

Test Scores and Student-Teacher Ratio Unit of observation: a California school district (n = 420). Variables: district average test score, district student-teacher ratio. OLS results: β̂₀ = 698.9, β̂₁ = −2.28. E(TestScore) = β₀ + β_STR × STR; predicted TestScore = 698.9 − 2.28 × STR. The slope is −2.28: an increase in the student-teacher ratio (STR) by one student per class is, on average, associated with a decline in the districtwide test score of 2.28 points on the test.
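The fitted line reported above can be used directly for prediction; a small sketch (the STR value of 20 is just an illustrative input):

```python
def predicted_test_score(str_ratio):
    """Fitted OLS line from the California district data: 698.9 - 2.28 * STR."""
    return 698.9 - 2.28 * str_ratio

print(predicted_test_score(20.0))  # about 653.3
```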

Are the results big? Consider reducing STR by two students. What would we expect to happen in a district with the median student-teacher ratio and median test score? (No such district necessarily exists, but it's a useful reference point; see Table 4.1.) Reduction of 2 students: from 19.7 (50th percentile) to 17.7 (roughly the 10th percentile). Expected change in test scores: (−2.28) × (−2) = +4.56 ≈ +4.6, i.e., from 654.5 (50th percentile) to about 659.1 (roughly the 60th percentile). Worth it? Consider reducing STR by five students. Beware out-of-sample predictions.

Advantages of OLS Estimators OLS is a widely used method in the social sciences, policy, and administration. Desirable properties of OLS: 1. β̂ is a consistent and unbiased estimator of β. 2. β̂ is approximately normally distributed in large samples. 3. With additional assumptions, β̂ may be the smallest-variance estimator of β.