# Regression Analysis: Basic Concepts

## Transcription

Allin Cottrell

The simple linear model

The simple linear model represents the dependent variable, $y_i$, as a linear function of one independent variable, $x_i$, subject to a random disturbance or error, $u_i$:

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

The error term $u_i$ is assumed to have a mean value of zero, a constant variance, and to be uncorrelated with its own past values (i.e., it is "white noise"). The task of estimation is to determine regression coefficients $\hat\beta_0$ and $\hat\beta_1$, estimates of the unknown parameters $\beta_0$ and $\beta_1$ respectively. The estimated equation will have the form

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$$

OLS

The basic technique for determining the coefficients $\hat\beta_0$ and $\hat\beta_1$ is Ordinary Least Squares (OLS). Values for the coefficients are chosen to minimize the sum of the squared estimated errors, or residual sum of squares (SSR). The estimated error associated with each pair of data values $(x_i, y_i)$ is defined as

$$\hat u_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$$

We use a different symbol for this estimated error ($\hat u_i$) as opposed to the true disturbance or error term ($u_i$). These two coincide only if $\hat\beta_0$ and $\hat\beta_1$ happen to be exact estimates of the regression parameters $\beta_0$ and $\beta_1$. The estimated errors are also known as residuals. The SSR may be written as

$$SSR = \sum \hat u_i^2 = \sum (y_i - \hat y_i)^2 = \sum (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$$

Picturing the residuals

[Figure: the fitted line $\hat\beta_0 + \hat\beta_1 x$, with the residual $\hat u_i$ marked as the gap between $y_i$ and $\hat y_i$ at $x_i$.]

The residual, $\hat u_i$, is the vertical distance between the actual value of the dependent variable, $y_i$, and the fitted value, $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$.
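
To make the definition of the SSR concrete, the following minimal Python sketch evaluates the residuals and the SSR for two candidate lines. The data values and the function names `residuals` and `ssr` are hypothetical, chosen only for illustration; they do not come from the slides.

```python
def residuals(x, y, b0, b1):
    """Estimated errors u_hat_i = y_i - (b0 + b1 * x_i) for a candidate line."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def ssr(x, y, b0, b1):
    """Residual sum of squares for the candidate coefficients b0 and b1."""
    return sum(u * u for u in residuals(x, y, b0, b1))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

print(ssr(x, y, 0.0, 2.0))  # candidate line y = 2x      -> 1.0
print(ssr(x, y, 1.0, 1.0))  # candidate line y = 1 + x   -> 11.0
```

Of these two candidates the first has the smaller SSR; OLS goes further and finds the pair of coefficients with the smallest SSR of all.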

Normal Equations

Minimization of SSR is a calculus exercise: find the partial derivatives of SSR with respect to both $\hat\beta_0$ and $\hat\beta_1$ and set them equal to zero. This generates two equations (the normal equations of least squares) in the two unknowns, $\hat\beta_0$ and $\hat\beta_1$. These equations are solved jointly to yield the estimated coefficients.

$$\partial SSR / \partial \hat\beta_0 = -2 \sum (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0 \tag{1}$$

$$\partial SSR / \partial \hat\beta_1 = -2 \sum x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0 \tag{2}$$

Equation (1) implies that

$$\sum y_i - n \hat\beta_0 - \hat\beta_1 \sum x_i = 0 \;\Rightarrow\; \hat\beta_0 = \bar y - \hat\beta_1 \bar x \tag{3}$$

Equation (2) implies that

$$\sum x_i y_i - \hat\beta_0 \sum x_i - \hat\beta_1 \sum x_i^2 = 0 \tag{4}$$

Now substitute for $\hat\beta_0$ in equation (4), using (3). This yields

$$\sum x_i y_i - (\bar y - \hat\beta_1 \bar x) \sum x_i - \hat\beta_1 \sum x_i^2 = 0$$

$$\sum x_i y_i - \bar y \sum x_i - \hat\beta_1 \left( \sum x_i^2 - \bar x \sum x_i \right) = 0$$

$$\hat\beta_1 = \frac{\sum x_i y_i - \bar y \sum x_i}{\sum x_i^2 - \bar x \sum x_i} \tag{5}$$

Equations (3) and (5) can now be used to generate the regression coefficients. First use (5) to find $\hat\beta_1$, then use (3) to find $\hat\beta_0$.

Goodness of fit

The OLS technique ensures that we find the values of $\hat\beta_0$ and $\hat\beta_1$ which fit the sample data best, in the sense of minimizing the sum of squared residuals. There's no guarantee that $\hat\beta_0$ and $\hat\beta_1$ correspond exactly with the unknown parameters $\beta_0$ and $\beta_1$. Nor is there any guarantee that the best-fitting line fits the data well at all: maybe the data do not even approximately lie along a straight-line relationship. How do we assess the adequacy of the fitted equation?

First step: find the residuals. For each x-value in the sample, compute the fitted value or predicted value of y, using $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$. Then subtract each fitted value from the corresponding actual, observed value of $y_i$. Squaring and summing these differences gives the SSR.

Example of finding residuals

[Table: columns are data ($x_i$), data ($y_i$), fitted ($\hat y_i$), $\hat u_i = y_i - \hat y_i$, and $\hat u_i^2$. The coefficient values and table entries did not survive transcription. The residual column sums to zero, $\sum \hat u_i = 0$, and the squared-residual column sums to the SSR.]
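
As a sketch of how the normal-equation solution and the residual calculation fit together, the Python snippet below computes the slope from equation (5), the intercept from equation (3), and then the residuals and the SSR. The data and the name `ols_fit` are hypothetical, used only for illustration; they are not the values from the example table in the slides.

```python
def ols_fit(x, y):
    """Return (b0, b1): slope via equation (5), intercept via equation (3)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_x = sum(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (sum_xy - y_bar * sum_x) / (sum_x2 - x_bar * sum_x)  # equation (5)
    b0 = y_bar - b1 * x_bar                                   # equation (3)
    return b0, b1


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]             # fitted values y_hat_i
resid = [yi - fi for yi, fi in zip(y, fitted)]  # residuals u_hat_i
ssr = sum(u * u for u in resid)                 # residual sum of squares

print("b0:", b0, "b1:", b1)
print("sum of residuals:", sum(resid), "SSR:", ssr)
```

For these made-up numbers the slope is approximately 1.9, the intercept approximately 0, and the SSR approximately 0.70. The residuals sum to zero up to floating-point rounding, as the first normal equation requires.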

Standard error

The magnitude of SSR depends in part on the number of data points. To allow for this we can divide through by the degrees of freedom, which is the number of data points minus the number of parameters to be estimated (2 in the case of a simple regression with intercept). Let $n$ denote the number of data points (sample size); then the degrees of freedom are $df = n - 2$. The square root of $SSR/df$ is the standard error of the regression, $\hat\sigma$:

$$\hat\sigma = \sqrt{\frac{SSR}{n - 2}}$$

The standard error gives a first handle on how well the fitted equation fits the sample data. But what counts as a big $\hat\sigma$ and what counts as a small one depends on the context. The regression standard error is sensitive to the units of measurement of the dependent variable.

R-squared

A more standardized statistic which also gives a measure of the goodness of fit of the estimated equation is $R^2$.

$$R^2 = 1 - \frac{SSR}{\sum (y_i - \bar y)^2} \equiv 1 - \frac{SSR}{SST}$$

SSR can be thought of as the unexplained variation in the dependent variable, that is, the variation left over once the predictions of the regression equation are taken into account. $\sum (y_i - \bar y)^2$ (the total sum of squares, or SST) represents the total variation of the dependent variable around its mean value.

$R^2 = 1 - SSR/SST$ is 1 minus the proportion of the variation in $y_i$ that is unexplained. It shows the proportion of the variation in $y_i$ that is accounted for by the estimated equation. As such, it must be bounded by 0 and 1:

$$0 \le R^2 \le 1$$

$R^2 = 1$ is a perfect score, obtained only if the data points happen to lie exactly along a straight line; $R^2 = 0$ is a perfectly lousy score, indicating that $x_i$ is absolutely useless as a predictor for $y_i$.

Adjusted R-squared

Adding a variable to a regression equation cannot raise the SSR; it's likely to lower SSR somewhat even if the new variable is not very relevant. The adjusted R-squared, $\bar R^2$, attaches a small penalty to adding more variables. If adding a variable raises the $\bar R^2$ for a regression, that's a better indication that it has improved the model than if it merely raises the unadjusted $R^2$.

$$\bar R^2 = 1 - \frac{SSR/(n - k - 1)}{SST/(n - 1)} = 1 - \frac{n - 1}{n - k - 1} (1 - R^2)$$

where $k + 1$ represents the number of parameters being estimated (2 in a simple regression).
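
The three fit statistics can be computed directly from SSR and SST. The Python sketch below does this for a simple regression (so $k = 1$); the data and the name `fit_statistics` are hypothetical and serve only to illustrate the formulas.

```python
import math


def fit_statistics(x, y):
    """Return (sigma_hat, r2, adj_r2) for a simple regression of y on x."""
    n = len(x)
    k = 1  # one regressor, so k + 1 = 2 parameters are estimated
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_x = sum(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (sum_xy - y_bar * sum_x) / (sum_x2 - x_bar * sum_x)
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sigma_hat = math.sqrt(ssr / (n - k - 1))       # standard error of the regression
    r2 = 1 - ssr / sst                             # unadjusted R-squared
    adj_r2 = 1 - (n - 1) / (n - k - 1) * (1 - r2)  # adjusted R-squared
    return sigma_hat, r2, adj_r2


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

sigma_hat, r2, adj_r2 = fit_statistics(x, y)
print("sigma_hat:", round(sigma_hat, 4))
print("R-squared:", round(r2, 4), "adjusted:", round(adj_r2, 4))
```

With these made-up numbers the adjusted value (about 0.944) sits below the unadjusted one (about 0.963), reflecting the penalty for the estimated parameters.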

To summarize so far

Alongside the estimated regression coefficients $\hat\beta_0$ and $\hat\beta_1$, we might also examine the sum of squared residuals (SSR), the regression standard error ($\hat\sigma$), and the $R^2$ value (adjusted or unadjusted) to judge whether the best-fitting line does in fact fit the data to an adequate degree.

Confidence intervals for coefficients

As we saw, a confidence interval provides a means of quantifying the uncertainty produced by sampling error. Instead of simply stating "I found a sample mean income of \$39,000 and that is my best guess at the population mean, although I know it is probably wrong", we can make a statement like: "I found a sample mean of \$39,000, and there is a 95 percent probability that my estimate is off the true parameter value by no more than \$1200." Confidence intervals for regression coefficients are constructed in a similar manner.

Suppose we're interested in the slope coefficient, $\hat\beta_1$, of an estimated equation. Say we came up with $\hat\beta_1 = 0.90$, using the OLS technique, and we want to quantify our uncertainty over the true slope parameter, $\beta_1$, by drawing up a 95 percent confidence interval for $\beta_1$. Provided our sample size is reasonably large, the rule of thumb is the same as before; the 95 percent confidence interval for $\beta_1$ is given by:

$$\hat\beta_1 \pm 2 \text{ standard errors}$$

Our single best guess at $\beta_1$ (the point estimate) is simply $\hat\beta_1$, since the OLS technique yields unbiased estimates of the parameters (actually, this is not always true, but we'll postpone consideration of tricky cases where OLS estimates are biased).

On the same grounds as before, there is a 95 percent chance that our estimate $\hat\beta_1$ will lie within 2 standard errors of its mean value, $\beta_1$.

The standard error of $\hat\beta_1$ (written as $\operatorname{se}(\hat\beta_1)$, and not to be confused with the standard error of the regression, $\hat\sigma$) is given by the formula:

$$\operatorname{se}(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{\sum (x_i - \bar x)^2}}$$

The larger is $\operatorname{se}(\hat\beta_1)$, the wider will be our confidence interval. The larger is $\hat\sigma$, the larger will be $\operatorname{se}(\hat\beta_1)$, and so the wider the confidence interval for the true slope. This makes sense: in the case of a poor fit we have high uncertainty over the true slope parameter. A high degree of variation of $x_i$ makes for a smaller $\operatorname{se}(\hat\beta_1)$ (tighter confidence interval). The more $x_i$ has varied in our sample, the better the chance we have of accurately picking up any relationship that exists between x and y.
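
The standard-error formula can be traced through in a few lines of Python. The sketch below is illustrative only: the data and the name `slope_standard_error` are hypothetical, and a four-point sample is far too small for the large-sample rule of thumb, so the numbers merely show the arithmetic.

```python
import math


def slope_standard_error(x, y):
    """Return (b1, se_b1) for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_x = sum(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (sum_xy - y_bar * sum_x) / (sum_x2 - x_bar * sum_x)
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ssr / (n - 2)                  # squared regression standard error
    sxx = sum((xi - x_bar) ** 2 for xi in x)    # variation of x around its mean
    return b1, math.sqrt(sigma2_hat / sxx)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

b1, se_b1 = slope_standard_error(x, y)
interval = (b1 - 2 * se_b1, b1 + 2 * se_b1)     # rule-of-thumb 95 percent interval
print("b1:", round(b1, 4), "se(b1):", round(se_b1, 4))
print("interval:", tuple(round(v, 4) for v in interval))
```

Because the variation of $x$ enters the denominator, spreading the same number of observations over a wider range of $x$ values would shrink `se_b1`, other things equal.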

Confidence interval example

Is there really a positive linear relationship between $x_i$ and $y_i$? We've obtained $\hat\beta_1 = 0.90$ and $\operatorname{se}(\hat\beta_1) = 0.12$. The approximate 95 percent confidence interval for $\beta_1$ is then

$$0.90 \pm 2(0.12) = 0.90 \pm 0.24 = 0.66 \text{ to } 1.14$$

Thus we can state, with at least 95 percent confidence, that $\beta_1 > 0$, and there is a positive relationship. If we had obtained $\operatorname{se}(\hat\beta_1) = 0.61$, our interval would have been

$$0.90 \pm 2(0.61) = 0.90 \pm 1.22 = -0.32 \text{ to } 2.12$$

In this case the interval straddles zero, and we cannot be confident (at the 95 percent level) that there exists a positive relationship.
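
The two cases in this example can be checked with a short Python sketch. The helper names `approx_ci` and `excludes_zero` are hypothetical; the slope estimate and the two standard errors are the ones given in the example.

```python
def approx_ci(estimate, std_error, multiplier=2.0):
    """Rule-of-thumb interval: estimate plus or minus `multiplier` standard errors."""
    half_width = multiplier * std_error
    return estimate - half_width, estimate + half_width


def excludes_zero(interval):
    """True if the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


tight = approx_ci(0.90, 0.12)  # roughly 0.66 to 1.14
wide = approx_ci(0.90, 0.61)   # roughly -0.32 to 2.12

print(tight, excludes_zero(tight))
print(wide, excludes_zero(wide))
```

Only the first interval excludes zero, which is why only the smaller standard error supports the conclusion of a positive relationship at the 95 percent level.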

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

### Chapter 11: Two Variable Regression Analysis

Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

### t-tests and F-tests in regression

t-tests and F-tests in regression Johan A. Elkink University College Dublin 5 April 2012 Johan A. Elkink (UCD) t and F-tests 5 April 2012 1 / 25 Outline 1 Simple linear regression Model Variance and R

### Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

### Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

### A brush-up on residuals

A brush-up on residuals Going back to a single level model... $y_i = \beta_0 + \beta_1 x_{1i} + \varepsilon_i$ The residual for each observation is the difference between the value of y predicted by the equation and the actual

### Econometrics The Multiple Regression Model: Inference

Econometrics The Multiple Regression Model: João Valle e Azevedo Faculdade de Economia Universidade Nova de Lisboa Spring Semester João Valle e Azevedo (FEUNL) Econometrics Lisbon, March 2011 1 / 24 in

### Statistics II Final Exam - January Use the University stationery to give your answers to the following questions.

Statistics II Final Exam - January 2012 Use the University stationery to give your answers to the following questions. Do not forget to write down your name and class group in each page. Indicate clearly

### OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance

Lecture 5: Hypothesis Testing What we know now: OLS is not only unbiased it is also the most precise (efficient) unbiased estimation technique - ie the estimator has the smallest variance (if the Gauss-Markov

### Linear Regression with One Regressor

Linear Regression with One Regressor Michael Ash Lecture 8 Linear Regression with One Regressor a.k.a. Bivariate Regression An important distinction 1. The Model Effect of a one-unit change in X on the

### Regression, least squares

Regression, least squares Joe Felsenstein Department of Genome Sciences and Department of Biology Regression, least squares p.1/24 Fitting a straight line X Two distinct cases: The X values are chosen

### Regression Analysis. Pekka Tolonen

Regression Analysis Pekka Tolonen Outline of Topics Simple linear regression: the form and estimation Hypothesis testing and statistical significance Empirical application: the capital asset pricing model

### Linear Regression with One Regressor

Linear Regression with One Regressor Michael Ash Lecture 10 Analogy to the Mean True parameter µ Y β 0 and β 1 Meaning Central tendency Intercept and slope E(Y ) E(Y X ) = β 0 + β 1 X Data Y i (X i, Y

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Two-Variable Regression: Interval Estimation and Hypothesis Testing

Two-Variable Regression: Interval Estimation and Hypothesis Testing Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Confidence Intervals & Hypothesis Testing

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### 0.1 Multiple Regression Models

0.1 Multiple Regression Models We will introduce the multiple regression model as a means of relating one numerical response variable y to two or more independent (or predictor) variables. We will see different

### Hence, multiplying by 12, the 95% interval for the hourly rate is (965, 1435)

Confidence Intervals for Poisson data For an observation from a Poisson distribution, we have $\sigma^2 = \lambda$. If we observe r events, then our estimate $\hat\lambda = r \sim N(\lambda, \lambda)$. If r is bigger than 20, we can use this

### Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

### Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### Chapter 9. Section Correlation

Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

### Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

### Introduction to Regression. Dr. Tom Pierce Radford University

Introduction to Regression Dr. Tom Pierce Radford University In the chapter on correlational techniques we focused on the Pearson R as a tool for learning about the relationship between two variables.

### Regression in ANOVA. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Regression in ANOVA James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Regression in ANOVA 1 Introduction 2 Basic Linear

### Definition The covariance of X and Y, denoted by cov(X, Y), is defined by $\operatorname{cov}(X, Y) = E[(X - \mu_1)(Y - \mu_2)]$.

Correlation Regression Bivariate Normal Suppose that X and Y are r.v.'s with joint density $f(x, y)$ and suppose that the means of X and Y are respectively $\mu_1$, $\mu_2$ and the variances are $\sigma_1^2$, $\sigma_2^2$. Definition The

### Chapter 8. Linear Regression. Copyright 2012, 2008, 2005 Pearson Education, Inc.

Chapter 8 Linear Regression Copyright 2012, 2008, 2005 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King

### Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

### Advanced High School Statistics. Preliminary Edition

Chapter 2 Summarizing Data After collecting data, the next stage in the investigative process is to summarize the data. Graphical displays allow us to visualize and better understand the important features

### In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a

Math 143 Inference on Regression 1 Review of Linear Regression In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a bivariate data set (i.e., a list of cases/subjects

### Questions and Answers on Hypothesis Testing and Confidence Intervals

Questions and Answers on Hypothesis Testing and Confidence Intervals L. Magee Fall, 2008 1. Using 25 observations and 5 regressors, including the constant term, a researcher estimates a linear regression

### OLS in Matrix Form. Let y be an n × 1 vector of observations on the dependent variable.

OLS in Matrix Form 1 The True Model Let X be an n × k matrix where we have observations on k independent variables for n observations. Since our model will usually contain a constant term, one of the columns

### Statistics in Geophysics: Linear Regression II

Statistics in Geophysics: Linear Regression II Steffen Unkel Department of Statistics Ludwig-Maximilians-University Munich, Germany Winter Term 2013/14 1/28 Model definition Suppose we have the following

### 17.0 Linear Regression

17.0 Linear Regression 1 Answer Questions Lines Correlation Regression 17.1 Lines The algebraic equation for a line is Y = β 0 + β 1 X 2 The use of coordinate axes to show functional relationships was

### Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 08/11/2016 Structure This Week What is a linear model? How

### Assessing Forecasting Error: The Prediction Interval. David Gerbing School of Business Administration Portland State University

Assessing Forecasting Error: The Prediction Interval David Gerbing School of Business Administration Portland State University November 27, 2015 Contents 1 Prediction Intervals 1 1.1 Modeling Error..................................

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part II)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part II) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

### Linear and Piecewise Linear Regressions

Tarigan Statistical Consulting & Coaching statistical-coaching.ch Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL Hands-on Data Analysis

### Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

### Notes 5: More on regression and residuals ECO 231W - Undergraduate Econometrics

Notes 5: More on regression and residuals ECO 231W - Undergraduate Econometrics Prof. Carolina Caetano 1 Regression Method Let s review the method to calculate the regression line: 1. Find the point of

### The Simple Linear Regression Model: Specification and Estimation

Chapter 3 The Simple Linear Regression Model: Specification and Estimation 3.1 An Economic Model Suppose that we are interested in studying the relationship between household income and expenditure on

### Sample Size Determination

Sample Size Determination Population A: 10,000 Population B: 5,000 Sample 10% Sample 15% Sample size 1000 Sample size 750 The process of obtaining information from a subset (sample) of a larger group (population)

### 5.4 The Quadratic Formula

5.4 The Quadratic Formula Consider the general quadratic function $f(x) = ax^2 + bx + c$. In the previous section, we learned that we can find the zeros of this function

### Introduction to Regression and Data Analysis

Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

### The Classical Linear Regression Model

The Classical Linear Regression Model 1 September 2004 A. A brief review of some basic concepts associated with vector random variables Let y denote an n x l vector of random variables, i.e., y = (y 1,

### , then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients (

Multiple regression Introduction Multiple regression is a logical extension of the principles of simple linear regression to situations in which there are several predictor variables. For instance if we

### Instrumental Variables & 2SLS

Instrumental Variables & 2SLS $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \dots + \beta_k z_k + u$, $y_2 = \pi_0 + \pi_1 z_{k+1} + \pi_2 z_1 + \dots + \pi_k z_k + v$ Economics 20 - Prof. Schuetze Why Use Instrumental Variables? Instrumental

### SELF-TEST: SIMPLE REGRESSION

ECO 22000 McRAE SELF-TEST: SIMPLE REGRESSION Note: Those questions indicated with an (N) are unlikely to appear in this form on an in-class examination, but you should be able to describe the procedures

### IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Chapter 5: Linear regression

Chapter 5: Linear regression Last lecture: Ch 4............................................................ 2 Next: Ch 5................................................................. 3 Simple linear

### Chapter 8: Interval Estimates and Hypothesis Testing

Chapter 8: Interval Estimates and Hypothesis Testing Chapter 8 Outline Clint s Assignment: Taking Stock Estimate Reliability: Interval Estimate Question o Normal Distribution versus the Student t-distribution:

### Simple Linear Regression

Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression Statistical model for linear regression Estimating

### 12/31/2016. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Understand linear regression with a single predictor Understand how we assess the fit of a regression model Total Sum of Squares

### REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about exploring linear relationships between a dependent variable and one or more independent variables. Regression

### Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

### Econ 371 Problem Set #3 Answer Sheet

Econ 371 Problem Set #3 Answer Sheet 4.1 In this question, you are told that a OLS regression analysis of third grade test scores as a function of class size yields the following estimated model. T estscore

### 17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

### 2. Linear regression with multiple regressors

2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### Lecture 5 Hypothesis Testing in Multiple Linear Regression

Lecture 5 Hypothesis Testing in Multiple Linear Regression BIOST 515 January 20, 2004 Types of tests 1 Overall test Test for addition of a single variable Test for addition of a group of variables Overall

### Regression III: Advanced Methods

Lecture 5: Linear least-squares Regression III: Advanced Methods William G. Jacoby Department of Political Science Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Simple Linear Regression

### 11. CONFIDENCE INTERVALS FOR THE MEAN; KNOWN VARIANCE

11. CONFIDENCE INTERVALS FOR THE MEAN; KNOWN VARIANCE We assume here that the population variance σ 2 is known. This is an unrealistic assumption, but it allows us to give a simplified presentation which

### UNDERSTANDING MULTIPLE REGRESSION

UNDERSTANDING Multiple regression analysis (MRA) is any of several related statistical methods for evaluating the effects of more than one independent (or predictor) variable on a dependent (or outcome)

### Section I: Multiple Choice Select the best answer for each question.

Chapter 15 (Regression Inference) AP Statistics Practice Test (TPS- 4 p796) Section I: Multiple Choice Select the best answer for each question. 1. Which of the following is not one of the conditions that

### Simple Regression and Correlation

Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas

### Outline. Correlation & Regression, III. Review. Relationship between r and regression

Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation

### Chapter 10 Correlation and Regression. Overview. Section 10-2 Correlation Key Concept. Definition. Definition. Exploring the Data

Chapter 10 Correlation and Regression 10-1 Overview 10-2 Correlation 10- Regression Overview This chapter introduces important methods for making inferences about a correlation (or relationship) between

### Math 62 Statistics Sample Exam Questions

Math 62 Statistics Sample Exam Questions 1. (10) Explain the difference between the distribution of a population and the sampling distribution of a statistic, such as the mean, of a sample randomly selected

### The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

### Inference for Regression

Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about

### Example: Boats and Manatees

Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

### CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION Opening Example Simple Regression Linear Regression Simple Regression Definition A regression model is a mathematical equation that describes the

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

### TIME SERIES REGRESSION

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 TIME SERIES REGRESSION I. AGENDA: A. A couple of general considerations in analyzing time series data B. Intervention analysis

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### 4.7 Confidence and Prediction Intervals

4.7 Confidence and Prediction Intervals Instead of conducting tests we could find confidence intervals for a regression coefficient, or a set of regression coefficient, or for the mean of the response

### B Linear Regressions. Is your data linear? The Method of Least Squares

B Linear Regressions Is your data linear? So you want to fit a straight line to a set of measurements... First, make sure you really want to do this. That is, see if you can convince yourself that a plot

### Wooldridge, Introductory Econometrics, 4th ed. Multiple regression analysis:

Wooldridge, Introductory Econometrics, 4th ed. Chapter 4: Inference Multiple regression analysis: We have discussed the conditions under which OLS estimators are unbiased, and derived the variances of

### Simple Regression Theory I 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY I 1 Simple Regression Theory I 2010 Samuel L. Baker Regression analysis lets you use data to explain and predict. A simple regression line drawn through data points In Assignment

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### EC327: Advanced Econometrics, Spring 2007

EC327: Advanced Econometrics, Spring 2007 Wooldridge, Introductory Econometrics (3rd ed, 2006) Appendix D: Summary of matrix algebra Basic definitions A matrix is a rectangular array of numbers, with m

### Linear Model Selection and Regularization

Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In

### Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

### An Introduction to Bayesian Statistics

An Introduction to Bayesian Statistics Robert Weiss Department of Biostatistics UCLA School of Public Health robweiss@ucla.edu April 2011 Robert Weiss (UCLA) An Introduction to Bayesian Statistics UCLA

### Data and Regression Analysis. Lecturer: Prof. Duane S. Boning. Rev 10

Data and Regression Analysis Lecturer: Prof. Duane S. Boning Rev 10 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance (ANOVA) 2. Multivariate Analysis of Variance Model forms 3.

### Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line $\hat y = 5 - 2x$: a. the relationship between

### Linear combinations of parameters

Linear combinations of parameters Suppose we want to test the hypothesis that two regression coefficients are equal, e.g. β 1 = β 2. This is equivalent to testing the following linear constraint (null

### Part (Semi Partial) and Partial Regression Coefficients

Part (Semi Partial) and Partial Regression Coefficients Hervé Abdi 1 1 overview The semi-partial regression coefficient also called part correlation is used to express the specific portion of variance

### Regression III: Advanced Methods

Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

### 7. Autocorrelation (Violation of Assumption #B3)

7. Autocorrelation (Violation of Assumption #B3) Assumption #B3: The error term $u_i$ is not autocorrelated, i.e. $\operatorname{Cov}(u_i, u_j) = 0$ for all $i = 1, \dots, N$ and $j = 1, \dots, N$ where $i \neq j$. Where do we typically

### Exam and Solution. Please discuss each problem on a separate sheet of paper, not just on a separate page!

Econometrics - Exam 1 Exam and Solution Please discuss each problem on a separate sheet of paper, not just on a separate page! Problem 1: (20 points A health economist plans to evaluate whether screening