2. Linear regression with multiple regressors


[Slide 5] Aim of this section:
- Introduction of the multiple regression model
- OLS estimation in multiple regression
- Measures-of-fit in multiple regression
- Assumptions in the multiple regression model
- Violations of the assumptions (omitted-variable bias, multicollinearity, heteroskedasticity, autocorrelation)

[Slide 6] 2.1. The multiple regression model

Intuition:
- A regression model specifies a functional (parametric) relationship between a dependent (endogenous) variable $Y$ and a set of $k$ independent (exogenous) regressors $X_1, X_2, \ldots, X_k$
- In a first step, we consider the linear multiple regression model

[Slide 7] Definition 2.1: (Multiple linear regression model)
The multiple (linear) regression model is given by

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki} + u_i, \qquad (2.1)$$

$i = 1, \ldots, n$, where
- $Y_i$ is the $i$th observation on the dependent variable,
- $X_{1i}, X_{2i}, \ldots, X_{ki}$ are the $i$th observations on each of the $k$ regressors,
- $u_i$ is the stochastic error term.

The population regression line is the relationship that holds between $Y$ and the $X$s on average:

$$E(Y_i \mid X_{1i} = x_1, X_{2i} = x_2, \ldots, X_{ki} = x_k) = \beta_0 + \beta_1 x_1 + \ldots + \beta_k x_k.$$

[Slide 8] Meaning of the coefficients:
- The intercept $\beta_0$ is the expected value of $Y_i$ (for all $i = 1, \ldots, n$) when all $X$-regressors equal 0
- $\beta_1, \ldots, \beta_k$ are the slope coefficients on the respective regressors $X_1, \ldots, X_k$
- $\beta_1$, for example, is the expected change in $Y_i$ resulting from changing $X_{1i}$ by one unit, holding constant $X_{2i}, \ldots, X_{ki}$ (and analogously $\beta_2, \ldots, \beta_k$)

Definition 2.2: (Homoskedasticity, Heteroskedasticity)
The error term $u_i$ is called homoskedastic if the conditional variance of $u_i$ given $X_{1i}, \ldots, X_{ki}$, $\mathrm{Var}(u_i \mid X_{1i}, \ldots, X_{ki})$, is constant for $i = 1, \ldots, n$ and does not depend on the values of $X_{1i}, \ldots, X_{ki}$. Otherwise, the error term is called heteroskedastic.
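Definition 2.2 can be illustrated with a small simulation. The sketch below is not part of the slides: it draws one regressor, generates one homoskedastic and one heteroskedastic error series (the form "standard deviation equal to x" is an assumption chosen purely for illustration), and compares the sample variance of the errors for large and small values of the regressor.

```python
import random
import statistics

def conditional_variance_ratio(errors, x_values, cutoff):
    """Sample variance of the errors where x >= cutoff, divided by the
    sample variance of the errors where x < cutoff."""
    low = [u for u, x in zip(errors, x_values) if x < cutoff]
    high = [u for u, x in zip(errors, x_values) if x >= cutoff]
    return statistics.variance(high) / statistics.variance(low)

rng = random.Random(0)      # fixed seed so the run is reproducible
n = 20_000
x = [rng.uniform(0.5, 1.5) for _ in range(n)]

# Homoskedastic errors: the spread does not depend on x.
u_hom = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Heteroskedastic errors (assumed form): the standard deviation equals x.
u_het = [xi * rng.gauss(0.0, 1.0) for xi in x]

ratio_hom = conditional_variance_ratio(u_hom, x, cutoff=1.0)
ratio_het = conditional_variance_ratio(u_het, x, cutoff=1.0)
print(f"homoskedastic ratio:   {ratio_hom:.2f}")   # close to 1
print(f"heteroskedastic ratio: {ratio_het:.2f}")   # clearly above 1
```

A ratio near 1 is what a constant conditional variance looks like in a sample; a ratio well above 1 indicates that the error variance changes with the regressor.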

[Slide 9] Example 1: (Student performance)
Regression of student performance ($Y$) in $n = 420$ US districts on distinct school characteristics (factors)
- $Y_i$: average test score in the $i$th district (TEST SCORE)
- $X_{1i}$: average class size in the $i$th district (measured by the student-teacher ratio, STR)
- $X_{2i}$: percentage of English learners in the $i$th district (PCTEL)

Expected signs of the coefficients: $\beta_1 < 0$, $\beta_2 < 0$

[Slide 10] Example 2: (House prices)
Regression of house prices ($Y$) recorded for $n = 546$ houses sold in Windsor (Canada) on distinct housing characteristics
- $Y_i$: sale price (in Canadian dollars) of the $i$th house (SALEPRICE)
- $X_{1i}$: lot size (in square feet) of the $i$th property (LOTSIZE)
- $X_{2i}$: number of bedrooms in the $i$th house (BEDROOMS)
- $X_{3i}$: number of bathrooms in the $i$th house (BATHROOMS)
- $X_{4i}$: number of storeys (excluding the basement) in the $i$th house (STOREYS)

Expected signs of the coefficients: $\beta_1, \beta_2, \beta_3, \beta_4 > 0$

[Slide 11] 2.2. The OLS estimator in multiple regression

Now: Estimation of the coefficients $\beta_0, \beta_1, \ldots, \beta_k$ in the multiple regression model on the basis of $n$ observations by applying the Ordinary Least Squares (OLS) technique

Idea:
- Let $b_0, b_1, \ldots, b_k$ be estimators of $\beta_0, \beta_1, \ldots, \beta_k$
- We can predict $Y_i$ by $b_0 + b_1 X_{1i} + \ldots + b_k X_{ki}$
- The prediction error is $Y_i - b_0 - b_1 X_{1i} - \ldots - b_k X_{ki}$

[Slide 12] Idea: [continued]
- The sum of the squared prediction errors over all $n$ observations is

$$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_{1i} - \ldots - b_k X_{ki})^2 \qquad (2.2)$$

Definition 2.3: (OLS estimators, predicted values, residuals)
The OLS estimators $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are the values of $b_0, b_1, \ldots, b_k$ that minimize the sum of squared prediction errors (2.2). The OLS predicted values $\hat{Y}_i$ and residuals $\hat{u}_i$ (for $i = 1, \ldots, n$) are

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \ldots + \hat{\beta}_k X_{ki} \qquad (2.3)$$

and

$$\hat{u}_i = Y_i - \hat{Y}_i. \qquad (2.4)$$
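Definition 2.3 can be made concrete with a few lines of code. The following is a minimal sketch, not the EViews procedure used in the lecture: it minimizes (2.2) by solving the normal equations $(X'X)b = X'Y$, a standard closed-form result that the slides do not spell out, using only the Python standard library. The function names `solve_linear_system` and `ols` and the data are made up for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfect multicollinearity?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for r in reversed(range(m)):                       # back substitution
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - tail) / aug[r][r]
    return x

def ols(y, regressors):
    """Return (beta_hat, y_hat, residuals). `regressors` is a list of rows
    [X_1i, ..., X_ki]; an intercept column is added automatically."""
    rows = [[1.0] + list(r) for r in regressors]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    beta_hat = solve_linear_system(xtx, xty)            # minimizes (2.2)
    y_hat = [sum(b * v for b, v in zip(beta_hat, r)) for r in rows]  # (2.3)
    residuals = [yi - fi for yi, fi in zip(y, y_hat)]                # (2.4)
    return beta_hat, y_hat, residuals

# Made-up data generated exactly by Y = 1 + 2*X1 - 3*X2 (no error term),
# so OLS should recover the coefficients up to rounding error.
X = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 5)]
Y = [1 + 2 * x1 - 3 * x2 for x1, x2 in X]
beta_hat, y_hat, residuals = ols(Y, X)
print([round(b, 6) for b in beta_hat])
```

Because the example data contain no error term, the residuals are zero up to floating-point precision; with real data they would not be.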

[Slide 13] Remarks:
- The OLS estimators $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ and the residuals $\hat{u}_i$ are computed from a sample of $n$ observations of $(X_{1i}, \ldots, X_{ki}, Y_i)$ for $i = 1, \ldots, n$
- They are estimators of the unknown true population coefficients $\beta_0, \beta_1, \ldots, \beta_k$ and error terms $u_i$
- There are closed-form formulas for calculating the OLS estimates from the data (see the lectures Econometrics I+II)
- In this lecture, we use the software package EViews

[Slide 14] Regression estimation results (EViews) for the student-performance dataset

Dependent Variable: TEST_SCORE
Method: Least Squares
Date: 07/02/12 Time: 16:29
Sample:
Included observations: 420

Columns: Variable, Coefficient, Std. Error, t-statistic, Prob.
Rows: C, STR, PCTEL

Summary statistics reported: R-squared, Adjusted R-squared, S.E. of regression, Sum squared resid, Log likelihood, F-statistic, Prob(F-statistic), Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, Hannan-Quinn criter., Durbin-Watson stat

[The numeric entries of this output are not preserved in the transcription.]

[Slide 15] Figure: Predicted values $\hat{Y}_i$ and residuals $\hat{u}_i$ for the student-performance dataset (plotted series: Residual, Actual, Fitted)

[Slide 16] Regression estimation results (EViews) for the house-prices dataset

Dependent Variable: SALEPRICE
Method: Least Squares
Date: 07/02/12 Time: 16:50
Sample:
Included observations: 546

Columns: Variable, Coefficient, Std. Error, t-statistic, Prob.
Rows: C, LOTSIZE, BEDROOMS, BATHROOMS, STOREYS

Sum squared resid: 1.80E+11

Other summary statistics reported: R-squared, Adjusted R-squared, S.E. of regression, Log likelihood, F-statistic, Prob(F-statistic), Mean dependent var, S.D. dependent var, Akaike info criterion, Schwarz criterion, Hannan-Quinn criter., Durbin-Watson stat

[The remaining numeric entries of this output are not preserved in the transcription.]

[Slide 17] Figure: Predicted values $\hat{Y}_i$ and residuals $\hat{u}_i$ for the house-prices dataset (plotted series: Residual, Actual, Fitted)

[Slide 18] OLS assumptions in the multiple regression model (2.1):
1. $u_i$ has conditional mean zero given $X_{1i}, X_{2i}, \ldots, X_{ki}$: $E(u_i \mid X_{1i}, X_{2i}, \ldots, X_{ki}) = 0$
2. $(X_{1i}, X_{2i}, \ldots, X_{ki}, Y_i)$, $i = 1, \ldots, n$, are independently and identically distributed (i.i.d.) draws from their joint distribution
3. Large outliers are unlikely: $X_{1i}, X_{2i}, \ldots, X_{ki}$ and $Y_i$ have nonzero finite fourth moments
4. There is no perfect multicollinearity

Remarks:
- Note that we do not assume any specific parametric distribution for the $u_i$
- The OLS assumptions imply specific distribution results

[Slide 19] Theorem 2.4: (Unbiasedness, consistency, normality)
Given the OLS assumptions, the following properties of the OLS estimators $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ hold:
1. $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are unbiased estimators of $\beta_0, \ldots, \beta_k$.
2. $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are consistent estimators of $\beta_0, \ldots, \beta_k$. (Convergence in probability)
3. In large samples $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are jointly normally distributed and each single OLS estimator $\hat{\beta}_j$, $j = 0, \ldots, k$, is normally distributed with mean $\beta_j$ and variance $\sigma^2_{\hat{\beta}_j}$, that is, $\hat{\beta}_j \sim N(\beta_j, \sigma^2_{\hat{\beta}_j})$.

[Slide 20] Remarks:
- In general, the OLS estimators are correlated
- This correlation among $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ arises from the correlation among the regressors $X_1, \ldots, X_k$
- The sampling distribution of the OLS estimators will become relevant in Section 3 (hypothesis-testing, confidence intervals)

[Slide 21] 2.3. Measures-of-fit in multiple regression

Now: Three well-known summary statistics that measure how well the OLS estimates fit the data

Standard error of regression (SER):
- The SER estimates the standard deviation of the error term $u_i$ (under the assumption of homoskedasticity):

$$SER = \sqrt{\frac{1}{n - k - 1} \sum_{i=1}^{n} \hat{u}_i^2}$$

[Slide 22] Standard error of regression: [continued]
- We denote the sum of squared residuals by $SSR \equiv \sum_{i=1}^{n} \hat{u}_i^2$ so that

$$SER = \sqrt{\frac{SSR}{n - k - 1}}$$

- Given the OLS assumptions and homoskedasticity, the squared SER, $(SER)^2$, is an unbiased estimator of the unknown constant variance of the $u_i$
- SER is a measure of the spread of the distribution of $Y_i$ around the population regression line
- Both measures, SER and SSR, are reported in the EViews regression output
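As a quick numerical check of the SER and SSR formulas, the sketch below computes both from a list of residuals. The residual values and the choice $k = 2$ are made up for illustration and do not come from either dataset in the lecture.

```python
import math

def sum_of_squared_residuals(residuals):
    """SSR: the sum of the squared OLS residuals."""
    return sum(u * u for u in residuals)

def standard_error_of_regression(residuals, k):
    """SER = sqrt(SSR / (n - k - 1)), where k is the number of regressors
    (not counting the intercept)."""
    n = len(residuals)
    if n - k - 1 <= 0:
        raise ValueError("need more observations than coefficients (n > k + 1)")
    return math.sqrt(sum_of_squared_residuals(residuals) / (n - k - 1))

# Hypothetical residuals from a regression with k = 2 regressors.
residuals = [1.0, -2.0, 0.5, 0.5, 1.5, -1.5]
ssr = sum_of_squared_residuals(residuals)              # 10.0
ser = standard_error_of_regression(residuals, k=2)     # sqrt(10 / 3)
print(f"SSR = {ssr}, SER = {ser:.4f}")
```

The divisor $n - k - 1$ (here $6 - 2 - 1 = 3$) rather than $n$ is the degrees-of-freedom correction for the $k + 1$ estimated coefficients.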

[Slide 23] $R^2$:
- The $R^2$ is the fraction of the sample variance of the $Y_i$ explained by the regressors
- Equivalently, the $R^2$ is 1 minus the fraction of the variance of the $Y_i$ not explained by the regressors (i.e. explained by the residuals)
- Denoting the explained sum of squares (ESS) and the total sum of squares (TSS) by

$$ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \quad \text{and} \quad TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2,$$

respectively, we define the $R^2$ as

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$$

[Slide 24] $R^2$: [continued]
- In multiple regression, the $R^2$ increases whenever an additional regressor $X_{k+1}$ is added to the regression model, unless the estimated coefficient $\hat{\beta}_{k+1}$ is exactly equal to zero
- Since in practice it is extremely unusual to have exactly $\hat{\beta}_{k+1} = 0$, the $R^2$ generally increases (and never decreases) when a new regressor is added to the regression model
- An increase in the $R^2$ due to the inclusion of a new regressor does not necessarily indicate an actually improved fit of the model

[Slide 25] Adjusted $R^2$:
- The adjusted $R^2$ (in symbols: $\bar{R}^2$) deflates the conventional $R^2$:

$$\bar{R}^2 = 1 - \frac{n - 1}{n - k - 1} \cdot \frac{SSR}{TSS}$$

- It is always true that $\bar{R}^2 < R^2$ (why?)
- When adding a new regressor $X_{k+1}$ to the model, the $\bar{R}^2$ can increase or decrease (why?)
- The $\bar{R}^2$ can be negative (why?)
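The three "why?" questions can be explored numerically. The sketch below uses $SSR/TSS = 1 - R^2$ to write the adjusted $R^2$ as a function of $R^2$, $n$ and $k$; all numbers are hypothetical and chosen only to show the possible cases.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSR/TSS for observed values y and OLS fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ssr / tss

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (n - 1)/(n - k - 1) * SSR/TSS, with SSR/TSS = 1 - R^2."""
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)

# Tiny made-up example: SSR = 1, TSS = 5, so R^2 = 0.8.
print(r_squared([1, 2, 3, 4], [1.5, 1.5, 3.5, 3.5]))

n = 30
base = adjusted_r_squared(0.400, n, k=2)       # about 0.356, below R^2 = 0.400
extended = adjusted_r_squared(0.405, n, k=3)   # R^2 rose, adjusted R^2 fell
weak_fit = adjusted_r_squared(0.020, n, k=2)   # negative adjusted R^2
print(round(base, 4), round(extended, 4), round(weak_fit, 4))
```

The factor $(n-1)/(n-k-1)$ exceeds 1 whenever $k \geq 1$, which is why the adjusted measure lies below $R^2$; a new regressor lowers $SSR$ but raises this factor, so the net effect can go either way.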

[Slide 26] 2.4. Omitted-variable bias

Now: Discussion of a phenomenon that implies violation of the first OLS assumption on Slide 18
- This issue is known as omitted-variable bias and is extremely relevant in practice
- Although theoretically easy to grasp, avoiding this specification problem turns out to be a nontrivial task in many empirical applications

[Slide 27] Definition 2.5: (Omitted-variable bias)
Consider the multiple regression model in Definition 2.1 on Slide 7. Omitted-variable bias is the bias in the OLS estimator $\hat{\beta}_j$ of the coefficient $\beta_j$ (for $j = 1, \ldots, k$) that arises when the associated regressor $X_j$ is correlated with an omitted variable. More precisely, for omitted-variable bias to occur, the following two conditions must hold:
1. $X_j$ is correlated with the omitted variable.
2. The omitted variable is a determinant of the dependent variable $Y$.

[Slide 28] Example:
- Consider the house-prices dataset (Slides 16, 17)
- Using the entire set of regressors, we obtain the OLS estimate $\hat{\beta}_2$ for the BEDROOMS-coefficient (the value is not preserved in the transcription)
- The correlation coefficients between the regressors are reported in a 4 x 4 table over BEDROOMS, BATHROOMS, LOTSIZE and STOREYS (the entries are not preserved in the transcription)

[Slide 29] Example: [continued]
- There is positive (significant) correlation between the variable BEDROOMS and all other regressors
- Excluding the other variables from the regression yields the following OLS-estimates:

Dependent Variable: SALEPRICE
Method: Least Squares
Date: 14/02/12 Time: 16:10
Sample:
Included observations: 546

Columns: Variable, Coefficient, Std. Error, t-statistic, Prob.
Rows: C, BEDROOMS

Sum squared resid: 3.36E+11

[The remaining numeric entries of this output are not preserved in the transcription.]

- The alternative OLS-estimates of the BEDROOMS-coefficient differ substantially

[Slide 30] Intuitive explanation of the omitted-variable bias:
- Consider the variable LOTSIZE as omitted
- LOTSIZE is an important variable for explaining SALEPRICE
- If we omit LOTSIZE in the regression, it will try to enter in the only way it can, namely through its positive correlation with the included variable BEDROOMS
- The coefficient on BEDROOMS will confound the effects of BEDROOMS and LOTSIZE on SALEPRICE

[Slide 31] More formal explanation:
- Omitted-variable bias means that the first OLS assumption on Slide 18 is violated
- Reasoning: In the multiple regression model the error term $u_i$ represents all factors other than the included regressors $X_1, \ldots, X_k$ that are determinants of $Y_i$
- If an omitted variable is correlated with at least one of the included regressors $X_1, \ldots, X_k$, then $u_i$ (which contains this factor) is correlated with the set of regressors
- This implies that $E(u_i \mid X_{1i}, \ldots, X_{ki}) \neq 0$

[Slide 32] Important result:
In the case of omitted-variable bias
- the OLS estimators on the corresponding included regressors are biased in finite samples
- this bias does not vanish in large samples, so the OLS estimators are inconsistent

Solutions to omitted-variable bias:
- To be discussed in Section 5
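The result can be illustrated with simulated data. The sketch below is an illustration under stated assumptions, not the lecture's house-price example: the coefficients, the way `x2` is tied to `x1`, and the sample size are all hypothetical. It regresses $Y$ on $X_1$ alone although $X_2$ also determines $Y$, and compares the estimated slope with $\beta_1 + \beta_2 \hat{\delta}$, where $\hat{\delta}$ is the slope from regressing the omitted $X_2$ on $X_1$ (a standard textbook decomposition that the slides do not state).

```python
import random

def slope_and_intercept(x, y):
    """OLS for a regression of y on a constant and one regressor x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

rng = random.Random(1)                  # fixed seed for reproducibility
n = 5000
beta0, beta1, beta2 = 1.0, 2.0, 3.0     # hypothetical true coefficients

x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumed link between the regressors: x2 is positively correlated with x1.
x2 = [0.5 * a + rng.gauss(0.0, 1.0) for a in x1]
u = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [beta0 + beta1 * a + beta2 * b + e for a, b, e in zip(x1, x2, u)]

short_slope, _ = slope_and_intercept(x1, y)    # x2 omitted from the regression
delta_hat, _ = slope_and_intercept(x1, x2)     # how x2 moves with x1

print(f"true beta1:                {beta1:.3f}")
print(f"slope with x2 omitted:     {short_slope:.3f}")
print(f"beta1 + beta2 * delta_hat: {beta1 + beta2 * delta_hat:.3f}")
```

Increasing `n` does not move the short-regression slope toward `beta1`; it only tightens it around the biased value, which is the inconsistency stated on the slide.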

[Slide 33] 2.5. Multicollinearity

Definition 2.6: (Perfect multicollinearity)
Consider the multiple regression model in Definition 2.1 on Slide 7. The regressors $X_1, \ldots, X_k$ are said to be perfectly multicollinear if one of the regressors is a perfect linear function of the other regressors.

Remarks:
- Under perfect multicollinearity the OLS estimates cannot be calculated due to division by zero in the OLS formulas
- Perfect multicollinearity often reflects a logical mistake in choosing the regressors or some unrecognized feature in the data set

[Slide 34] Example: (Dummy variable trap)
- Consider the student-performance dataset
- Suppose we partition the school districts into the 3 categories (1) rural, (2) suburban, (3) urban
- We represent the categories by the dummy regressors

$$RURAL_i = \begin{cases} 1 & \text{if district } i \text{ is rural} \\ 0 & \text{otherwise} \end{cases}$$

and by $SUBURBAN_i$ and $URBAN_i$ analogously defined
- Since each district belongs to one and only one category, we have for each district $i$:

$$RURAL_i + SUBURBAN_i + URBAN_i = 1$$

[Slide 35] Example: [continued]
- Now, let us define the constant regressor $X_0$ associated with the intercept coefficient $\beta_0$ in the multiple regression model on Slide 7 by $X_{0i} \equiv 1$ for $i = 1, \ldots, n$
- Then, for $i = 1, \ldots, n$, the following relationship holds among the regressors (perfect multicollinearity):

$$X_{0i} = RURAL_i + SUBURBAN_i + URBAN_i$$

- To estimate the regression we must exclude either one of the dummy regressors or the constant regressor $X_0$ (the intercept $\beta_0$) from the regression

[Slide 36] Theorem 2.7: (Dummy variable trap)
Let there be $G$ different categories in the data set represented by $G$ dummy regressors. If
1. each observation $i$ falls into one and only one category,
2. there is an intercept (constant regressor) in the regression,
3. all $G$ dummy regressors are included as regressors,
then regression estimation fails because of perfect multicollinearity.

Usual remedy:
- Exclude one of the dummy regressors ($G - 1$ dummy regressors are sufficient)
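The dummy variable trap shows up as a rank deficiency of the regressor matrix. The sketch below builds the constant plus three category dummies for a handful of made-up districts and computes the matrix rank with exact arithmetic; `matrix_rank` is a small helper written for this illustration, not a library function.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination on exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def dummy(category, level):
    return 1 if category == level else 0

# Hypothetical category of six districts.
categories = ["rural", "suburban", "urban", "rural", "urban", "suburban"]

# Columns: constant X_0, RURAL, SUBURBAN, URBAN (all G = 3 dummies included).
full = [[1, dummy(c, "rural"), dummy(c, "suburban"), dummy(c, "urban")]
        for c in categories]
# Usual remedy: drop one dummy (here URBAN), keeping G - 1 = 2 dummies.
reduced = [row[:3] for row in full]

print(matrix_rank(full), "of", len(full[0]), "columns")        # 3 of 4
print(matrix_rank(reduced), "of", len(reduced[0]), "columns")  # 3 of 3
```

With all three dummies and the constant, the four columns have rank 3, so the OLS coefficients are not uniquely determined; after dropping one dummy the matrix has full column rank.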

[Slide 37] Definition 2.8: (Imperfect multicollinearity)
Consider the multiple regression model in Definition 2.1 on Slide 7. The regressors $X_1, \ldots, X_k$ are said to be imperfectly multicollinear if two or more of the regressors are highly correlated in the sense that there is a linear function of the regressors that is highly correlated with another regressor.

Remarks:
- Imperfect multicollinearity does not pose any (numeric) problems in calculating OLS estimates
- However, if regressors are imperfectly multicollinear, then the coefficients on at least one individual regressor will be imprecisely estimated

[Slide 38] Remarks: [continued]
- Techniques for identifying and mitigating imperfect multicollinearity are presented in econometric textbooks (e.g. Hill et al., 2010)
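To give a sense of how much precision is lost, consider the special case of two regressors and homoskedastic errors. A standard textbook result, which is not derived in these slides and is used here as an assumption, says that the large-sample variance of $\hat{\beta}_1$ is proportional to $1/(1 - \rho^2)$, where $\rho$ is the correlation between $X_1$ and $X_2$. The sketch below tabulates this factor for a few hypothetical correlations.

```python
def variance_inflation(rho):
    """Factor 1 / (1 - rho^2) by which the variance of the OLS estimator of
    beta_1 grows, relative to uncorrelated regressors, in the two-regressor
    homoskedastic case (assumed setting for this illustration)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 1.0 / (1.0 - rho ** 2)

for rho in (0.0, 0.5, 0.9, 0.99):
    print(f"rho = {rho:4.2f} -> variance multiplied by {variance_inflation(rho):6.2f}")
```

As $\rho$ approaches 1 the factor grows without bound, which connects imperfect multicollinearity to the limiting case of perfect multicollinearity, where the estimates cannot be computed at all.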


More information

OLS in Matrix Form. Let y be an n 1 vector of observations on the dependent variable.

OLS in Matrix Form. Let y be an n 1 vector of observations on the dependent variable. OLS in Matrix Form 1 The True Model Let X be an n k matrix where we have observations on k independent variables for n observations Since our model will usually contain a constant term, one of the columns

More information

Lecture 2: Simple Linear Regression

Lecture 2: Simple Linear Regression DMBA: Statistics Lecture 2: Simple Linear Regression Least Squares, SLR properties, Inference, and Forecasting Carlos Carvalho The University of Texas McCombs School of Business mccombs.utexas.edu/faculty/carlos.carvalho/teaching

More information

Regression Analysis (Spring, 2000)

Regression Analysis (Spring, 2000) Regression Analysis (Spring, 2000) By Wonjae Purposes: a. Explaining the relationship between Y and X variables with a model (Explain a variable Y in terms of Xs) b. Estimating and testing the intensity

More information

The Effect of Seasonality in the CPI on Indexed Bond Pricing and Inflation Expectations

The Effect of Seasonality in the CPI on Indexed Bond Pricing and Inflation Expectations The Effect of Seasonality in the CPI on Indexed Bond Pricing and Inflation Expectations Roy Stein* *Research Department, Roy Stein roy.stein@boi.org.il, tel: 02-6552559 This research was partially supported

More information

Econometric Principles and Data Analysis

Econometric Principles and Data Analysis Econometric Principles and Data Analysis product: 4339 course code: c230 c330 Econometric Principles and Data Analysis Centre for Financial and Management Studies SOAS, University of London 1999, revised

More information

Competition as an Effective Tool in Developing Social Marketing Programs: Driving Behavior Change through Online Activities

Competition as an Effective Tool in Developing Social Marketing Programs: Driving Behavior Change through Online Activities Competition as an Effective Tool in Developing Social Marketing Programs: Driving Behavior Change through Online Activities Corina ŞERBAN 1 ABSTRACT Nowadays, social marketing practices represent an important

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

SELF-TEST: SIMPLE REGRESSION

SELF-TEST: SIMPLE REGRESSION ECO 22000 McRAE SELF-TEST: SIMPLE REGRESSION Note: Those questions indicated with an (N) are unlikely to appear in this form on an in-class examination, but you should be able to describe the procedures

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

MACRO ECONOMIC PATTERNS AND STORIES. Is Your Job Cyclical?

MACRO ECONOMIC PATTERNS AND STORIES. Is Your Job Cyclical? Is Your Job at Risk? Page 1 of 8 Is Your Job Cyclical? Accessing the website of the Bureau of Labor Statistics Finding out about the ups and downs of your job Total Nonfarm Employment is illustrated in

More information

The Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information

The Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information Chapter 8 The Multiple Regression Model: Hypothesis Tests and the Use of Nonsample Information An important new development that we encounter in this chapter is using the F- distribution to simultaneously

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

More information

CHAPTER 5. Exercise Solutions

CHAPTER 5. Exercise Solutions CHAPTER 5 Exercise Solutions 91 Chapter 5, Exercise Solutions, Principles of Econometrics, e 9 EXERCISE 5.1 (a) y = 1, x =, x = x * * i x i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 y * i (b) (c) yx = 1, x = 16, yx

More information

Chapter 3: The Multiple Linear Regression Model

Chapter 3: The Multiple Linear Regression Model Chapter 3: The Multiple Linear Regression Model Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans November 23, 2013 Christophe Hurlin (University of Orléans) Advanced Econometrics

More information

THE IMPORTANCE OF GOODS PRODUCTION AND INTERMEDIATE CONSUMPTION FOR AN INCREASED GDP

THE IMPORTANCE OF GOODS PRODUCTION AND INTERMEDIATE CONSUMPTION FOR AN INCREASED GDP THE IMPORTANCE OF GOODS PRODUCTION AND INTERMEDIATE CONSUMPTION FOR AN INCREASED GDP RADU-MARCEL JOIA * Abstract Human existence is conditioned, of course, by the consumption of goods to meet the needs.

More information

, then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients (

, then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients ( Multiple regression Introduction Multiple regression is a logical extension of the principles of simple linear regression to situations in which there are several predictor variables. For instance if we

More information

EC327: Advanced Econometrics, Spring 2007

EC327: Advanced Econometrics, Spring 2007 EC327: Advanced Econometrics, Spring 2007 Wooldridge, Introductory Econometrics (3rd ed, 2006) Appendix D: Summary of matrix algebra Basic definitions A matrix is a rectangular array of numbers, with m

More information

Regression Analysis. Data Calculations Output

Regression Analysis. Data Calculations Output Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

More information

Simultaneous Equation Models As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the

Simultaneous Equation Models As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the Simultaneous Equation Models As discussed last week, one important form of endogeneity is simultaneity. This arises when one or more of the explanatory variables is jointly determined with the dependent

More information

Inference in Regression Analysis. Dr. Frank Wood

Inference in Regression Analysis. Dr. Frank Wood Inference in Regression Analysis Dr. Frank Wood Inference in the Normal Error Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters

More information

Regression Analysis Using ArcMap. By Jennie Murack

Regression Analysis Using ArcMap. By Jennie Murack Regression Analysis Using ArcMap By Jennie Murack Regression Basics How is Regression Different from other Spatial Statistical Analyses? With other tools you ask WHERE something is happening? Are there

More information

Forecasting Thai Gold Prices

Forecasting Thai Gold Prices 1 Forecasting Thai Gold Prices Pravit Khaemasunun This paper addresses forecasting Thai gold price. Two forecasting models, namely, Multiple-Regression, and Auto-Regressive Integrated Moving Average (ARIMA),

More information

Lecture 15. Endogeneity & Instrumental Variable Estimation

Lecture 15. Endogeneity & Instrumental Variable Estimation Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

More information

Chapter 11: Two Variable Regression Analysis

Chapter 11: Two Variable Regression Analysis Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

More information

Determinants of Stock Market Performance in Pakistan

Determinants of Stock Market Performance in Pakistan Determinants of Stock Market Performance in Pakistan Mehwish Zafar Sr. Lecturer Bahria University, Karachi campus Abstract Stock market performance, economic and political condition of a country is interrelated

More information

RELATIONSHIP BETWEEN STOCK MARKET VOLATILITY AND EXCHANGE RATE: A STUDY OF KSE

RELATIONSHIP BETWEEN STOCK MARKET VOLATILITY AND EXCHANGE RATE: A STUDY OF KSE RELATIONSHIP BETWEEN STOCK MARKET VOLATILITY AND EXCHANGE RATE: A STUDY OF KSE Waseem ASLAM Department of Finance and Economics, Foundation University Rawalpindi, Pakistan seem_aslam@yahoo.com Abstract:

More information

REGRESSION LINES IN STATA

REGRESSION LINES IN STATA REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

More information

Linear combinations of parameters

Linear combinations of parameters Linear combinations of parameters Suppose we want to test the hypothesis that two regression coefficients are equal, e.g. β 1 = β 2. This is equivalent to testing the following linear constraint (null

More information

Wooldridge, Introductory Econometrics, 4th ed. Multiple regression analysis:

Wooldridge, Introductory Econometrics, 4th ed. Multiple regression analysis: Wooldridge, Introductory Econometrics, 4th ed. Chapter 4: Inference Multiple regression analysis: We have discussed the conditions under which OLS estimators are unbiased, and derived the variances of

More information

Weighted least squares

Weighted least squares Weighted least squares Patrick Breheny February 7 Patrick Breheny BST 760: Advanced Regression 1/17 Introduction Known weights As a precursor to fitting generalized linear models, let s first deal with

More information

Regression step-by-step using Microsoft Excel

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

More information

UNDERSTANDING MULTIPLE REGRESSION

UNDERSTANDING MULTIPLE REGRESSION UNDERSTANDING Multiple regression analysis (MRA) is any of several related statistical methods for evaluating the effects of more than one independent (or predictor) variable on a dependent (or outcome)

More information

Regression with a Binary Dependent Variable

Regression with a Binary Dependent Variable Regression with a Binary Dependent Variable Chapter 9 Michael Ash CPPA Lecture 22 Course Notes Endgame Take-home final Distributed Friday 19 May Due Tuesday 23 May (Paper or emailed PDF ok; no Word, Excel,

More information

August 2012 EXAMINATIONS Solution Part I

August 2012 EXAMINATIONS Solution Part I August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

More information

Source engine marketing: A preliminary empirical analysis of web search data

Source engine marketing: A preliminary empirical analysis of web search data Source engine marketing: A preliminary empirical analysis of web search data ABSTRACT Bruce Q. Budd Alfaisal University The purpose of this paper is to empirically investigate a website performance and

More information

Quick Stata Guide by Liz Foster

Quick Stata Guide by Liz Foster by Liz Foster Table of Contents Part 1: 1 describe 1 generate 1 regress 3 scatter 4 sort 5 summarize 5 table 6 tabulate 8 test 10 ttest 11 Part 2: Prefixes and Notes 14 by var: 14 capture 14 use of the

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

Calculate the holding period return for this investment. It is approximately

Calculate the holding period return for this investment. It is approximately 1. An investor purchases 100 shares of XYZ at the beginning of the year for $35. The stock pays a cash dividend of $3 per share. The price of the stock at the time of the dividend is $30. The dividend

More information

ELEC-E8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems

ELEC-E8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information