Limitations of regression analysis

Limitations of regression analysis Ragnar Nymoen Department of Economics, UiO 8 February 2009

Overview

What are the limitations to regression? Two cases are covered: simultaneous equations bias, and measurement errors in explanatory variables. In both cases the explanatory variable is not exogenous in the econometric sense.

Main references: G Ch 15.1 and 15.2; B Ch 8.1, 10.1 and 10.2; K Ch 9.3 and 10.2.

What are the limitations to regression analysis?

It is not linearity in variables, as we have seen. It is not linearity in parameters either, although we have only covered the linear regression model here. Remember that by first estimating the linear model we can use the results to estimate parameters that are non-linear functions of the estimated model's parameters (the delta method, or its equivalent in the Bårdsen method). If the model is non-linear in the parameters from the outset, we can use Non-Linear Least Squares to fit the best non-linear curve to the data (Greene Ch 11, not in the syllabus of this course).

It is not confined to a single equation, as we have seen with the SURE estimator.

The real limitation of the regression model arises when the regression function does not contain the parameter of interest.

A simple Keynes model

Let $Y_t$ denote GDP in period $t = 1, 2, \ldots, T$. $C_t$ is endogenous expenditure and $X_t$ denotes exogenous expenditure. Assume that $C_t$ depends on GDP; then our example model is

$$Y_t = C_t + X_t \qquad (1)$$

$$C_t = b_1 + b_2 Y_t + \varepsilon_t, \quad 0 < b_2 < 1 \qquad (2)$$

$\varepsilon_t$ is a random disturbance term. We assume that it is white noise, uncorrelated with $X_t$. For simplicity we assume normality, $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$. The parameter of interest is the marginal propensity to consume, $b_2$.

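The reduced form of the model

(1) and (2) define a simultaneous equations model. Solution for the two endogenous variables:

$$Y_t = \pi_{11} + \pi_{12} X_t + \varepsilon_{1t} \qquad (3)$$

$$C_t = \pi_{21} + \pi_{22} X_t + \varepsilon_{2t} \qquad (4)$$

where

$$\pi_{11} = \frac{b_1}{1 - b_2}, \quad \pi_{12} = \frac{1}{1 - b_2}, \quad \varepsilon_{1t} = \frac{1}{1 - b_2}\,\varepsilon_t,$$

$$\pi_{21} = \frac{b_1}{1 - b_2}, \quad \pi_{22} = \frac{b_2}{1 - b_2}, \quad \varepsilon_{2t} = \frac{1}{1 - b_2}\,\varepsilon_t.$$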
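The reduced-form solution can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values ($b_1 = 1$, $b_2 = 0.6$) and a hypothetical helper name `reduced_form`, none of which come from the slides. It builds $Y_t$ and $C_t$ from the reduced-form expressions and confirms that they satisfy both structural equations (1) and (2).

```python
import numpy as np

def reduced_form(b1, b2, x, eps):
    """Return (Y, C) solving Y = C + X and C = b1 + b2*Y + eps."""
    mult = 1.0 / (1.0 - b2)          # the common factor 1 / (1 - b2)
    y = mult * b1 + mult * x + mult * eps
    c = mult * b1 + mult * b2 * x + mult * eps
    return y, c

# Illustrative values only (assumptions, not taken from the lecture).
rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=5)
eps = rng.normal(0.0, 1.0, size=5)
y, c = reduced_form(b1=1.0, b2=0.6, x=x, eps=eps)

identity_ok = np.allclose(y, c + x)                   # equation (1)
consumption_ok = np.allclose(c, 1.0 + 0.6 * y + eps)  # equation (2)
print(identity_ok, consumption_ok)
```

Both checks hold for any $0 < b_2 < 1$, since the reduced form is just the algebraic solution of the two structural equations.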

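The distribution of Y and C

The Reduced Form written more compactly:

$$Y_t = \mu_{yt} + \varepsilon_{1t} \qquad (5)$$

$$C_t = \mu_{ct} + \varepsilon_{2t} \qquad (6)$$

where

$$\begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \sim N_2\!\left( 0, \begin{pmatrix} \sigma_y^2 & \sigma_{cy} \\ \sigma_{cy} & \sigma_c^2 \end{pmatrix} \,\Big|\, X_t \right). \qquad (7)$$

That is, conditional on $X_t$, the stochastic variables $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are binormal with zero expectations and variance matrix

$$\begin{pmatrix} \sigma_y^2 & \sigma_{cy} \\ \sigma_{cy} & \sigma_c^2 \end{pmatrix}.$$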

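Conditional distribution of C

It follows that $Y_t$ and $C_t$ are normally distributed with the same covariance matrix as $(\varepsilon_{1t}, \varepsilon_{2t})'$ and expectations

$$\mu_{yt} = \pi_{11} + \pi_{12} X_t, \qquad \mu_{ct} = \pi_{21} + \pi_{22} X_t.$$

It also follows (Lect 1) that the conditional distribution of $C_t$ is normal with conditional expectation

$$E[C_t \mid Y_t] = \mu_{ct} - \beta_{cy}\,\mu_{yt} + \beta_{cy}\,Y_t \qquad (8)$$

$$= \pi_{21} + \pi_{22} X_t - \beta_{cy}(\pi_{11} + \pi_{12} X_t) + \beta_{cy}\,Y_t$$

$$= (\pi_{21} - \beta_{cy}\,\pi_{11}) + (\pi_{22} - \beta_{cy}\,\pi_{12})\,X_t + \beta_{cy}\,Y_t,$$

where $\beta_{cy} = \sigma_{cy} / \sigma_y^2$ is the regression coefficient of $C_t$ on $Y_t$, with $X_t$ held fixed as in (7).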

We see that

The macro model implies (8) as the conditional expectation for $C_t$. It is the valid regression model of $C_t$ on $Y_t$ and can be estimated with full efficiency by OLS. It will not deliver an estimate of the marginal propensity to consume, $b_2$!

In sum: the regression function implied by (1) and (2) is (8), not the regression of $C_t$ on $Y_t$ and a constant. And the regression function (8) is not helpful for estimating the parameter of interest $b_2$. In fact, both reduced-form disturbances equal $\varepsilon_t / (1 - b_2)$, so $\sigma_{cy} = \sigma_y^2$ and $\beta_{cy} = 1$: in this special case (8) estimates the identity (1).
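The point that regression (8) recovers the identity rather than $b_2$ can be illustrated with simulated data. This is a sketch under assumed parameter values ($b_1 = 1$, $b_2 = 0.6$, $T = 500$) that do not appear in the slides. It regresses $C_t$ on a constant, $X_t$ and $Y_t$ by least squares.

```python
import numpy as np

# Assumed illustrative values; only the model structure is from the text.
rng = np.random.default_rng(1)
T = 500
b1, b2 = 1.0, 0.6

x = rng.normal(10.0, 2.0, size=T)      # exogenous expenditure
eps = rng.normal(0.0, 1.0, size=T)     # consumption shock
y = (b1 + x + eps) / (1.0 - b2)        # reduced form for Y
c = b1 + b2 * y + eps                  # consumption function (2)

# Regression (8): C on a constant, X and Y.
Z = np.column_stack([np.ones(T), x, y])
coef, *_ = np.linalg.lstsq(Z, c, rcond=None)
print(coef)  # intercept, coefficient on X, coefficient on Y
```

Up to rounding error, the coefficients are 0, -1 and 1: the regression reproduces $C_t = Y_t - X_t$ exactly, and $b_2 = 0.6$ does not appear anywhere in the output.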

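Simultaneity bias in the macro model example

Suppose we estimate the consumption function by OLS regardless. We will estimate some parameter. What is it?

$$\hat b_2 = \frac{\sum C_t (Y_t - \bar Y)}{\sum (Y_t - \bar Y)^2} = \frac{\sum C_t (Y_t - \bar Y)}{\sum Y_t (Y_t - \bar Y)},$$

where $\bar Y = \frac{1}{T} \sum Y_t$. Substituting (2) for $C_t$:

$$\hat b_2 = \frac{1}{\sum Y_t (Y_t - \bar Y)} \sum \{ b_1 + b_2 Y_t + \varepsilon_t \} (Y_t - \bar Y) \qquad (9)$$

$$= b_2 + \frac{\sum \varepsilon_t (Y_t - \bar Y)}{\sum (Y_t - \bar Y)^2}.$$

We must evaluate the term

$$\frac{\sum \varepsilon_t (Y_t - \bar Y)}{\sum (Y_t - \bar Y)^2}$$

in the light of the model.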

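Since $Y_t$ depends on the shocks $\varepsilon_t$ to consumption, and $C_t$ depends on $Y_t$, $\varepsilon_t$ and $Y_t$ are correlated. This correlation will not go away as $T$ grows. Using the RF expression for $Y_t$, the denominator can be written as

$$\frac{1}{T} \sum (Y_t - \bar Y)^2 = \frac{1}{T} \sum \left[ \pi_{12} (X_t - \bar X) + (\varepsilon_{1t} - \bar\varepsilon_1) \right]^2.$$

Take probability limits:

$$\operatorname{plim} \frac{1}{T} \sum (Y_t - \bar Y)^2 = \operatorname{plim} \frac{1}{T} \sum \pi_{12}^2 (X_t - \bar X)^2 + 2 \pi_{12} \operatorname{plim} \frac{1}{T} \sum (X_t - \bar X)(\varepsilon_{1t} - \bar\varepsilon_1) + \operatorname{plim} \frac{1}{T} \sum (\varepsilon_{1t} - \bar\varepsilon_1)^2$$

$$= \pi_{12}^2 \operatorname{Var}(X_t) + \sigma_y^2.$$

The middle term vanishes because $X_t$ is uncorrelated with the disturbance.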

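$$\operatorname{plim}(\hat b_2 - b_2) = \frac{\operatorname{plim} \frac{1}{T} \sum \varepsilon_t (Y_t - \bar Y)}{\operatorname{plim} \frac{1}{T} \sum (Y_t - \bar Y)^2} = \frac{\operatorname{Cov}(\varepsilon_t, Y_t)}{\pi_{12}^2 \operatorname{Var}(X_t) + \sigma_y^2}.$$

From the Reduced Form we also have

$$\operatorname{Cov}(\varepsilon_t, Y_t) = E[\varepsilon_t \varepsilon_{1t}] = E\!\left[ \varepsilon_t \, \frac{1}{1 - b_2}\, \varepsilon_t \right] = \frac{1}{1 - b_2} \operatorname{Var}[\varepsilon_t] = \frac{\sigma_\varepsilon^2}{1 - b_2},$$

$$\pi_{12}^2 \operatorname{Var}(X_t) + \sigma_y^2 = \left( \frac{1}{1 - b_2} \right)^{2} \left[ \operatorname{Var}(X_t) + \sigma_\varepsilon^2 \right].$$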

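The inconsistency of OLS

$$\operatorname{plim}(\hat b_2 - b_2) = \frac{\sigma_\varepsilon^2 / (1 - b_2)}{\left( \frac{1}{1 - b_2} \right)^{2} \left[ \operatorname{Var}(X_t) + \sigma_\varepsilon^2 \right]} = \frac{(1 - b_2)\, \sigma_\varepsilon^2}{\operatorname{Var}(X_t) + \sigma_\varepsilon^2} = \frac{1 - b_2}{1 + \operatorname{Var}(X_t) / \sigma_\varepsilon^2}.$$

The bias is positive, since $0 < b_2 < 1$. A large variance in $X_t$ relative to $\varepsilon_t$ reduces the bias, but it does not kill the bias. The reason is that OLS assumes the wrong model for $C_t$, one with $\operatorname{Cov}(Y_t, \varepsilon_t) = 0$. That does not hold here.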
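The inconsistency result can be illustrated by simulation. The sketch below assumes parameter values that are not in the slides ($b_1 = 1$, $b_2 = 0.6$, $\operatorname{Var}(X_t) = 4$, $\sigma_\varepsilon^2 = 1$) and uses hypothetical function names. It compares the OLS slope from one large sample with the probability limit $b_2 + (1 - b_2) / (1 + \operatorname{Var}(X_t)/\sigma_\varepsilon^2)$, which equals 0.68 for these values.

```python
import numpy as np

def simulate_ols_slope(b1, b2, var_x, var_eps, T, rng):
    """OLS slope from regressing C on Y and a constant, on simulated data."""
    x = rng.normal(0.0, np.sqrt(var_x), size=T)
    eps = rng.normal(0.0, np.sqrt(var_eps), size=T)
    y = (b1 + x + eps) / (1.0 - b2)    # reduced form for Y
    c = b1 + b2 * y + eps              # consumption function (2)
    yd = y - y.mean()
    return float(np.sum(c * yd) / np.sum(yd ** 2))

def plim_ols_slope(b2, var_x, var_eps):
    """Probability limit of the OLS slope implied by the bias formula."""
    return b2 + (1.0 - b2) / (1.0 + var_x / var_eps)

# Assumed illustrative values, not taken from the lecture.
rng = np.random.default_rng(42)
b2_hat = simulate_ols_slope(1.0, 0.6, var_x=4.0, var_eps=1.0, T=200_000, rng=rng)
b2_lim = plim_ols_slope(0.6, var_x=4.0, var_eps=1.0)
print(b2_hat, b2_lim)
```

Raising `var_x` in this sketch moves the limit towards 0.6, in line with the remark that a large variance in $X_t$ reduces the bias without removing it.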

Example with an expectations variable

Assume the simple regression model (in Greene's notation again):

$$y_i = \beta_1 + \beta_2 x_i^* + \varepsilon_i, \quad i = 1, 2, \ldots, n, \qquad (10)$$

with all the classical assumptions holding. If $x_i^*$ is an expectations variable that we as econometricians cannot observe, or cannot measure without error, we can still try to estimate $\beta_1$ and $\beta_2$ using the observable (actual) $x_i$. We then need to make assumptions about the properties of the difference

$$u_i = x_i - x_i^*. \qquad (11)$$

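Assumptions:

$u_i$ is random, with zero mean and variance $\sigma_u^2$
$\operatorname{Cov}(u_i, \varepsilon_i) = 0$
$\operatorname{Cov}(u_i, x_i^*) = 0$
Both $u_i$ and $\varepsilon_i$ have the classical properties

The model that we estimate becomes

$$y_i = \beta_1 + \beta_2 x_i + \eta_i, \qquad (12)$$

$$\eta_i = \varepsilon_i - \beta_2 u_i. \qquad (13)$$

But with

$$E[x_i \eta_i] = E[(x_i^* + u_i)(\varepsilon_i - \beta_2 u_i)] = -\beta_2 \sigma_u^2.$$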

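OLS gives

$$\hat\beta_2 = \beta_2 + \frac{\sum \eta_i (x_i - \bar x)}{\sum (x_i - \bar x)^2},$$

where $\eta_i = \varepsilon_i - \beta_2 u_i$ is the disturbance of the estimated model, and we have

$$\operatorname{plim}(\hat\beta_2 - \beta_2) = \frac{\operatorname{plim} \frac{1}{n} \sum \eta_i (x_i - \bar x)}{\operatorname{plim} \frac{1}{n} \sum (x_i - \bar x)^2}.$$

We already have that $-\beta_2 \sigma_u^2$ goes into the numerator. The denominator is more work (as in the simultaneous equations case), but intuitively it must boil down to the sum of the variances of $x_i^*$ and $u_i$, hence

$$\operatorname{plim}(\hat\beta_2 - \beta_2) = \frac{-\beta_2 \sigma_u^2}{\operatorname{Var}(x_i^*) + \sigma_u^2}.$$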

$$\operatorname{plim} \hat\beta_2 = \frac{\beta_2}{1 + \sigma_u^2 / \operatorname{Var}(x_i^*)} < \beta_2 \quad \text{if } \beta_2 \text{ is positive.}$$

It can be shown that the inverse regression, $x_i$ on $y_i$, gives an overestimate, so the two OLS regressions together define a bound around the true parameter.

Measurement errors in $y_i$: no bias problem, but potential for heteroscedasticity.

Solution to both classes of bias problems exemplified here: replace OLS with other estimators (IV and 2SLS, as we shall see).
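The attenuation result and the bound from the inverse regression can be illustrated by simulation. The sketch below rests on assumed values that are not in the slides ($\beta_1 = 1$, $\beta_2 = 2$, $\operatorname{Var}(x_i^*) = 4$, $\sigma_u^2 = \sigma_\varepsilon^2 = 1$). It also takes "inverse regression" to mean the reciprocal of the OLS slope of $x_i$ on $y_i$, which is one reading of the remark above rather than something the slides spell out.

```python
import numpy as np

def slope(dep, reg):
    """OLS slope from regressing dep on reg and a constant."""
    rd = reg - reg.mean()
    return float(np.sum(dep * rd) / np.sum(rd ** 2))

# Assumed illustrative values, not taken from the lecture.
rng = np.random.default_rng(7)
n = 200_000
beta1, beta2 = 1.0, 2.0
var_xstar, var_u, var_eps = 4.0, 1.0, 1.0

x_star = rng.normal(0.0, np.sqrt(var_xstar), size=n)  # unobserved regressor
u = rng.normal(0.0, np.sqrt(var_u), size=n)           # measurement error
eps = rng.normal(0.0, np.sqrt(var_eps), size=n)
x = x_star + u                                        # observed regressor
y = beta1 + beta2 * x_star + eps

direct = slope(y, x)            # regression of y on x: attenuated
inverse = 1.0 / slope(x, y)     # reciprocal of the slope of x on y
attenuated_limit = beta2 / (1.0 + var_u / var_xstar)
print(direct, attenuated_limit, inverse)
```

With these values the direct estimate settles near the attenuated limit of 1.6, while the inverse-regression estimate lies above the true value of 2, so the two estimates bracket $\beta_2$.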