Econometric Methods for Panel Data

Based on the books by Baltagi, Econometric Analysis of Panel Data, and by Hsiao, Analysis of Panel Data.

Robert M. Kunst (robert.kunst@univie.ac.at), University of Vienna and Institute for Advanced Studies Vienna. April 19, 2013.

Outline: Introduction; Fixed effects; Random effects; Two-way panels; Tests in panel models; Poolability tests; Testing for the presence of random effects; The Hausman test; Coefficients of determination in panels.

Poolability tests: The hypotheses of poolability tests

These tests help select the panel model to be estimated, within the framework of fixed-effects models. For example: Are there individual effects, or is it preferable to ignore them and to estimate by pooled OLS? Are there time effects on top of the individual effects? Are the coefficients $\beta$ really constant across individuals?

These tests follow a simple principle: restricted and unrestricted models are compared via LR or Wald statistics. Null distributions are $F$ or $\chi^2$, with degrees of freedom matching the number of restrictions.

Poolability tests: Testing for individual effects

Here, the null (restricted) model is

$$y_{it} = \alpha + \beta' X_{it} + \nu_{it},$$

with iid errors, and the alternative (unrestricted) model is

$$y_{it} = \alpha + \beta' X_{it} + \mu_i + \nu_{it}.$$

The null hypothesis may be written as

$$H_0: \mu_i = 0, \quad i = 1, \ldots, N.$$

Poolability tests: Testing for individual effects: the statistic

The traditional restriction-test statistic is

$$F_{1\text{-way}} = \frac{(ESS_R - ESS_U)/(N-1)}{ESS_U/((T-1)N - K)},$$

as there are $N-1$ restrictions from the maintained hypothesis to the null. There are $(T-1)N - K$ degrees of freedom in the unrestricted model. Under $H_0$, this statistic will be distributed as $F_{N-1,\,N(T-1)-K}$, assuming Gaussian errors.
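As an illustration of the statistic above, the following is a minimal sketch that computes $F_{1\text{-way}}$ from a pooled OLS fit (restricted) and an LSDV fit (unrestricted). It assumes a balanced panel and uses NumPy; the function names `ess` and `f_one_way` and the simulated data are hypothetical and not part of the slides.

```python
import numpy as np

def ess(y, X):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def f_one_way(y, X, ids):
    """F statistic for H0: mu_i = 0 against the one-way fixed-effects model.

    y: (NT,) response; X: (NT, K) regressors without a constant;
    ids: (NT,) individual labels. A balanced panel is assumed.
    """
    labels, codes = np.unique(ids, return_inverse=True)
    N = labels.size
    NT, K = X.shape
    T = NT // N
    # Restricted model: common intercept plus slopes (pooled OLS).
    X_r = np.column_stack([np.ones(NT), X])
    # Unrestricted model: one dummy per individual plus slopes (LSDV).
    D = (codes[:, None] == np.arange(N)[None, :]).astype(float)
    X_u = np.column_stack([D, X])
    ess_r, ess_u = ess(y, X_r), ess(y, X_u)
    df1, df2 = N - 1, N * (T - 1) - K
    F = ((ess_r - ess_u) / df1) / (ess_u / df2)
    return F, df1, df2

# Simulated balanced panel with genuine individual effects (illustrative only).
rng = np.random.default_rng(0)
N, T, K = 10, 6, 2
ids = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, K))
mu = rng.normal(scale=2.0, size=N)
y = 1.0 + X @ np.array([0.5, -1.0]) + mu[ids] + rng.normal(size=N * T)
F, df1, df2 = f_one_way(y, X, ids)
```

The resulting `F` would be compared with critical values of the $F_{df1,\,df2}$ distribution, for example from a statistics library.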

Poolability tests: Other analogous tests

For the null of the pooled model and the alternative of the two-way model, the $F$ statistic will be distributed as $F(N+T-2,\; NT-N-T+1-K)$ under the null, assuming Gaussian errors.

For the null of the one-way individual-effects model and the alternative of the two-way model, the $F$ statistic will be distributed as $F(T-1,\; NT-N-T+1-K)$ under the null, assuming Gaussian errors.

For large $NT$, $\chi^2$ versions may be used instead, which do not depend on the assumption of Gaussian errors.

Poolability tests: Testing homogeneity of individual slopes

In one specification, the unrestricted model is

$$y_{it} = \alpha + \beta_i' X_{it} + \mu_i + \nu_{it},$$

and the null can be expressed as

$$H_0: \beta_1 = \ldots = \beta_N = \beta, \quad \mu_1 = \ldots = \mu_N = 0,$$

with $(K+1)(N-1)$ restrictions. The unrestricted model has only $N(T-K-1)$ degrees of freedom. If this test rejects, either the one-way model should be considered or individuals must be modelled separately. It may make more sense just to test the slopes, with $K(N-1)$ restrictions.

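Poolability tests: Constellations in panel models and degrees of freedom

Residual degrees of freedom, by type of effects (rows) and type of coefficients (columns):

| effects | name | coefficients $\beta$ | coefficients $\beta_i$ |
|---|---|---|---|
| $\alpha$ | OLS | $NT-K-1$ | $N(T-K)-1$ |
| $\alpha_i$ | one-way | $N(T-1)-K$ | $N(T-K-1)$ |
| $\alpha_{it}$ | two-way | $N(T-1)-T+1-K$ | $N(T-K-1)-T+1$ |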
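The entries of the table all follow from subtracting the number of estimated parameters from $NT$. A small sketch of this arithmetic is given here; the helper name `residual_df` and its argument labels are hypothetical.

```python
def residual_df(N, T, K, effects="none", slopes="common"):
    """Residual degrees of freedom for the six constellations in the table.

    effects: "none" (pooled OLS), "one-way", or "two-way";
    slopes:  "common" (beta) or "individual" (beta_i).
    """
    n_slopes = {"common": K, "individual": N * K}[slopes]
    # Intercept-type parameters: one constant, N individual effects,
    # or N individual plus T - 1 identified time effects.
    n_effects = {"none": 1, "one-way": N, "two-way": N + T - 1}[effects]
    return N * T - n_slopes - n_effects

# Example with N = 10, T = 6, K = 2 (arbitrary values).
df_one_way = residual_df(10, 6, 2, effects="one-way")   # N(T-1) - K
```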

Testing for the presence of random effects: Likelihood-ratio tests for variance parameters

Assume $L_u$ is the likelihood of an unrestricted model and $L_r$ is the likelihood of a restricted model. A lemma says that the likelihood-ratio (LR) statistic

$$LR = 2(\log L_u - \log L_r)$$

will be, under the null of the restricted model, asymptotically distributed $\chi^2$ with degrees of freedom matching the number of restrictions.

The lemma requires several regularity conditions. In testing a variance parameter for zero, these conditions are violated and the property is not guaranteed to hold.

Testing for the presence of random effects: LR test for random effects in two-way panels

The unrestricted RE model

$$y_{it} = \alpha + \beta' X_{it} + u_{it}, \qquad u_{it} = \mu_i + \lambda_t + \nu_{it},$$

with the restricted null $H_0: \sigma_\mu^2 = \sigma_\lambda^2 = 0$, violates the regularity conditions. The LR test statistic has the non-standard distribution

$$\tfrac{1}{4}\chi^2(0) + \tfrac{1}{2}\chi^2(1) + \tfrac{1}{4}\chi^2(2),$$

where $\chi^2(0)$ denotes point mass at zero.

Testing for the presence of random effects: LR test for random effects in one-way panels

The unrestricted RE model

$$y_{it} = \alpha + \beta' X_{it} + u_{it}, \qquad u_{it} = \mu_i + \nu_{it},$$

with the restricted null $H_0: \sigma_\mu^2 = 0$, again violates the regularity conditions. The LR test statistic has the non-standard distribution

$$\tfrac{1}{2}\chi^2(0) + \tfrac{1}{2}\chi^2(1).$$

An analogous test can be used for random time effects. These tests are due to Gourieroux, Holly & Monfort.
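Because the null distributions are mixtures, p-values cannot be read off a standard $\chi^2$ table directly. The sketch below evaluates the tail probabilities of the two mixtures using only the Python standard library; the function names are hypothetical. It relies on two closed forms: $P(\chi^2(1) > x) = \mathrm{erfc}(\sqrt{x/2})$ and $P(\chi^2(2) > x) = e^{-x/2}$.

```python
import math

def chi2_sf_1(x):
    """P(chi2(1) > x), expressed through the standard normal tail."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_2(x):
    """P(chi2(2) > x), which is an exponential tail."""
    return math.exp(-x / 2.0)

def lr_pvalue_one_way(lr):
    """p-value of an observed LR under 1/2 chi2(0) + 1/2 chi2(1)."""
    if lr <= 0.0:
        return 1.0  # the point mass at zero makes P(LR >= 0) equal to one
    return 0.5 * chi2_sf_1(lr)

def lr_pvalue_two_way(lr):
    """p-value of an observed LR under 1/4 chi2(0) + 1/2 chi2(1) + 1/4 chi2(2)."""
    if lr <= 0.0:
        return 1.0
    return 0.5 * chi2_sf_1(lr) + 0.25 * chi2_sf_2(lr)
```

In the one-way case the mixture p-value is half the naive $\chi^2(1)$ p-value whenever the statistic is positive, so using the standard $\chi^2(1)$ distribution would make the test conservative.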

Testing for the presence of random effects: LM tests for random effects

Lagrange-multiplier (LM) tests have standard $\chi^2$ asymptotics. They usually have less power. One may test for two-way random effects using pooled-OLS residuals $\tilde u$ by $LM = LM_1 + LM_2$, with

$$LM_1 = \frac{NT}{2(T-1)} \left\{ 1 - \frac{\tilde u'(I_N \otimes J_T)\tilde u}{\tilde u'\tilde u} \right\}^2,$$

$$LM_2 = \frac{NT}{2(N-1)} \left\{ 1 - \frac{\tilde u'(J_N \otimes I_T)\tilde u}{\tilde u'\tilde u} \right\}^2,$$

where $J_T$ and $J_N$ denote square matrices of ones, or one may test for one-way effects by using $LM_1$ or $LM_2$. This test is due to Breusch & Pagan.
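The quadratic forms in $LM_1$ and $LM_2$ do not require building the Kronecker products explicitly: $\tilde u'(I_N \otimes J_T)\tilde u$ is the sum over individuals of squared residual sums, and $\tilde u'(J_N \otimes I_T)\tilde u$ is the sum over periods of squared residual sums. A minimal NumPy sketch under the assumption of a balanced panel with residuals stacked individual by individual (the function name is hypothetical):

```python
import numpy as np

def breusch_pagan_lm(resid, N, T):
    """LM_1, LM_2 and LM = LM_1 + LM_2 from pooled-OLS residuals.

    resid: length N*T vector, ordered with the individual index slow
    and the time index fast (balanced panel assumed).
    """
    u = np.asarray(resid, dtype=float).reshape(N, T)
    total = float((u ** 2).sum())              # u'u
    ind = float((u.sum(axis=1) ** 2).sum())    # u'(I_N kron J_T)u
    tim = float((u.sum(axis=0) ** 2).sum())    # u'(J_N kron I_T)u
    lm1 = N * T / (2.0 * (T - 1)) * (1.0 - ind / total) ** 2
    lm2 = N * T / (2.0 * (N - 1)) * (1.0 - tim / total) ** 2
    return lm1, lm2, lm1 + lm2

# Illustrative call on artificial residuals (not real data).
rng = np.random.default_rng(1)
lm1, lm2, lm = breusch_pagan_lm(rng.normal(size=4 * 3), N=4, T=3)
```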

The Hausman test: The Hausman test principle

Hausman tests can be used in all situations where two model specifications and two estimators are available with the following properties:

1. In the restricted model (null), the estimator $\hat\theta$ is efficient, while the estimator $\tilde\theta$ is consistent though typically not efficient;
2. in the unrestricted model (alternative), the estimator $\hat\theta$ is inconsistent, while the estimator $\tilde\theta$ is consistent.

Then, the difference $q = \hat\theta - \tilde\theta$ should diverge under the alternative and converge to zero under the null. Moreover, under the null, $q$ and $\hat\theta$ should be uncorrelated.

The Hausman test: The RE and the FE model

The null of the RE model and the alternative of the FE model correspond to the Hausman situation:

1. In the RE model, the GLS-type RE estimator is efficient by construction for Gaussian errors, while the FE estimator and even the OLS estimator are consistent;
2. in the FE model, the RE estimator is inconsistent because of the omitted-variable effect, while FE is consistent by construction.

The Hausman test: Two views on the RE inconsistency in the FE model

1. The estimator $(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$ is inconsistent if the true model is $y = X\beta + Z_\mu \mu + \nu$, as the regressors $Z_\mu$ are omitted;
2. Mundlak showed that the RE estimator for a stochastic regression model with $X$ and $Z_\mu$ correlated is identical to the FE estimator: the RE estimator imposes independence of effects and covariates.

Some argue, however, that Mundlak's alternative is not really the same concept as the fixed-effects model.

The Hausman test: The Hausman test statistic

The Hausman test statistic is defined as

$$m = q'\left(\operatorname{var}\hat\beta_{FE} - \operatorname{var}\hat\beta_{RE}\right)^{-1} q,$$

with $q = \hat\beta_{FE} - \hat\beta_{RE}$. Under RE, the matrix difference in brackets is positive definite, as the RE estimator is efficient and any other estimator has a larger variance. The statistic $m$ is asymptotically distributed $\chi^2$ under the null of RE, with $K$ degrees of freedom, where $K$ is the dimension of $\beta$.
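Given FE and RE estimates and their estimated covariance matrices, the statistic is a single quadratic form. The sketch below assumes these four inputs are already available from some estimation routine; the function name `hausman` and the numerical inputs are hypothetical. It does not guard against the finite-sample case in which the estimated variance difference fails to be positive definite.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Hausman statistic m = q'(V_fe - V_re)^{-1} q and its degrees of freedom."""
    q = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    # Solve the linear system instead of forming the inverse explicitly.
    m = float(q @ np.linalg.solve(V_diff, q))
    return m, q.size

# Made-up estimates for K = 2 slopes, for illustration only.
b_fe = [0.52, -0.95]
b_re = [0.50, -1.00]
V_fe = np.diag([0.0004, 0.0009])
V_re = np.diag([0.0003, 0.0005])
m, df = hausman(b_fe, b_re, V_fe, V_re)
```

The value `m` would then be compared with a $\chi^2(K)$ critical value; large values speak against the RE specification.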

$R^2$ in a panel model

Should the variation due to effects be part of the explained variation or not? If yes, the $R^2$ has little to say about the importance of the substantive covariates and their coefficients $\beta$.

There is no unanimous agreement on which $R^2$ to report in a panel. Some programs (Stata etc.) follow the suggestion by Wooldridge and report three measures: within $R^2$, between $R^2$, and overall $R^2$.

The within $R^2$

For the within $R^2$, the total sum of squares TSS is defined as

$$TSS = \sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - \bar y_i)^2,$$

i.e. after adjusting for effects. The residuals in the residual sum of squares ESS then also rely on the purged model, and the statistic

$$1 - \frac{ESS}{TSS}$$

is maximized by LSDV or the FE estimator.

The between $R^2$

For the between $R^2$, the total sum of squares TSS is defined as

$$TSS = \sum_{i=1}^{N} (\bar y_i - \bar y)^2,$$

i.e. it uses just the $N$ individual time averages. Residuals in the residual sum of squares ESS also rely on the between model with $N$ observations, and the statistic

$$1 - \frac{ESS}{TSS}$$

is maximized by the between estimator.

The overall $R^2$

Finally, the overall $R^2$ relies on the usual TSS definition

$$TSS = \sum_{t=1}^{T} \sum_{i=1}^{N} (y_{it} - \bar y)^2,$$

and on residuals calculated without accounting for effects. It is maximized by pooled OLS.

Values for the three $R^2$ may be compared. For example, if the within and the overall $R^2$ are close, this is evidence that individual effects are not very important.
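The three measures, as defined on these slides, can be sketched as three OLS fits on differently transformed data: demeaned data (within), individual time averages (between), and the raw pooled data (overall). The code assumes a balanced panel stacked individual by individual; the function names and the simulated data are hypothetical, and statistical packages may define their reported measures differently.

```python
import numpy as np

def r2(y, X):
    """1 - ESS/TSS from an OLS fit of y on X plus a constant."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    tss = float(((y - y.mean()) ** 2).sum())
    return 1.0 - float(resid @ resid) / tss

def panel_r2(y, X, N, T):
    """Within, between and overall R^2 for a balanced panel.

    y: (N*T,), X: (N*T, K), individual index slow and time index fast.
    """
    K = X.shape[1]
    y3, X3 = y.reshape(N, T), X.reshape(N, T, K)
    y_bar, X_bar = y3.mean(axis=1), X3.mean(axis=1)     # individual time averages
    y_w = (y3 - y_bar[:, None]).ravel()                 # within transformation
    X_w = (X3 - X_bar[:, None, :]).reshape(N * T, K)
    return {
        "within": r2(y_w, X_w),        # FE fit on demeaned data
        "between": r2(y_bar, X_bar),   # between fit on N averages
        "overall": r2(y, X),           # pooled OLS
    }

# Simulated panel with individual effects (illustrative only).
rng = np.random.default_rng(3)
N, T = 20, 5
X = rng.normal(size=(N * T, 2))
mu = np.repeat(rng.normal(scale=2.0, size=N), T)
y = 1.0 + X @ np.array([0.5, -1.0]) + mu + rng.normal(size=N * T)
measures = panel_r2(y, X, N, T)
```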