5. Assessing studies based on multiple regression

Questions of this section:
- What makes a study using multiple regression (un)reliable?
- When does multiple regression provide a useful estimate of the causal effect under consideration?

Conceptual framework: Internal and external validity

Definition 5.1: (Internal and external validity)
A statistical analysis is said to have internal validity if the statistical inferences about causal effects are valid for the population being studied. The analysis is said to have external validity if its inferences and conclusions can be generalized from the population and setting studied to other populations and settings.

Terminology:
- The population studied is the population of entities (people, companies, ...) from which the sample was drawn.
- The population of interest is the population of entities to which the causal inferences from the study are to be applied.
- By setting we mean the institutional, legal, social, and economic environment.

Threats to internal validity:
- The estimator of a causal effect should be unbiased and consistent.
- Hypothesis tests should have the desired significance level.
- Confidence intervals should have the desired confidence levels.
- Requirements for internal validity are that the OLS estimator is unbiased and consistent and that standard errors are computed in a way that makes confidence intervals have the desired confidence levels.
- These requirements might not be met for various reasons.
- Threats to internal validity that lead to failures of our OLS assumptions from Slide

Threats to external validity:
- Differences between the population being studied and the population of interest.
  Example: Medical studies that use animal populations such as mice (the population being studied), but aim at transferring the results to human populations (the population of interest).
- Even if the population studied and the population of interest are identical, the study results may not generalize because of differences in the settings.
  Example: The effect of an antidrinking advertising campaign on the drinking behaviour of a group of first-term students might differ between two universities if the legal penalties for drinking differ between them.

Threats to external validity: [continued]
Important questions with respect to external validity are:
- How to assess the external validity of a study?
- How to design an externally valid study?
Both issues require specific knowledge of the populations and settings being studied and those of interest.
A rigorous treatment of both issues is beyond the scope of this lecture (cf. Shadish et al., 2002, for details).
We focus on aspects of internal validity.

5.1. Threats to internal validity of multiple regression analysis

Objectives:
- Survey of five reasons why the OLS estimator of a multiple regression coefficient may be biased even in large samples (Sections )
- All five sources of bias arise because the regressor is correlated with the error term in the population regression. This violates the first OLS assumption from Slide 18.
- What can be done to reduce this bias?

Omitted variable bias

Omitted variable bias:
- If an omitted variable is a determinant of $Y_i$ and if it is correlated with at least one of the regressors, then the OLS estimator of at least one of the coefficients will have omitted variable bias (see Definition 2.5 on Slide 27).
- This bias persists even in large samples, so the OLS estimator(s) is (are) inconsistent.
- Mathematically, under omitted variable bias at least one of the regressors is correlated with the error term $u_i$, implying that $E(u_i \mid X_{1i}, \dots, X_{ki}) \neq 0$.
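The persistence of the bias in large samples can be illustrated with a small simulation. The sketch below is not part of the lecture: the data-generating process, the coefficient values, and the helper name `ols` are assumptions chosen purely for illustration.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients (intercept first) of y on the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 200_000  # large sample, so what remains is bias rather than sampling noise

# Assumed design: x2 is a determinant of y AND is correlated with x1.
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

b_short = ols(y, x1)      # x2 omitted
b_long = ols(y, x1, x2)   # x2 included

print("slope on x1 with x2 omitted: ", b_short[1])
print("slope on x1 with x2 included:", b_long[1])
```

In this assumed design the slope of the short regression converges to 2 + 3 · Cov(x1, x2)/Var(x1) = 2 + 3 · 0.5/1.25 = 3.2 rather than to the true value 2, no matter how large n is.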

Mitigation of omitted variable bias:
- Inclusion of control variables in the regression equation

Definition 5.2: (Control variable)
A control variable is not the object of interest in a regression analysis, but is rather a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias.

Remarks:
- Up to now: OLS assumptions on Slide 18 treat all regressors symmetrically.
- Now: explicit distinction between regressors of interest and control variables.

Mathematical motivation:
- Consider a regression with two variables, $X_{1i}$ (the regressor) and $X_{2i}$ (the control variable):
  $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
- Replace the first OLS assumption $E(u_i \mid X_{1i}, X_{2i}) = 0$ by the so-called conditional-mean-independence assumption
  $E(u_i \mid X_{1i}, X_{2i}) = E(u_i \mid X_{2i})$   (5.1)
- Under (5.1), $\beta_1$ has a causal interpretation, but $\beta_2$ does not (see class for details).

Intuition of (5.1):
- The inclusion of the control variable $X_{2i}$ makes the regressor $X_{1i}$ uncorrelated with $u_i$, so that the OLS estimator $\hat\beta_1$ can estimate the causal effect on $Y_i$ of a change in $X_{1i}$.
- By contrast, the control variable $X_{2i}$ remains correlated with $u_i$, so that its coefficient $\beta_2$ is subject to omitted variable bias and does not have a causal interpretation.
- The control variable $X_{2i}$ is included because it controls for omitted factors that affect $Y_i$ and are correlated with $X_{1i}$; it might (but need not) have a causal effect itself.
- When a control variable is used, it controls for both (1) its own direct causal effect (if any) and (2) the effect of correlated omitted factors.
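A simulated example may help to see what conditional mean independence buys. The following sketch is an illustration under assumed values, not the lecture's own example: `w` is a hypothetical unobserved factor that is related to the regressor of interest only through the control variable, so that $E(u_i \mid X_{1i}, X_{2i})$ depends on $X_{2i}$ alone.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Assumed design: w is unobserved, x2 is the control variable.
x2 = rng.normal(size=n)
w = 0.8 * x2 + rng.normal(size=n)    # omitted factor, related to x2 only
x1 = 0.6 * x2 + rng.normal(size=n)   # regressor of interest; related to w only via x2

beta1, beta2, delta = 2.0, 1.0, 1.5
u = delta * w + rng.normal(size=n)   # the error term absorbs the omitted factor
y = 0.5 + beta1 * x1 + beta2 * x2 + u

# OLS of y on a constant, x1 and the control x2
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

print("coefficient on x1:", b[1])  # close to beta1 = 2.0
print("coefficient on x2:", b[2])  # close to beta2 + delta * 0.8 = 2.2, not beta2
```

Here $E(u_i \mid X_{1i}, X_{2i}) = 1.2\, X_{2i}$, so the coefficient on the regressor of interest is estimated consistently while the coefficient on the control variable picks up the effect of the omitted factor.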

Terminology:
- Complete phrasing: The coefficient $\beta_1$ on the regressor $X_{1i}$ is the causal effect on $Y_i$ of a change in $X_{1i}$, using the control variable $X_{2i}$ both (1) to hold constant the direct effect of $X_{2i}$ and (2) to control for factors correlated with $X_{1i}$.
- Conventional, less awkward phrasing: The coefficient $\beta_1$ on $X_{1i}$ is the effect on $Y_i$ controlling for $X_{2i}$.

Example:
- Consider the student-performance dataset.
- Consider the regression results of TEST_SCORE on STR and PCTEL (see left panel on Slide 126).
- A potentially omitted factor could be outside learning opportunities.
- Outside learning opportunities are difficult to measure, but they are correlated with the students' economic background.
- Include a measure of economic background to control for omitted income-related determinants of TEST_SCORE such as outside learning opportunities.
- Such a control variable is MEAL_PCT, which measures the percentage of students receiving a free or subsidized lunch.

Slide 126: TEST_SCORE regression results with and without the control variable MEAL_PCT

Left panel: Dependent variable TEST_SCORE; method: least squares; included observations: 420; White heteroskedasticity-consistent standard errors and covariance; regressors: C, STR, PCTEL.
Right panel: Dependent variable TEST_SCORE; method: least squares; included observations: 420; White heteroskedasticity-consistent standard errors and covariance; regressors: C, STR, PCTEL, MEAL_PCT.
[The coefficient estimates, standard errors, and summary statistics of both panels are missing from the transcription.]

Example: [continued]
Including the control variable MEAL_PCT
- does not substantially change the effect of STR on TEST_SCORE ($\hat\beta_1$ changes from to ),
- changes the size (but not the sign) of the effect of PCTEL on TEST_SCORE ($\hat\beta_2$ changes from to ).
A causal interpretation of the estimated coefficient $\hat\beta_3$ = is not reasonable: if it were causal, we could boost TEST_SCORE by eliminating the reduced-price lunch programme.
Do not treat $\beta_3$ as causal.

Solutions to omitted variable bias:
Distinction between two situations, namely (1) one in which data on the omitted variable or on adequate control variables are available, and (2) one in which data are not available.

Situation #1:
- If data on the omitted variable are available, include it in the regression equation.
- If you have data on adequate control variables (with the hope of achieving conditional mean independence), include these variables in the regression equation.

Trade-off:
- Adding a variable to a regression has both costs and benefits.
- On the one hand, omitting the variable could result in omitted variable bias.
- On the other hand, including a variable that is not a relevant regressor (that is, when its population regression coefficient is zero) reduces the precision of the estimators of the other regression coefficients (in the form of higher variances of the OLS estimators).
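The precision cost can be made explicit in the two-regressor case. As a sketch, under the additional assumption of homoskedastic errors (which the slide does not state), the large-sample variance of $\hat\beta_1$ when $X_{2i}$ is included is approximately

```latex
\operatorname{Var}(\hat\beta_1) \approx \frac{1}{n} \cdot \frac{1}{1 - \rho_{X_1, X_2}^2} \cdot \frac{\sigma_u^2}{\sigma_{X_1}^2}
```

where $\rho_{X_1, X_2}$ denotes the population correlation between the two regressors. If the coefficient on $X_{2i}$ is zero, dropping $X_{2i}$ removes the factor $1/(1 - \rho_{X_1, X_2}^2) \geq 1$, so including an irrelevant regressor that is correlated with $X_{1i}$ inflates the variance of $\hat\beta_1$.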

Situation #2:
If no data are available, there are three potential ways of mitigating the omitted variable bias:
- The use of panel data (see Section 6)
- The use of instrumental variables (see Section 8)
- The conduct of randomized controlled experiments (see Stock & Watson, 2011, Section 13)

Guidelines for deciding whether to include an additional variable:
- Be specific about the coefficients of interest.
- Use a priori reasoning to identify the most important potential sources of omitted variable bias.
- Consider a baseline specification and some questionable variables.
- Test whether additional questionable variables have nonzero coefficients.
- Provide "full disclosure" representative tabulations of your results so that others can see the effect of including questionable variables on the coefficients of interest.
- Observe whether your results change after including a questionable control variable.

Misspecification of the functional form of the regression function

Definition 5.3: (Functional form misspecification)
Functional form misspecification arises when the functional form of the estimated function differs from the (true) functional form of the population regression function.

Two aspects of misspecification:
- If the (true) population regression function is nonlinear, but we estimate a linear regression equation, then the estimators of the coefficients suffer from omitted variable bias.
- If the (true) population regression function is linear, but we estimate a nonlinear regression equation, then we estimate non-existent coefficients.

Solutions to functional form misspecification:
- Detect misspecification by using statistical specification tests, for example the Regression Specification Error Test (RESET) (see Econometrics I).
- Plot the data and the estimated regression function.
- Correct the misspecification by trying alternative functional forms of the regression function.
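The idea behind a RESET-type check can be sketched as follows. This is a simplified illustration, not the version treated in Econometrics I: it adds only the squared fitted values (implementations often add cubes as well), uses the homoskedasticity-only F statistic, and runs on simulated data. The function names `ssr_and_fit` and `reset_f` are hypothetical.

```python
import numpy as np

def ssr_and_fit(y, X):
    """Sum of squared residuals and fitted values from OLS of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return float(np.sum((y - fitted) ** 2)), fitted

def reset_f(y, x):
    """Homoskedasticity-only F statistic for adding fitted^2 to a linear fit."""
    n = len(y)
    X_r = np.column_stack([np.ones(n), x])       # restricted: linear in x
    ssr_r, fitted = ssr_and_fit(y, X_r)
    X_u = np.column_stack([X_r, fitted ** 2])    # unrestricted: plus fitted^2
    ssr_u, _ = ssr_and_fit(y, X_u)
    q = 1                                        # one restriction tested
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - X_u.shape[1]))

rng = np.random.default_rng(2)
n = 2_000
x = rng.uniform(0, 4, size=n)
e = rng.normal(size=n)

f_linear = reset_f(1.0 + 2.0 * x + e, x)                   # truly linear
f_quadratic = reset_f(1.0 + 2.0 * x + 1.5 * x**2 + e, x)   # truly quadratic

print("F statistic, linear population function:   ", f_linear)
print("F statistic, quadratic population function:", f_quadratic)
```

A large F statistic indicates that powers of the fitted values help to explain $Y_i$, which points to a misspecified linear functional form.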

Measurement error and errors-in-variables bias

Definition 5.4: (Errors-in-variables bias)
Errors-in-variables bias in the OLS estimator arises when an independent variable is measured imprecisely. The bias depends on the nature of the measurement error and persists even if the sample size is large.

Sources of measurement errors:
- Wrong answer of a respondent to a survey question (e.g. about her/his income)
- Typographical errors in data collected from computerized administrative records

Consequence:
- Consider a regression with a single regressor $X_i$.
- $X_i$ is imprecisely measured by $\tilde X_i$.
- The true population regression equation is
  $Y_i = \beta_0 + \beta_1 X_i + u_i = \beta_0 + \beta_1 \tilde X_i + [\beta_1 (X_i - \tilde X_i) + u_i] = \beta_0 + \beta_1 \tilde X_i + v_i$,
  where $v_i = \beta_1 (X_i - \tilde X_i) + u_i$.
- The population regression equation in terms of $\tilde X_i$ has an error term containing the measurement error $\tilde X_i - X_i$.
- If $\tilde X_i - X_i$ is correlated with $\tilde X_i$, then the regressor $\tilde X_i$ will be correlated with $v_i$.

Consequence: [continued]
- This violates OLS assumption 1 on Slide 18, so $\hat\beta_1$ will be biased and inconsistent.

Classical measurement error model:
- Suppose the measured value $\tilde X_i$ equals the unmeasured value $X_i$ plus a purely random component $w_i$ with expected value 0 and variance $\sigma_w^2$.
- Suppose further that $\mathrm{Corr}(w_i, X_i) = 0$ and $\mathrm{Corr}(w_i, u_i) = 0$.
- It then follows that (see class for details)
  $\operatorname{plim} \hat\beta_1 = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2} \beta_1$
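The probability limit can be derived in a few steps. The sketch below writes $\tilde X_i$ for the measured value and $X_i$ for the true value, and additionally assumes $\mathrm{Cov}(X_i, u_i) = 0$, i.e. that the regression on the true regressor satisfies the first OLS assumption:

```latex
\begin{aligned}
\tilde X_i &= X_i + w_i, \qquad v_i = \beta_1 (X_i - \tilde X_i) + u_i = -\beta_1 w_i + u_i, \\
\operatorname{Cov}(\tilde X_i, v_i) &= \operatorname{Cov}(X_i + w_i,\; -\beta_1 w_i + u_i) = -\beta_1 \sigma_w^2, \\
\operatorname{Var}(\tilde X_i) &= \sigma_X^2 + \sigma_w^2, \\
\operatorname{plim} \hat\beta_1 &= \beta_1 + \frac{\operatorname{Cov}(\tilde X_i, v_i)}{\operatorname{Var}(\tilde X_i)}
  = \beta_1 - \beta_1 \frac{\sigma_w^2}{\sigma_X^2 + \sigma_w^2}
  = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_w^2}\, \beta_1 .
\end{aligned}
```

Since the ratio $\sigma_X^2 / (\sigma_X^2 + \sigma_w^2)$ lies between 0 and 1, the estimator is biased toward zero in this model, and the bias grows with the variance of the measurement error.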

Solutions to errors-in-variables bias:
- Try to obtain an accurate measure of X (if possible).
- Use instrumental variables (see Section 8).

Missing data and sample selection

We consider three cases:
1. Data are missing completely at random
2. Data are missing based on a regressor
3. Data are missing because of a selection process that is related to Y beyond depending on X (sample selection bias)

Case #1:
When the data are missing completely at random (for reasons unrelated to the values of X or Y), the effect is to reduce the sample size but not to introduce bias.

Case #2:
When the data are missing based on the value of a regressor, the effect is likewise to reduce the sample size but not to introduce bias.

Case #3:
If the data are missing because of a selection process that is related to the value of the dependent variable Y beyond depending on the regressors $X_1, \dots, X_k$, then this selection process can introduce correlation between the error term and the regressors. The result is bias in the OLS estimators that persists in large samples.

Definition 5.5: (Sample selection bias)
Sample selection bias arises when a selection process influences the availability of data and that process is related to the dependent variable, beyond depending on the regressors.
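The three missing-data cases can be compared in a simulation. The sketch below uses an assumed single-regressor model with true slope 2; the selection rules and the helper name `slope` are illustrative choices, not part of the lecture.

```python
import numpy as np

def slope(y, x):
    """OLS slope of y on x (regression with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(3)
n = 400_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # assumed true slope: 2

keep_random = rng.random(n) < 0.5   # case 1: missing completely at random
keep_on_x = x > 0                   # case 2: missing based on the regressor
keep_on_y = y > 1.0                 # case 3: selection related to the outcome

b_random = slope(y[keep_random], x[keep_random])
b_on_x = slope(y[keep_on_x], x[keep_on_x])
b_on_y = slope(y[keep_on_y], x[keep_on_y])

print("case 1 (random):        ", b_random)
print("case 2 (based on X):    ", b_on_x)
print("case 3 (related to Y):  ", b_on_y)
```

In the first two cases the slope stays close to 2 despite the smaller sample, whereas selecting on the outcome pulls the estimate well below 2 even with several hundred thousand observations.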

Example:
- We consider the question as to whether stock mutual funds outperform the market.
- To this end, many studies compare future returns on mutual funds that had high returns over the past year to future returns on other funds and on the market as a whole.
- Some databases include historical data on funds currently available for purchase.
- This approach implies that the most poorly performing funds are omitted from the dataset because they went out of business or were merged into other funds.
- Only the better funds survive to be in the dataset (survivorship bias).

Solutions to sample selection bias:
- Beyond the scope of this lecture

Simultaneous causality

Definition 5.6: (Simultaneous causality bias)
Simultaneous causality bias, also called simultaneous equation bias, arises in a regression of Y on X when, in addition to the causal link of interest from X to Y, there is a causal link from Y to X.

Remark:
- The reverse causality makes X correlated with the error term. The result is bias in the OLS estimators that persists in large samples.

Example:
- Consider the following two regression equations that hold simultaneously:
  $Y_i = \beta_0 + \beta_1 X_i + u_i$   (5.2)
  $X_i = \gamma_0 + \gamma_1 Y_i + v_i$   (5.3)
- From Eq. (5.2) it follows that if $u_i < 0$, then $Y_i$ decreases.
- If $\gamma_1 > 0$, then a low value of $Y_i$ leads to a low value of $X_i$ in Eq. (5.3).
- If $\gamma_1 > 0$, then $\mathrm{Corr}(X_i, u_i) > 0$ in Eq. (5.2) (see class for details).
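The sign of the correlation can be made explicit by solving the two equations for $X_i$. The following sketch relies on two assumptions that are not stated on the slide: $\mathrm{Cov}(u_i, v_i) = 0$ and $\gamma_1 \beta_1 \neq 1$.

```latex
\begin{aligned}
X_i &= \gamma_0 + \gamma_1 (\beta_0 + \beta_1 X_i + u_i) + v_i \\
(1 - \gamma_1 \beta_1)\, X_i &= \gamma_0 + \gamma_1 \beta_0 + \gamma_1 u_i + v_i \\
\operatorname{Cov}(X_i, u_i) &= \frac{\gamma_1 \sigma_u^2}{1 - \gamma_1 \beta_1}
\end{aligned}
```

Under these assumptions the covariance is positive when $\gamma_1 > 0$ and $\gamma_1 \beta_1 < 1$, which is consistent with the verbal argument: a negative $u_i$ lowers $Y_i$, and a lower $Y_i$ lowers $X_i$.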

Solutions to simultaneous causality bias:
- Use instrumental variables (see Section 8)

Sources of inconsistency of OLS standard errors

OLS standard errors:
- Inconsistent standard errors pose a threat to internal validity.
- Even if the OLS estimators are consistent and the sample size is large, inconsistent standard errors will produce invalid hypothesis tests and confidence intervals.

Main reasons for inconsistent standard errors:
- Heteroskedasticity of the error terms $u_i$
- Autocorrelation among the error terms $u_i$

Remedies:
- Use heteroskedasticity-consistent standard errors (see Section )
- Use heteroskedasticity-autocorrelation consistent (HAC) standard errors (see Section )
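To illustrate the first remedy, the sketch below computes homoskedasticity-only and heteroskedasticity-consistent (White, HC0) standard errors by hand on simulated data. The data-generating process and the function name `ols_with_se` are assumptions for illustration; HAC standard errors are not covered by this sketch.

```python
import numpy as np

def ols_with_se(y, X):
    """OLS coefficients with homoskedasticity-only and HC0 (White) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Homoskedasticity-only covariance: s^2 (X'X)^{-1}
    se_homo = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))
    # HC0 sandwich: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}
    meat = (X * resid[:, None] ** 2).T @ X
    se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_homo, se_hc0

rng = np.random.default_rng(4)
n = 50_000
x = rng.uniform(0, 2, size=n)
u = rng.normal(size=n) * x**2     # assumed design: error variance grows with x
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
beta, se_homo, se_hc0 = ols_with_se(y, X)

print("slope estimate:              ", beta[1])
print("homoskedasticity-only SE:    ", se_homo[1])
print("heteroskedasticity-robust SE:", se_hc0[1])
```

In this design the slope estimate remains consistent, but the homoskedasticity-only standard error understates the sampling uncertainty, so confidence intervals based on it would be too narrow.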

5.2. Summary

Five threats to internal validity:
1. Omitted variables
2. Functional form misspecification
3. Errors-in-variables
4. Sample selection
5. Simultaneous causality

Remarks:
- Each of these, if present, results in failure of the first OLS assumption from Slide 18, $E(u_i \mid X_{1i}, \dots, X_{ki}) = 0$. The consequence is bias in the OLS estimators that persists in large samples.
- Incorrect calculation of standard errors poses a further threat to internal validity.
- Applying this list of threats to a multiple regression study provides a systematic way to assess the internal validity of that study.


More information

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Cointegration The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series. Economic theory, however, often implies equilibrium

More information

6/15/2005 7:54 PM. Affirmative Action s Affirmative Actions: A Reply to Sander

6/15/2005 7:54 PM. Affirmative Action s Affirmative Actions: A Reply to Sander Reply Affirmative Action s Affirmative Actions: A Reply to Sander Daniel E. Ho I am grateful to Professor Sander for his interest in my work and his willingness to pursue a valid answer to the critical

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Imputing Missing Data using SAS

Imputing Missing Data using SAS ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are

More information

An Introduction to Regression Analysis

An Introduction to Regression Analysis The Inaugural Coase Lecture An Introduction to Regression Analysis Alan O. Sykes * Regression analysis is a statistical tool for the investigation of relationships between variables. Usually, the investigator

More information

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Conducting an empirical analysis of economic data can be rewarding and

Conducting an empirical analysis of economic data can be rewarding and CHAPTER 10 Conducting a Regression Study Using Economic Data Conducting an empirical analysis of economic data can be rewarding and informative. If you follow some basic guidelines, it is possible to use

More information

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional

More information

Sample Size and Power in Clinical Trials

Sample Size and Power in Clinical Trials Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

Time Series Analysis

Time Series Analysis Time Series Analysis Forecasting with ARIMA models Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos (UC3M-UPM)

More information

T-test & factor analysis

T-test & factor analysis Parametric tests T-test & factor analysis Better than non parametric tests Stringent assumptions More strings attached Assumes population distribution of sample is normal Major problem Alternatives Continue

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Moderation. Moderation

Moderation. Moderation Stats - Moderation Moderation A moderator is a variable that specifies conditions under which a given predictor is related to an outcome. The moderator explains when a DV and IV are related. Moderation

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution

More information

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t -Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard

More information

European Journal of Business and Management ISSN 2222-1905 (Paper) ISSN 2222-2839 (Online) Vol.5, No.30, 2013

European Journal of Business and Management ISSN 2222-1905 (Paper) ISSN 2222-2839 (Online) Vol.5, No.30, 2013 The Impact of Stock Market Liquidity on Economic Growth in Jordan Shatha Abdul-Khaliq Assistant Professor,AlBlqa Applied University, Jordan * E-mail of the corresponding author: yshatha@gmail.com Abstract

More information

THE IMPACT OF COMPANY INCOME TAX AND VALUE-ADDED TAX ON ECONOMIC GROWTH: EVIDENCE FROM NIGERIA

THE IMPACT OF COMPANY INCOME TAX AND VALUE-ADDED TAX ON ECONOMIC GROWTH: EVIDENCE FROM NIGERIA THE IMPACT OF COMPANY INCOME TAX AND VALUE-ADDED TAX ON ECONOMIC GROWTH: EVIDENCE FROM NIGERIA Dr. Lyndon M. Etale and Dr. Paymaster F. Bingilar Department of Finance and Accountancy, Faculty of Management

More information

The relationship between stock market parameters and interbank lending market: an empirical evidence

The relationship between stock market parameters and interbank lending market: an empirical evidence Magomet Yandiev Associate Professor, Department of Economics, Lomonosov Moscow State University mag2097@mail.ru Alexander Pakhalov, PG student, Department of Economics, Lomonosov Moscow State University

More information

Econometric Principles and Data Analysis

Econometric Principles and Data Analysis Econometric Principles and Data Analysis product: 4339 course code: c230 c330 Econometric Principles and Data Analysis Centre for Financial and Management Studies SOAS, University of London 1999, revised

More information

Types of Error in Surveys

Types of Error in Surveys 2 Types of Error in Surveys Surveys are designed to produce statistics about a target population. The process by which this is done rests on inferring the characteristics of the target population from

More information

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but Test Bias As we have seen, psychological tests can be well-conceived and well-constructed, but none are perfect. The reliability of test scores can be compromised by random measurement error (unsystematic

More information

Forecasting Methods. What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes?

Forecasting Methods. What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes? Forecasting Methods What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes? Prod - Forecasting Methods Contents. FRAMEWORK OF PLANNING DECISIONS....

More information

Violent crime total. Problem Set 1

Violent crime total. Problem Set 1 Problem Set 1 Note: this problem set is primarily intended to get you used to manipulating and presenting data using a spreadsheet program. While subsequent problem sets will be useful indicators of the

More information

FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits

FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits Technical Paper Series Congressional Budget Office Washington, DC FORECASTING DEPOSIT GROWTH: Forecasting BIF and SAIF Assessable and Insured Deposits Albert D. Metz Microeconomic and Financial Studies

More information

Forecasting Using Eviews 2.0: An Overview

Forecasting Using Eviews 2.0: An Overview Forecasting Using Eviews 2.0: An Overview Some Preliminaries In what follows it will be useful to distinguish between ex post and ex ante forecasting. In terms of time series modeling, both predict values

More information

Quantile Regression under misspecification, with an application to the U.S. wage structure

Quantile Regression under misspecification, with an application to the U.S. wage structure Quantile Regression under misspecification, with an application to the U.S. wage structure Angrist, Chernozhukov and Fernandez-Val Reading Group Econometrics November 2, 2010 Intro: initial problem The

More information

Fixed-Effect Versus Random-Effects Models

Fixed-Effect Versus Random-Effects Models CHAPTER 13 Fixed-Effect Versus Random-Effects Models Introduction Definition of a summary effect Estimating the summary effect Extreme effect size in a large study or a small study Confidence interval

More information

Integrated Resource Plan

Integrated Resource Plan Integrated Resource Plan March 19, 2004 PREPARED FOR KAUA I ISLAND UTILITY COOPERATIVE LCG Consulting 4962 El Camino Real, Suite 112 Los Altos, CA 94022 650-962-9670 1 IRP 1 ELECTRIC LOAD FORECASTING 1.1

More information

Binomial Sampling and the Binomial Distribution

Binomial Sampling and the Binomial Distribution Binomial Sampling and the Binomial Distribution Characterized by two mutually exclusive events." Examples: GENERAL: {success or failure} {on or off} {head or tail} {zero or one} BIOLOGY: {dead or alive}

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Firm characteristics. The current issue and full text archive of this journal is available at www.emeraldinsight.com/0307-4358.htm

Firm characteristics. The current issue and full text archive of this journal is available at www.emeraldinsight.com/0307-4358.htm The current issue and full text archive of this journal is available at www.emeraldinsight.com/0307-4358.htm How firm characteristics affect capital structure: an empirical study Nikolaos Eriotis National

More information

Multiple Imputation for Missing Data: A Cautionary Tale

Multiple Imputation for Missing Data: A Cautionary Tale Multiple Imputation for Missing Data: A Cautionary Tale Paul D. Allison University of Pennsylvania Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics

More information

Uniwersytet Ekonomiczny

Uniwersytet Ekonomiczny Uniwersytet Ekonomiczny George Matysiak Introduction to modelling & forecasting December 15 th, 2014 Agenda Modelling and forecasting - Models Approaches towards modelling and forecasting Forecasting commercial

More information

Correlational Research

Correlational Research Correlational Research Chapter Fifteen Correlational Research Chapter Fifteen Bring folder of readings The Nature of Correlational Research Correlational Research is also known as Associational Research.

More information

Clustering in the Linear Model

Clustering in the Linear Model Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple

More information

Time Series Analysis

Time Series Analysis Time Series Analysis Identifying possible ARIMA models Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos

More information

Correlation of International Stock Markets Before and During the Subprime Crisis

Correlation of International Stock Markets Before and During the Subprime Crisis 173 Correlation of International Stock Markets Before and During the Subprime Crisis Ioana Moldovan 1 Claudia Medrega 2 The recent financial crisis has spread to markets worldwide. The correlation of evolutions

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

What drove Irish Government bond yields during the crisis?

What drove Irish Government bond yields during the crisis? What drove Irish Government bond yields during the crisis? David Purdue and Rossa White, September 2014 1. Introduction The Irish Government bond market has been exceptionally volatile in the seven years

More information