Endogeneity
The endogeneity problem. Proxy variables. Instrumental variables. STATA.
Gabriel Montes-Rojas
1 Gabriel V. Montes-Rojas
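2 The endogeneity problem

A variable $x_j$ is endogenous if $\operatorname{cov}(x_j, \text{error}) \neq 0$, $j = 1, 2, \ldots, K$. Consider the simple regression model

$$y = \beta_0 + \beta_1 x + u,$$

where $\operatorname{cov}(x, u) \neq 0$ (i.e. $x$ is endogenous). What is the problem? The OLS slope converges to

$$\operatorname{plim} \hat\beta_1^{OLS} = \frac{\operatorname{cov}(y, x)}{\operatorname{var}(x)} = \frac{\operatorname{cov}(\beta_0 + \beta_1 x + u, x)}{\operatorname{var}(x)} = \beta_1 + \frac{\operatorname{cov}(u, x)}{\operatorname{var}(x)} = \beta_1 + \text{BIAS}.$$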
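The bias formula can be checked numerically. The following is a minimal simulation sketch in Python (the slides themselves use Stata); the parameter values and the way $x$ and $u$ are made to share a common shock are assumptions chosen only for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

random.seed(0)
n = 50_000
beta0, beta1 = 1.0, 2.0                      # hypothetical true parameters

# x and u share the common shock e, so cov(x, u) = 0.5 and var(x) = 2.
e = [random.gauss(0, 1) for _ in range(n)]
x = [random.gauss(0, 1) + ei for ei in e]
u = [random.gauss(0, 1) + 0.5 * ei for ei in e]
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

beta1_ols = cov(y, x) / cov(x, x)            # OLS slope
bias = cov(u, x) / cov(x, x)                 # sample analogue of cov(u, x) / var(x)

print(f"OLS slope      : {beta1_ols:.3f}")
print(f"beta1 + bias   : {beta1 + bias:.3f}")
print(f"population bias: {0.5 / 2:.3f}")
```

In the sample, the OLS slope equals $\beta_1$ plus the sample bias term up to rounding error, and both approach $\beta_1 + 0.25$ under this assumed design.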
3 The endogeneity problem

Consider the wage equation model

$$\log(wage) = \beta_0 + \beta_1\, educ + \beta_2\, exper + \beta_3\, abil + v.$$

Our primary interest is in the estimation of $\beta_1$ and $\beta_2$. However, we know that if $abil$ is not observed, we would obtain biased estimators of them (Why? See Omitted Variables Bias). In practice, we can only estimate the model

$$\log(wage) = \beta_0 + \beta_1\, educ + \beta_2\, exper + u,$$

where $u \equiv \beta_3\, abil + v$. Here, the main problem is that $educ$ and $exper$ are not exogenous, i.e. they are endogenous: $\operatorname{cov}(educ, u) \neq 0$, $\operatorname{cov}(exper, u) \neq 0$.
4 The endogeneity problem

Consider now a general structural regression model of the form

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_K x_K + \gamma q + v, \qquad E(v \mid x_1, x_2, \ldots, x_K, q) = 0.$$

Assume that $q$ is not observable. Handle this by putting it into the error term, and assume $E(q) = 0$ (because there is an intercept this is not an issue):

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_K x_K + u, \qquad u \equiv \gamma q + v.$$

Consider the linear projection of $q$ on $x$,

$$q = \delta_0 + \delta_1 x_1 + \ldots + \delta_K x_K + r,$$

where by definition $E(r) = 0$ and $\operatorname{cov}(x_j, r) = 0$, $j = 1, 2, \ldots, K$. Then

$$y = (\beta_0 + \gamma\delta_0) + (\beta_1 + \gamma\delta_1) x_1 + (\beta_2 + \gamma\delta_2) x_2 + \ldots + (\beta_K + \gamma\delta_K) x_K + v + \gamma r,$$

and

$$\operatorname{plim} \hat\beta_j = \beta_j + \gamma\delta_j.$$
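The step from the linear projection to the estimable equation is a direct substitution. A sketch of the algebra, using only the symbols defined on this slide:

```latex
\begin{align*}
y &= \beta_0 + \sum_{j=1}^{K} \beta_j x_j + \gamma q + v \\
  &= \beta_0 + \sum_{j=1}^{K} \beta_j x_j
     + \gamma\Big(\delta_0 + \sum_{j=1}^{K} \delta_j x_j + r\Big) + v \\
  &= (\beta_0 + \gamma\delta_0) + \sum_{j=1}^{K} (\beta_j + \gamma\delta_j)\, x_j + (v + \gamma r).
\end{align*}
```

The composite error $v + \gamma r$ is uncorrelated with every $x_j$, so OLS is consistent for $\beta_j + \gamma\delta_j$ rather than for $\beta_j$.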
5 There are 3 possible solutions to the problem above:

1 Measure abil or q, the unobserved variable. In practice this is rarely feasible: how would you measure it?
2 Find a proxy variable for ability/q.
3 Find an instrumental variable for the endogenous variables.
6 Measurement errors in variables

Let the true model be

$$y = x^{*}\beta + u, \qquad u \sim iid(0, \sigma_u^2 I).$$

Assume that $x^{*}$ is measured with error, so that we actually observe $x$ such that

$$x = x^{*} + \nu, \qquad \nu \sim iid(0, \omega^2 I).$$

The vector $\nu$ is a vector of measurement errors, which are assumed (possibly unrealistically) to be independent of $x^{*}$ and $u$. Substituting $x - \nu$ for $x^{*}$ we have

$$y = x\beta - \nu\beta + u = x\beta + u^{*}, \qquad u^{*} \equiv u - \nu\beta.$$

Note that $u^{*}$ is not independent of $x$ because

$$E(x'u^{*}) = E\big((x^{*} + \nu)'(u - \nu\beta)\big) = -N\omega^2\beta.$$

Measurement errors in explanatory variables can therefore be seen as an endogeneity problem. If we assume that $\beta > 0$, the error term $u^{*}$ is negatively correlated with $x$. Then we have the attenuation bias,

$$\hat\beta \xrightarrow{p} \beta - N\omega^2 [E(x'x)]^{-1}\beta.$$
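A small simulation can make the attenuation bias concrete. This is a sketch in Python under assumed values (a scalar regressor without intercept, $\beta = 2$, unit variance for the true regressor and $\omega^2 = 1$); none of these numbers come from the slides.

```python
import random

random.seed(1)
n = 50_000
beta, omega2 = 2.0, 1.0          # hypothetical slope and measurement-error variance

x_star = [random.gauss(0, 1) for _ in range(n)]               # true regressor
x = [xs + random.gauss(0, omega2 ** 0.5) for xs in x_star]    # observed with error
y = [beta * xs + random.gauss(0, 1) for xs in x_star]         # true model, no intercept

xx = sum(xi * xi for xi in x)
beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / xx          # OLS of y on observed x
beta_formula = beta - n * omega2 / xx * beta                  # beta - N w^2 (x'x)^{-1} beta

print(f"OLS on mismeasured x: {beta_hat:.3f}")
print(f"attenuation formula : {beta_formula:.3f}")
```

With equal variances for the true regressor and the measurement error, the estimate is pulled roughly halfway toward zero, as the formula predicts.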
7 Consider the wage equation model

$$\log(wage) = \beta_0 + \beta_1\, educ + \beta_2\, exper + \beta_3\, abil + u.$$

Let $x = (educ, exper)$. A proxy variable for $abil$ is $IQ$. A proxy variable should satisfy:

1 $abil = \delta_0 + \delta_3\, IQ + v_3$, where $v_3$ is uncorrelated with $educ$, $exper$ and $IQ$.
2 $u$ is uncorrelated with $educ$, $exper$ and $abil$ (and with $IQ$). This can also be expressed as $E(lwage \mid x, abil, IQ) = E(lwage \mid x, abil)$, and we say that the proxy is irrelevant for explaining wages once the observables and the true variable $abil$ are used.

Then we can estimate

$$\log(wage) = (\beta_0 + \beta_3\delta_0) + \beta_1\, educ + \beta_2\, exper + \beta_3\delta_3\, IQ + u + \beta_3 v_3.$$
8 IQ as a proxy for ability

    use ..., clear
    reg lwage educ exper tenure married south urban black
    reg lwage educ exper tenure married south urban black IQ
    gen educiq=educ*IQ
    reg lwage educ exper tenure married south urban black IQ educiq

Standard errors (in parentheses) for specifications (1) to (3):

Variables   (1)       (2)       (3)
educ        (.006)    (.007)    (.041)
exper       (.002)    (.002)    (.003)
tenure      (.002)    (.002)    (.002)
married     (.039)    (.039)    (.039)
south       (.026)    (.026)    (.026)
urban       (.027)    (.027)    (.027)
black       (.038)    (.039)    (.040)
IQ                    (.0010)   (.0052)
educiq                          (.00038)
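9 Potential bias when using a proxy

Assume that

$$abil = \delta_0 + \delta_1\, educ + \delta_2\, exper + \delta_3\, IQ + v_3.$$

Then

$$y = (\beta_0 + \beta_3\delta_0) + (\beta_1 + \beta_3\delta_1)\, educ + (\beta_2 + \beta_3\delta_2)\, exper + \beta_3\delta_3\, IQ + u + \beta_3 v_3.$$

In this case, $IQ$ is called an imperfect proxy. OLS with imperfect proxies is inconsistent for $\beta_1$ and $\beta_2$: the coefficient on $educ$ converges to $\beta_1 + \beta_3\delta_1$ and the coefficient on $exper$ to $\beta_2 + \beta_3\delta_2$.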
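The difference between a perfect proxy ($\delta_1 = 0$) and an imperfect one ($\delta_1 \neq 0$) can be illustrated with a simulation. This Python sketch drops $exper$ for brevity and uses made-up parameter values; the helper `ols` and the function `educ_coef_with_proxy` are hypothetical names introduced here, not part of the slides.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for row in range(k):
            if row != c:
                f = A[row][c] / A[c][c]
                A[row] = [a - f * b for a, b in zip(A[row], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def educ_coef_with_proxy(delta1, n=20_000, seed=0):
    """Coefficient on educ when lwage is regressed on (1, educ, IQ)."""
    rng = random.Random(seed)
    b0, b1, b3 = 1.0, 0.08, 0.5          # hypothetical structural parameters
    d0, d3 = 0.0, 1.0
    X, y = [], []
    for _ in range(n):
        iq = rng.gauss(0, 1)
        educ = 0.5 * iq + rng.gauss(0, 1)
        abil = d0 + delta1 * educ + d3 * iq + rng.gauss(0, 0.5)   # last term is v3
        lwage = b0 + b1 * educ + b3 * abil + rng.gauss(0, 0.5)    # last term is u
        X.append([1.0, educ, iq])
        y.append(lwage)
    return ols(X, y)[1]

perfect = educ_coef_with_proxy(delta1=0.0)     # should be near beta1 = 0.08
imperfect = educ_coef_with_proxy(delta1=0.4)   # should be near beta1 + beta3*delta1 = 0.28
print(f"perfect proxy  : {perfect:.3f}")
print(f"imperfect proxy: {imperfect:.3f}")
```

Under these assumptions the proxy regression recovers $\beta_1$ only when $educ$ does not enter the projection of $abil$.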
10 Instrumental variables

Consider the simple regression model

$$y = \beta_0 + \beta_1 x + u,$$

where $\operatorname{cov}(x, u) \neq 0$ (i.e. $x$ is endogenous). A good instrumental variable (say $z$) satisfies these two conditions:

1 It is not correlated with the error term: $\operatorname{cov}(z, u) = 0$.
2 It is correlated with the endogenous variable: $\operatorname{cov}(x, z) \neq 0$.
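11 Instrumental variables: Main idea

How can we estimate $\beta_1$ using $z$? Note that

$$\beta_1 = \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)}.$$

Why?

$$\operatorname{cov}(z, y) = \operatorname{cov}(z, \beta_0 + \beta_1 x + u) = \operatorname{cov}(z, \beta_0) + \operatorname{cov}(z, \beta_1 x) + \operatorname{cov}(z, u) = \beta_1 \operatorname{cov}(z, x),$$

because $\operatorname{cov}(z, \beta_0) = 0$ and $\operatorname{cov}(z, u) = 0$. Dividing by $\operatorname{cov}(z, x) \neq 0$ gives the result.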
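12 IV as a two-stage least squares estimator

Consider a simple regression model where $\operatorname{cov}(x, u) \neq 0$:

$$y = \beta_0 + \beta_1 x + u.$$

Consider the auxiliary regression (first stage)

$$x = \gamma_0 + \gamma_1 z + r.$$

Take the predicted values $\hat{x} \equiv \gamma_0 + \gamma_1 z$. Note that $x = \hat{x} + r$ and $\gamma_1 = \operatorname{cov}(x, z) / \operatorname{Var}(z)$. Note also that $\hat{x}$ is uncorrelated with $r$ and that $\operatorname{cov}(\hat{x}, u) = 0$. Consider the regression model (second stage)

$$y = \beta_0 + \beta_1(\hat{x} + r) + u = \beta_0 + \beta_1 \hat{x} + v,$$

where $v \equiv \beta_1 r + u$ and $\operatorname{cov}(\hat{x}, v) = \operatorname{cov}(\hat{x}, \beta_1 r + u) = 0$. Then,

$$\hat\beta_1^{IV} = \frac{\operatorname{cov}(y, \hat{x})}{\operatorname{Var}(\hat{x})} = \frac{\operatorname{cov}\left(y, \frac{\operatorname{cov}(x,z)}{\operatorname{Var}(z)} z\right)}{\operatorname{Var}\left(\frac{\operatorname{cov}(x,z)}{\operatorname{Var}(z)} z\right)} = \frac{\operatorname{cov}(y, z)}{\operatorname{cov}(x, z)} = \beta_1.$$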
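The equivalence between the covariance ratio and the two-stage procedure can be verified on simulated data. The sketch below is in Python with assumed parameter values (true slope 2, instrument strength 0.8); it compares OLS, the ratio $\operatorname{cov}(z,y)/\operatorname{cov}(z,x)$ and the explicit two-stage calculation.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

random.seed(2)
n = 50_000
beta0, beta1 = 1.0, 2.0                       # hypothetical true parameters

z = [random.gauss(0, 1) for _ in range(n)]    # instrument, independent of u
e = [random.gauss(0, 1) for _ in range(n)]    # common shock that makes x endogenous
x = [0.8 * zi + ei + random.gauss(0, 1) for zi, ei in zip(z, e)]
u = [0.5 * ei + random.gauss(0, 1) for ei in e]
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

beta1_ols = cov(y, x) / cov(x, x)             # inconsistent: cov(x, u) != 0
beta1_iv = cov(z, y) / cov(z, x)              # IV as a ratio of covariances

# Two stages: regress x on z, then regress y on the fitted values.
gamma1 = cov(x, z) / cov(z, z)
gamma0 = sum(x) / n - gamma1 * sum(z) / n
x_hat = [gamma0 + gamma1 * zi for zi in z]
beta1_2s = cov(y, x_hat) / cov(x_hat, x_hat)

print(f"OLS: {beta1_ols:.3f}  IV ratio: {beta1_iv:.3f}  two-stage: {beta1_2s:.3f}")
```

The two IV calculations coincide up to rounding error, while OLS stays away from the true slope in this design.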
13 Instrumental variables: multiple regression model

Consider the model

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_K x_K + u,$$

where $\operatorname{cov}(x_K, u) \neq 0$ (i.e. $x_K$ is endogenous) and $\operatorname{cov}(x_j, u) = 0$, $j = 1, 2, \ldots, K-1$ (i.e. the rest are exogenous). A good instrumental variable (say $z$) satisfies these two conditions:

1 It is not correlated with the error term: $\operatorname{cov}(z, u) = 0$.
2 It is correlated with the endogenous variable. More formally, consider the linear projection of $x_K$ onto all the exogenous variables:

$$x_K = \delta_0 + \delta_1 x_1 + \ldots + \delta_{K-1} x_{K-1} + \theta z + r_K,$$

where by definition $E(r_K) = 0$ and $r_K$ is uncorrelated with $x_1, x_2, \ldots, x_{K-1}$ and $z$. The key assumption is that $\theta \neq 0$.
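14 Instrumental variables: identification

Consider a regression model $y = \mathbf{x}\beta + u$. Define $\mathbf{z} \equiv (1, x_1, \ldots, x_{K-1}, z)$, the vector of all exogenous variables. There are then $K$ orthogonality conditions: $E(\mathbf{z}'u) = 0$. Multiplying the regression model by $\mathbf{z}'$ and taking expectations we get

$$[E(\mathbf{z}'\mathbf{x})]\beta = E(\mathbf{z}'y),$$

where $E(\mathbf{z}'\mathbf{x})$ is a $K \times K$ matrix and $E(\mathbf{z}'y)$ is $K \times 1$. This system has a unique solution if and only if the former matrix has full rank, $\operatorname{rank} E(\mathbf{z}'\mathbf{x}) = K$, in which case

$$\beta = [E(\mathbf{z}'\mathbf{x})]^{-1} E(\mathbf{z}'y).$$

The instrumental variables (IV) estimator of $\beta$ is

$$\hat\beta_{IV} = \left(N^{-1}\sum_{i=1}^{N} \mathbf{z}_i'\mathbf{x}_i\right)^{-1}\left(N^{-1}\sum_{i=1}^{N} \mathbf{z}_i' y_i\right) = (Z'X)^{-1}(Z'y).$$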
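15 Instrumental variables: multiple instruments

When there are multiple instruments (say $M$: $z_1, z_2, \ldots, z_M$) the most efficient estimator is the two-stage least squares (2SLS):

$$\hat\beta_{2SLS} = \left(N^{-1}\sum_{i=1}^{N} \hat{\mathbf{x}}_i'\mathbf{x}_i\right)^{-1}\left(N^{-1}\sum_{i=1}^{N} \hat{\mathbf{x}}_i' y_i\right) = (\hat{X}'X)^{-1}(\hat{X}'y),$$

where

$$x_K = \delta_0 + \delta_1 x_1 + \ldots + \delta_{K-1} x_{K-1} + \theta_1 z_1 + \ldots + \theta_M z_M + r_K,$$
$$\hat{x}_K = \hat\delta_0 + \hat\delta_1 x_1 + \ldots + \hat\delta_{K-1} x_{K-1} + \hat\theta_1 z_1 + \ldots + \hat\theta_M z_M.$$

Note that $\hat{X} = Z(Z'Z)^{-1}Z'X = P_Z X$, i.e. it is a projection of $\mathbf{x}$ onto the space of $\mathbf{z} \equiv (1, x_1, \ldots, x_{K-1}, z_1, \ldots, z_M)$, where the projection matrix $P_Z$ is idempotent and symmetric. Therefore, $\hat{X}'\hat{X} = \hat{X}'X$. Thus the 2SLS estimator is an OLS estimator where $\hat{\mathbf{x}}$ is used in place of $\mathbf{x}$.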
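The identity between the two cross-product matrices follows from the two properties of the projection matrix. A short derivation, using only the notation of this slide:

```latex
\begin{align*}
\hat{X}'\hat{X} &= (P_Z X)'(P_Z X) = X' P_Z' P_Z X = X' P_Z X = (P_Z X)'X = \hat{X}'X, \\
\hat\beta_{2SLS} &= (\hat{X}'X)^{-1}\hat{X}'y = (\hat{X}'\hat{X})^{-1}\hat{X}'y .
\end{align*}
```

The second line is the OLS formula for a regression of $y$ on $\hat{X}$, which is why 2SLS can be computed in two stages.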
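16 Assumptions for identification and consistency of 2SLS

Assumption 2SLS.1: For some $1 \times L$ vector $\mathbf{z}$, $E(\mathbf{z}'u) = 0$.

Assumption 2SLS.2: (a) $\operatorname{rank} E(\mathbf{z}'\mathbf{z}) = L$; (b) $\operatorname{rank} E(\mathbf{z}'\mathbf{x}) = K$.

Necessary for the rank condition is the order condition $L \geq K$, i.e. there must be at least as many excluded instruments as endogenous variables. With a single endogenous variable the rank condition can be tested from the equation

$$x_K = \delta_0 + \delta_1 x_1 + \ldots + \delta_{K-1} x_{K-1} + \theta_1 z_1 + \ldots + \theta_M z_M + r_K$$

and the null hypothesis $H_0: \theta_1 = 0, \theta_2 = 0, \ldots, \theta_M = 0$. The rank condition holds if $H_0$ is rejected.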
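17 Identification of 2SLS

Assuming that $E(\mathbf{z}'\mathbf{z})$ is nonsingular, let $\mathbf{x}^{*} = \mathbf{z}\Pi$, where $\Pi = [E(\mathbf{z}'\mathbf{z})]^{-1}E(\mathbf{z}'\mathbf{x})$ is an $L \times K$ matrix. Multiplying the structural equation by $\mathbf{x}^{*\prime}$ and taking expectations we get

$$E(\mathbf{x}^{*\prime}y) = E(\mathbf{x}^{*\prime}\mathbf{x})\beta + E(\mathbf{x}^{*\prime}u) = E(\mathbf{x}^{*\prime}\mathbf{x})\beta.$$

Thus $\beta$ is identified by $\beta = [E(\mathbf{x}^{*\prime}\mathbf{x})]^{-1}E(\mathbf{x}^{*\prime}y)$. For this we need $E(\mathbf{x}^{*\prime}\mathbf{x})$ to be nonsingular. But

$$E(\mathbf{x}^{*\prime}\mathbf{x}) = \Pi' E(\mathbf{z}'\mathbf{x}) = E(\mathbf{x}'\mathbf{z})[E(\mathbf{z}'\mathbf{z})]^{-1}E(\mathbf{z}'\mathbf{x}).$$

This matrix is nonsingular if it has rank $K$, for which we need $E(\mathbf{z}'\mathbf{x})$ to have rank $K$ (Assumption 2SLS.2b). We also need $E(\mathbf{z}'\mathbf{z})$ to be nonsingular and thus to have rank $L$ (Assumption 2SLS.2a).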
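18 Consistency of 2SLS

$$\hat\beta_{2SLS} = \left[\left(N^{-1}\sum_{i=1}^{N}\mathbf{x}_i'\mathbf{z}_i\right)\left(N^{-1}\sum_{i=1}^{N}\mathbf{z}_i'\mathbf{z}_i\right)^{-1}\left(N^{-1}\sum_{i=1}^{N}\mathbf{z}_i'\mathbf{x}_i\right)\right]^{-1}\left(N^{-1}\sum_{i=1}^{N}\mathbf{x}_i'\mathbf{z}_i\right)\left(N^{-1}\sum_{i=1}^{N}\mathbf{z}_i'\mathbf{z}_i\right)^{-1}\left(N^{-1}\sum_{i=1}^{N}\mathbf{z}_i' y_i\right)$$

Consistency of 2SLS: Under Assumptions 2SLS.1 and 2SLS.2, $\operatorname{plim} \hat\beta_{2SLS} = \beta$.

Proof: Apply the law of large numbers and Slutsky's theorem.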
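19 Asymptotic normality of 2SLS

Assumption 2SLS.3: $E(u^2 \mathbf{z}'\mathbf{z}) = \sigma^2 E(\mathbf{z}'\mathbf{z})$, where $\sigma^2 = E(u^2)$.

Asymptotic normality of 2SLS: Under Assumptions 2SLS.1, 2SLS.2 and 2SLS.3,

$$\sqrt{N}(\hat\beta_{2SLS} - \beta) \xrightarrow{d} N\left(0, \sigma^2\left([E(\mathbf{x}'\mathbf{z})][E(\mathbf{z}'\mathbf{z})]^{-1}[E(\mathbf{z}'\mathbf{x})]\right)^{-1}\right).$$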
20 Testing for endogeneity

The 2SLS estimator is less efficient (i.e. has larger variance) than OLS when the explanatory variables are exogenous. Therefore, it is important to test for endogeneity first, in order to avoid using an IV estimator that is:

1 more computationally intensive (two stages instead of one)
2 less efficient
21 Testing for endogeneity

Consider the model

$$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + u,$$

where $y_2$ is (possibly) endogenous; $z_3$ and $z_4$ are IVs. In order to test for endogeneity:

1 Estimate $y_2 = \pi_0 + \pi_1 z_1 + \pi_2 z_2 + \pi_3 z_3 + \pi_4 z_4 + v_2$ and compute the residuals $\hat{v}_2$.
2 Estimate $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + \delta_1 \hat{v}_2 + error$.
3 Test for the significance of $\hat{v}_2$ in the latter model. If we reject $H_0: \delta_1 = 0$, then there is evidence that $u$ and $v_2$ are correlated, and therefore that $y_2$ is endogenous.
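The three steps can be sketched on simulated data. The Python code below is an illustration under assumed parameter values, with $u$ built to be correlated with $v_2$ so that the null should be rejected; `ols_se` is a hypothetical helper written for this example, and its standard errors are the usual homoskedastic OLS ones.

```python
import random

def ols_se(X, y):
    """OLS coefficients, homoskedastic standard errors and residuals."""
    n, k = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    A = [row + [float(i == j) for j in range(k)] for i, row in enumerate(XtX)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for row in range(k):
            if row != c:
                f = A[row][c]
                A[row] = [a - f * b for a, b in zip(A[row], A[c])]
    inv = [row[k:] for row in A]
    b = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return b, se, resid

rng = random.Random(3)
n = 5_000
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -0.5           # hypothetical structural parameters
Z, y1, y2 = [], [], []
for _ in range(n):
    z1, z2, z3, z4 = (rng.gauss(0, 1) for _ in range(4))
    v2 = rng.gauss(0, 1)
    u = 0.5 * v2 + rng.gauss(0, 1)              # u correlated with v2: y2 is endogenous
    y2_i = 0.5 * z1 + 0.5 * z2 + 0.7 * z3 + 0.7 * z4 + v2
    y1.append(b0 + b1 * y2_i + b2 * z1 + b3 * z2 + u)
    y2.append(y2_i)
    Z.append([1.0, z1, z2, z3, z4])

# Step 1: reduced form for y2 on all exogenous variables, keep the residuals.
_, _, v2_hat = ols_se(Z, y2)
# Step 2: add the residuals to the structural equation.
X_aug = [[1.0, y2_i, row[1], row[2], v] for y2_i, row, v in zip(y2, Z, v2_hat)]
coef, se, _ = ols_se(X_aug, y1)
# Step 3: t statistic for H0: delta1 = 0.
t_delta1 = coef[4] / se[4]
print(f"delta1_hat = {coef[4]:.3f}, t = {t_delta1:.1f}, beta1_hat = {coef[1]:.3f}")
```

In this design the t statistic on $\hat{v}_2$ is far from zero, and the coefficient on $y_2$ in the augmented regression is close to the true $\beta_1$.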
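22 Testing for endogeneity

Consider the Durbin-Wu-Hausman (DWH) test based on the comparison of $\hat\beta_{2SLS}$ and $\hat\beta_{OLS}$. (It follows the same idea as the Hausman test used to compare RE with FE.) Consider the null hypothesis of no endogeneity, $H_0: E(\mathbf{x}'u) = 0$. Then, under the null hypothesis:

1 $\operatorname{Avar}[\sqrt{N}(\hat\beta_{2SLS} - \hat\beta_{OLS})] = \sigma^2\left([E(\mathbf{x}^{*\prime}\mathbf{x}^{*})]^{-1} - [E(\mathbf{x}'\mathbf{x})]^{-1}\right)$.
2 Moreover, because OLS is more efficient than 2SLS, $[E(\mathbf{x}^{*\prime}\mathbf{x}^{*})]^{-1} - [E(\mathbf{x}'\mathbf{x})]^{-1}$ is positive semidefinite.
3 Both 2SLS and OLS are consistent for $\beta$.

The test statistic is

$$DWH = (\hat\beta_{2SLS} - \hat\beta_{OLS})'\left[(\hat{X}'\hat{X})^{-1} - (X'X)^{-1}\right]^{-}(\hat\beta_{2SLS} - \hat\beta_{OLS}) / \hat\sigma^2 \overset{a}{\sim} \chi^2,$$

where $[\cdot]^{-}$ denotes a generalized inverse, $\hat\sigma^2$ is a consistent estimator of $\sigma^2$, and the degrees of freedom equal the number of explanatory variables treated as potentially endogenous.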
23 Testing for the validity of the instruments: overidentification restrictions

This is a test that will tell you if the instruments are uncorrelated with the error term, an essential condition for the validity of the IVs.

Requirement: You need more IVs than endogenous variables.

In the model above, we can run the 2SLS with $z_3$ as the only IV; compute

$$\hat{u}_3 = y_1 - \hat\beta_0 - \hat\beta_1 y_2 - \hat\beta_2 z_1 - \hat\beta_3 z_2;$$

and then evaluate the regression model $\hat{u}_3 = \delta_0 + \delta_1 z_4 + error$; in particular, test the significance of $z_4$. This is a valid test for the validity of $z_4$ as an IV.

Important: it needs to assume that $z_3$ is a valid IV.
24 Testing for the validity of the instruments: Sargan test

1 $H_0$ can be interpreted as exogeneity of all the instruments (all variables in $\mathbf{z}$). If you reject $H_0$, one (or more) of your IVs is not exogenous.
2 Estimate the full 2SLS model with all IVs and obtain the residuals $\hat{u}$.
3 Regress $\hat{u}$ on ALL exogenous variables (i.e. the exogenous regressors, including a constant, and the IVs).
4 Consider $N R_u^2 \overset{a}{\sim} \chi^2_{L-K}$, where $R_u^2$ is taken from the last regression.
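The steps of the Sargan test can be sketched as a simulation. The Python code below assumes one endogenous regressor and two instruments, so $L - K = 1$; the function `sargan` and the helpers `ols` and `fitted` are hypothetical names, and the flag `invalid` places one instrument directly in the error term to show what a violation looks like.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for row in range(k):
            if row != c:
                f = A[row][c] / A[c][c]
                A[row] = [a - f * b for a, b in zip(A[row], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fitted(X, b):
    """Fitted values X b."""
    return [sum(bj * xj for bj, xj in zip(b, row)) for row in X]

def sargan(invalid, n=5_000, seed=4):
    """N * R^2 from regressing the 2SLS residuals on all exogenous variables."""
    rng = random.Random(seed)
    Z, x, y = [], [], []
    for _ in range(n):
        z1, z2, e = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = 0.5 * e + rng.gauss(0, 1) + (0.5 * z2 if invalid else 0.0)
        xi = z1 + z2 + e + rng.gauss(0, 1)                  # endogenous regressor
        Z.append([1.0, z1, z2])
        x.append(xi)
        y.append(1.0 + 2.0 * xi + u)
    x_hat = fitted(Z, ols(Z, x))                             # first stage
    b = ols([[1.0, xh] for xh in x_hat], y)                  # second stage
    u_hat = [yi - b[0] - b[1] * xi for yi, xi in zip(y, x)]  # residuals use x, not x_hat
    resid = [ui - fi for ui, fi in zip(u_hat, fitted(Z, ols(Z, u_hat)))]
    mean_u = sum(u_hat) / n
    r2 = 1.0 - sum(r * r for r in resid) / sum((ui - mean_u) ** 2 for ui in u_hat)
    return n * r2

CHI2_1_CRIT_5PCT = 3.841     # 5% critical value of a chi-square with 1 degree of freedom
stat_valid = sargan(invalid=False)
stat_invalid = sargan(invalid=True)
print(f"valid instruments  : N*R2 = {stat_valid:.2f}")
print(f"one invalid instrument: N*R2 = {stat_invalid:.2f} (critical value {CHI2_1_CRIT_5PCT})")
```

When one instrument enters the error term the statistic is far above the critical value, while with valid instruments it behaves like a draw from a $\chi^2_1$ distribution.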
25 How to do it in STATA?

Assume that x1 is endogenous and x2 is exogenous. Moreover, assume that you have two instruments available: z1 and z2.

    ivregress 2sls y (x1=z1 z2) x2
        (instrumental variables estimation)
    ivregress 2sls y (x1=z1 z2) x2, first
        (idem, requesting that the first-stage regression results are shown)
    estat firststage
        (test for the significance of the instruments; rule of thumb: F > 10)
    estat overid
        (test for the validity of the instruments; needs more instruments than endogenous variables)
    estat endogenous
        (test for the endogeneity of x1; the null hypothesis is that x1 is exogenous)
    reg x1 z1 z2 x2
    test z1 z2
        (test for the significance of the instruments; rule of thumb: F > 10)
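26 How to do it in STATA?

An intuitive way of understanding the IV estimator is to run a two-stage regression model. For instance, suppose you want to obtain:

    ivregress 2sls y (x1=z1 z2) x2

You can obtain the same coefficients by:

    reg x1 z1 z2 x2
    predict x1hat
    reg y x1hat x2

However, note that the standard errors are different: the second regression computes its residuals with x1hat instead of x1, so its reported standard errors are not the correct 2SLS standard errors.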
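Why the hand-made second stage reports different standard errors can be seen from the residual variance. The following Python sketch mirrors the three Stata commands on simulated data with assumed parameter values; `ols` is a hypothetical helper, and the comparison is between the residual variance computed with the fitted values and the one computed with the actual regressor.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for row in range(k):
            if row != c:
                f = A[row][c] / A[c][c]
                A[row] = [a - f * b for a, b in zip(A[row], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(5)
n = 20_000
rows = []
for _ in range(n):
    z1, z2, x2, e = (rng.gauss(0, 1) for _ in range(4))
    x1_i = 0.7 * z1 + 0.7 * z2 + 0.3 * x2 + e + rng.gauss(0, 1)   # endogenous regressor
    u = 0.5 * e + rng.gauss(0, 1)                                  # var(u) = 1.25
    rows.append((1.0 + 2.0 * x1_i + 0.5 * x2 + u, x1_i, x2, z1, z2))

y = [r[0] for r in rows]
x1 = [r[1] for r in rows]

# reg x1 z1 z2 x2 ; predict x1hat
first = [[1.0, r[3], r[4], r[2]] for r in rows]
g = ols(first, x1)
x1hat = [sum(gj * fj for gj, fj in zip(g, f)) for f in first]

# reg y x1hat x2
b = ols([[1.0, xh, r[2]] for xh, r in zip(x1hat, rows)], y)

def resid_var(x1_used):
    """Residual variance when x1_used stands in for x1 in the fitted equation."""
    res = [yi - b[0] - b[1] * xi - b[2] * r[2] for yi, xi, r in zip(y, x1_used, rows)]
    return sum(v * v for v in res) / (n - 3)

s2_naive = resid_var(x1hat)   # what the manual second stage uses
s2_2sls = resid_var(x1)       # what the 2SLS variance formula requires
print(f"coefficient on x1: {b[1]:.3f}")
print(f"sigma2 with x1hat: {s2_naive:.2f}   sigma2 with x1: {s2_2sls:.2f}")
```

The coefficients are the 2SLS ones, but the residual variance from the manual second stage is much larger here than the one based on the actual x1, so its standard errors should not be used for inference.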
27 Examples from Wooldridge's book
28 Training and wages

Consider the study of the effect of public-sponsored training programs. As argued in LaLonde (1995), public programs of training and employment are designed to improve participants' productive skills, which in turn would affect their earnings and dependency on social welfare benefits. We use the Job Training Partnership Act (JTPA), a public training program that has been extensively studied in the literature. For example, see Bloom et al. (1997) and Abadie, Angrist and Imbens (2002).

The JTPA was a large publicly-funded training program in the United States that began funding in October 1983 and continued until the late 1990s. We focus on the Title II subprogram, which was offered only to individuals with barriers to employment (long-term use of welfare, being a high-school drop-out, 15 or more recent weeks of unemployment, limited English proficiency, physical or mental disability, reading proficiency below 7th grade level or an arrest record). Individuals in the randomly assigned JTPA treatment group were offered training, while those in the control group were excluded for a period of 18 months.

Our interest lies in measuring the effect of training on participants' future earnings. We use the database in Abadie, Angrist and Imbens (2002), which contains information about adult male and female JTPA participants and non-participants. The database is jtpa.dta.
29 Training and wages

- earnings: 30 months accumulated earnings;
- jtpa offer: JTPA offer, dummy variable for individuals that received a JTPA offer;
- jtpa training: JTPA training, dummy variable for individuals that took JTPA training;
- sex: male dummy variable;
- hsorged: dummy variable for individuals with completed high school or GED;
- black: race dummy variable;
- hispanic: dummy variable for Hispanic individuals;
- married: dummy variable for married individuals;
- wkless13: dummy variable for individuals working less than 13 weeks in the past year;
- age2225, age2629, age3035, age3644 and age4554: age range indicator variables.
30 Training and wages

References

- Abadie, A., J. Angrist, and G. Imbens (2002). Instrumental variables estimates of the effect of subsidized training on the quantiles of trainee earnings. Econometrica 70.
- Bloom, H. S., L. L. Orr, S. H. Bell, G. Cave, F. Doolittle, W. Lin, and J. M. Bos (1997). The benefits and costs of JTPA Title II-A programs: Key findings from the National Job Training Partnership Act study. Journal of Human Resources 32.
- LaLonde, R. J. (1995). The promise of public-sponsored training programs. Journal of Economic Perspectives 9.
More informationIt s Hip to be Square: Using Quadratic First Stages to Investigate Instrument Validity and Heterogeneous Effects
It s Hip to be Square: Using Quadratic First Stages to Investigate Instrument Validity and Heterogeneous Effects Steven Dieterle University of Edinburgh Andy Snell University of Edinburgh August 25, 2014
More informationChapter 4: Vector Autoregressive Models
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationSolution to HW - 1. Problem 1. [Points = 3] In September, Chapel Hill s daily high temperature has a mean
Problem 1. [Points = 3] In September, Chapel Hill s daily high temperature has a mean of 81 degree F and a standard deviation of 10 degree F. What is the mean, standard deviation and variance in terms
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that
More informationPanel Data Econometrics
Panel Data Econometrics Master of Science in Economics - University of Geneva Christophe Hurlin, Université d Orléans University of Orléans January 2010 De nition A longitudinal, or panel, data set is
More informationThe performance of immigrants in the Norwegian labor market
J Popul Econ (1998) 11:293 303 Springer-Verlag 1998 The performance of immigrants in the Norwegian labor market John E. Hayfron Department of Economics, University of Bergen, Fosswinckelsgt. 6, N-5007
More informationMultivariate Analysis (Slides 13)
Multivariate Analysis (Slides 13) The final topic we consider is Factor Analysis. A Factor Analysis is a mathematical approach for attempting to explain the correlation between a large set of variables
More informationEstimating Price Elasticities in Differentiated Product Demand Models with Endogenous Characteristics
Estimating Price Elasticities in Differentiated Product Demand Models with Endogenous Characteristics Daniel A. Ackerberg UCLA Gregory S. Crawford University of Arizona March 27, 2009 Preliminary and Incomplete
More informationPanel Data Analysis in Stata
Panel Data Analysis in Stata Anton Parlow Lab session Econ710 UWM Econ Department??/??/2010 or in a S-Bahn in Berlin, you never know.. Our plan Introduction to Panel data Fixed vs. Random effects Testing
More informationOn Marginal Effects in Semiparametric Censored Regression Models
On Marginal Effects in Semiparametric Censored Regression Models Bo E. Honoré September 3, 2008 Introduction It is often argued that estimation of semiparametric censored regression models such as the
More informationAppendices with Supplementary Materials for CAPM for Estimating Cost of Equity Capital: Interpreting the Empirical Evidence
Appendices with Supplementary Materials for CAPM for Estimating Cost of Equity Capital: Interpreting the Empirical Evidence This document contains supplementary material to the paper titled CAPM for estimating
More informationStat 704 Data Analysis I Probability Review
1 / 30 Stat 704 Data Analysis I Probability Review Timothy Hanson Department of Statistics, University of South Carolina Course information 2 / 30 Logistics: Tuesday/Thursday 11:40am to 12:55pm in LeConte
More informationEconometric Methods for Panel Data
Based on the books by Baltagi: Econometric Analysis of Panel Data and by Hsiao: Analysis of Panel Data Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies
More informationBasic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
More informationChapter 1. Linear Panel Models and Heterogeneity
Chapter 1. Linear Panel Models and Heterogeneity Master of Science in Economics - University of Geneva Christophe Hurlin, Université d Orléans Université d Orléans January 2010 Introduction Speci cation
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationEstimating Marginal Returns to Education
Estimating Marginal Returns to Education Pedro Carneiro Department of Economics University College London Gower Street London WC1E 6BT United Kingdom James J. Heckman Department of Economics University
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
More informationInternet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence
Internet Appendix to CAPM for estimating cost of equity capital: Interpreting the empirical evidence This document contains supplementary material to the paper titled CAPM for estimating cost of equity
More informationAssociation Between Variables
Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi
More informationResponse to Critiques of Mortgage Discrimination and FHA Loan Performance
A Response to Comments Response to Critiques of Mortgage Discrimination and FHA Loan Performance James A. Berkovec Glenn B. Canner Stuart A. Gabriel Timothy H. Hannan Abstract This response discusses the
More informationIntroduction to Hypothesis Testing
I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true
More informationIntroduction to Path Analysis
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationNonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General
More informationOrthogonal Diagonalization of Symmetric Matrices
MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding
More informationDiagnostic Testing in Econometrics: Variable Addition, RESET, and Fourier Approximations *
Diagnostic Testing in Econometrics: Variable Addition, RESET, and Fourier Approximations * Linda F. DeBenedictis and David E. A. Giles Policy, Planning and Legislation Branch, Ministry of Social Services,
More informationTHE IMPACT OF 401(K) PARTICIPATION ON THE WEALTH DISTRIBUTION: AN INSTRUMENTAL QUANTILE REGRESSION ANALYSIS
THE IMPACT OF 41(K) PARTICIPATION ON THE WEALTH DISTRIBUTION: AN INSTRUMENTAL QUANTILE REGRESSION ANALYSIS VICTOR CHERNOZHUKOV AND CHRISTIAN HANSEN Abstract. In this paper, we use the instrumental quantile
More informationMagne Mogstad and Matthew Wiswall
Discussion Papers No. 586, May 2009 Statistics Norway, Research Department Magne Mogstad and Matthew Wiswall How Linear Models Can Mask Non-Linear Causal Relationships An Application to Family Size and
More informationThe Impact of Alcohol Consumption on Occupational Attainment in England
The Impact of Alcohol Consumption on Occupational Attainment in England ZIGGY MACDONALD and MICHAEL A. SHIELDS University of Leicester November 1998, Revised October 1999 In this study we provide evidence
More informationAnswer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade
Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements
More informationCovariance and Correlation
Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a data-based relative frequency distribution by measures of location and spread, such
More information