- Robert Lyons
- 8 years ago
4 Data Issues

4.1 Truncated Regression

- population model: $y_i = x_i\beta + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$
- given a random sample $\{y_i, x_i\}_{i=1}^{N}$, OLS is consistent and efficient
- the problem arises when only a non-random sample is available; specifically, $\{y_i, x_i\}$ is observed iff $y_i \le b_i$
- differs from the censored regression model in that $x_i$ is also unobserved
- examples
  - only individuals with income below the poverty line are surveyed
  - only firms with fewer than 100 employees are surveyed
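MLE

- the likelihood must account for truncation
- likelihood function: $\ln L(\theta) = \sum_i \ln[\Pr(y_i \mid x_i, \theta, b_i, y_i \le b_i)]$
- again, what is $\Pr(y_i \mid x_i, \theta, b_i, y_i \le b_i)$?
  - $\Pr(y_i \mid y_i \le b_i) = f(y_i)/F(b_i)$, where $f(\cdot)$ is the PDF of $y$ and $F(\cdot)$ is the CDF of $y$
  - division by $F(b_i)$ rescales the density so that it integrates to one over the observed range
- this implies that the likelihood function is
  $$\ln L(\theta) = \sum_i \ln\left[\frac{(1/\sigma)\,\phi(\varepsilon_i/\sigma)}{\Phi\big((b_i - x_i\beta)/\sigma\big)}\right], \qquad \varepsilon_i = y_i - x_i\beta$$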
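The truncated log-likelihood above can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not in the slides: a single regressor with no intercept, and the truncation probability evaluated at $(b_i - x_i\beta)/\sigma$. The function names and the toy data are hypothetical.

```python
from math import log
from statistics import NormalDist

N01 = NormalDist()  # standard normal: provides pdf() and cdf()


def truncated_loglik(y, x, b, beta, sigma):
    """Log-likelihood when (y_i, x_i) is observed only if y_i <= b_i."""
    total = 0.0
    for y_i, x_i, b_i in zip(y, x, b):
        eps = y_i - x_i * beta
        log_density = log(N01.pdf(eps / sigma) / sigma)  # ln[(1/sigma) phi(eps/sigma)]
        log_prob_observed = log(N01.cdf((b_i - x_i * beta) / sigma))  # ln Pr(y_i <= b_i | x_i)
        total += log_density - log_prob_observed
    return total


def untruncated_loglik(y, x, beta, sigma):
    """Ordinary normal log-likelihood that ignores truncation (for comparison)."""
    return sum(log(N01.pdf((y_i - x_i * beta) / sigma) / sigma) for y_i, x_i in zip(y, x))


# Toy data (hypothetical); every y_i lies below its truncation point b_i.
y = [0.5, 1.0, 1.8]
x = [1.0, 2.0, 3.0]
b = [2.0, 2.0, 2.0]
ll_trunc = truncated_loglik(y, x, b, beta=0.6, sigma=1.0)
ll_plain = untruncated_loglik(y, x, beta=0.6, sigma=1.0)
```

Because each truncation probability is below one, the truncated log-likelihood is larger than the untruncated one at the same parameter values. In practice the function would be handed to a numerical optimizer rather than evaluated at a fixed point.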
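truncation from above and below

- population model: $y_i = x_i\beta + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$, where $\{y_i, x_i\}$ is observed iff $a_i \le y_i \le b_i$
- likelihood function: $\ln L(\theta) = \sum_i \ln[\Pr(y_i \mid x_i, \theta, a_i, b_i, a_i \le y_i \le b_i)]$
- again, what is $\Pr(y_i \mid x_i, \theta, a_i, b_i, a_i \le y_i \le b_i)$? The density is now rescaled by the probability of falling between the two truncation points, so the likelihood function is
  $$\ln L(\theta) = \sum_i \ln\left[\frac{(1/\sigma)\,\phi(\varepsilon_i/\sigma)}{\Phi\big((b_i - x_i\beta)/\sigma\big) - \Phi\big((a_i - x_i\beta)/\sigma\big)}\right], \qquad \varepsilon_i = y_i - x_i\beta$$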
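marginal effects

- truncated from above only
  $$\frac{\partial E[y_i \mid y_i \le b_i]}{\partial x_k} = \beta_k\left(1 - \lambda_i^2 - \alpha_i\lambda_i\right), \qquad \alpha_i = \frac{b_i - x_i\beta}{\sigma}, \qquad \lambda_i = \frac{\phi(\alpha_i)}{\Phi(\alpha_i)}$$
- truncated from above and below
  $$\frac{\partial E[y_i \mid a_i \le y_i \le b_i]}{\partial x_k} = \beta_k\left\{1 - \lambda_i^2 - \alpha_{i2}\lambda_i - \frac{[b_i - a_i]\,\phi(\alpha_{i1})}{\sigma\,[\Phi(\alpha_{i2}) - \Phi(\alpha_{i1})]}\right\}$$
  where
  $$\alpha_{i1} = \frac{a_i - x_i\beta}{\sigma}, \qquad \alpha_{i2} = \frac{b_i - x_i\beta}{\sigma}, \qquad \lambda_i = \frac{\phi(\alpha_{i2}) - \phi(\alpha_{i1})}{\Phi(\alpha_{i2}) - \Phi(\alpha_{i1})}$$
  (this reduces to the one-sided case as $a_i \to -\infty$)
- STATA: -truncreg-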
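As a rough check on the one-sided formula, the sketch below compares the analytical scale factor $1 - \lambda_i^2 - \alpha_i\lambda_i$ with a numerical derivative of the truncated mean $E[y \mid y \le b] = x\beta - \sigma\lambda$. The function names and the evaluation point are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

N01 = NormalDist()  # standard normal


def truncated_mean(xb, b, sigma):
    """E[y | y <= b] when y ~ N(xb, sigma^2)."""
    alpha = (b - xb) / sigma
    lam = N01.pdf(alpha) / N01.cdf(alpha)
    return xb - sigma * lam


def marginal_effect_factor(xb, b, sigma):
    """Factor multiplying beta_k in dE[y | y <= b]/dx_k."""
    alpha = (b - xb) / sigma
    lam = N01.pdf(alpha) / N01.cdf(alpha)
    return 1.0 - lam ** 2 - alpha * lam


# Evaluate at an arbitrary point and compare with a central difference in xb.
xb0, b0, sigma0, h = 1.0, 1.5, 2.0, 1e-5
analytic = marginal_effect_factor(xb0, b0, sigma0)
numeric = (truncated_mean(xb0 + h, b0, sigma0) - truncated_mean(xb0 - h, b0, sigma0)) / (2 * h)
```

The factor lies strictly between zero and one, so the marginal effect in the truncated population is smaller in magnitude than $\beta_k$.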
4.2 Sample Selection (Incidental Truncation)

- population model: $y_i = x_i\beta + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$
- given a random sample $\{y_i, x_i\}_{i=1}^{N}$, OLS is consistent and efficient
- the problem arises when data on $y$ are only available for a non-random sample
- let $S_i = 1$ if $y_i$ is observed; $S_i = 0$ if $y_i$ is unobserved
- differs from the truncated regression model in that $x_i$ is observed regardless of $S_i$
- differs from the censored regression model in that there is no clear censoring rule; i.e., $S_i = 0$ implies nothing is known about $y_i$, whereas in censored regression we know on which side of the censoring point $c_i$ the value of $y_i$ lies
- implies the following data structure
  - have data on a random sample $\{y_i, x_i, S_i\}_{i=1}^{N}$, but $y_i = .$ (missing) if $S_i = 0$
  - can only use $M \equiv \sum_i S_i$ observations to estimate any model
- examples
  - wages only observed for workers
  - firm profits only observed for firms that remain in business
  - SAT scores only observed for test takers
  - house prices only observed for houses on the market
- issue: is OLS still unbiased and consistent?
- answer: it depends
exogenous sample selection

- if $S_i$ depends only on (i) exogenous observables, $x_i$, or (ii) unobservables, $u_i$, where $u_i \perp \varepsilon_i$, then OLS is unbiased and consistent, where estimation uses only the sub-sample of $M$ observations
- example: $w_i = \alpha + \beta\,educ_i + \gamma_1 age_i + \gamma_2 age_i^2 + \varepsilon_i$ and $\Pr(w_i \text{ observed}) = f(educ, age)$; then OLS using only workers is consistent
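endogenous sample selection

- model the outcome and selection equations simultaneously
  $$y_i = x_i\beta + \varepsilon_i$$
  $$S_i^* = z_i\gamma + u_i$$
  $$S_i = \begin{cases} 1 & \text{if } S_i^* > 0 \\ 0 & \text{if } S_i^* \le 0 \end{cases}$$
  $$y_i = . \ \text{ if } S_i = 0$$
  $$(\varepsilon_i, u_i) \sim N_2(0, 0, \sigma^2, 1, \rho)$$
- $x$, $z$ are exogenous
- $z = [x \ \ w]$, where $w$ contains the exclusion restriction(s)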
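problem

- $E[y \mid z] = x\beta$, but $E[y \mid z, S = 1] = x\beta + \rho\sigma\,\phi(z\gamma)/\Phi(z\gamma)$, where $\phi(z\gamma)/\Phi(z\gamma)$ is known as the Inverse Mills Ratio (IMR)
- implies that $E[y \mid z, S = 1] = x\beta$ iff $\rho = 0$
- OLS estimation of $y_i = x_i\beta + \tilde\varepsilon_i$ using only the $M$ selected observations omits the IMR term, which implies that
  $$\tilde\varepsilon_i = \rho\sigma\,\frac{\phi(z_i\gamma)}{\Phi(z_i\gamma)} + \eta_i, \qquad E[\eta_i \mid z_i, S_i = 1] = 0$$
  which is not mean zero, and is not independent of $x$

solution

- estimate the IMR (using $i = 1, \ldots, N$)
  - estimate a probit model, where $S$ is the dependent variable and $z$ are the covariates $\Rightarrow \hat\gamma$
  - obtain $\widehat{IMR}_i = \phi(z_i\hat\gamma)/\Phi(z_i\hat\gamma)$
- regress $y_i$ on $x_i$, $\widehat{IMR}_i$ via OLS (using $i = 1, \ldots, M$)
- test of endogenous selection: $H_0: \rho = 0$ versus $H_a: \rho \ne 0$ (a test on the coefficient of $\widehat{IMR}_i$)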
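The selection term can be illustrated with a small simulation. The sketch below holds the selection index $z\gamma$ fixed at a single value, draws correlated errors, and compares the mean of $\varepsilon$ among selected units with $\rho\sigma$ times the IMR. The parameter values, seed, and function names are assumptions made for the illustration, and the coefficient $\rho\sigma$ relies on $\mathrm{Var}(u) = 1$ and $\rho$ being the correlation.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

N01 = NormalDist()  # standard normal


def inverse_mills_ratio(index):
    """phi(index) / Phi(index)."""
    return N01.pdf(index) / N01.cdf(index)


def selected_error_mean(index, rho, sigma, n, seed=0):
    """Mean of eps among units with S = 1, where S = 1[index + u > 0]."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        # eps has standard deviation sigma and correlation rho with u
        eps = sigma * (rho * u + sqrt(1.0 - rho ** 2) * e)
        if index + u > 0:
            kept.append(eps)
    return fmean(kept)


index, rho, sigma = 0.3, 0.6, 2.0  # hypothetical values
simulated = selected_error_mean(index, rho, sigma, n=200_000)
theoretical = rho * sigma * inverse_mills_ratio(index)
```

With $\rho = 0$ the theoretical term vanishes, which is the exogenous-selection case in which OLS on the selected sample remains consistent.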
notes

- the usual OLS standard errors are incorrect since the IMR is predicted; must account for the additional uncertainty due to estimation of $\gamma$
- need an exclusion restriction(s), i.e., a variable in $z$ that is not in $x$; otherwise the model is identified only from the nonlinearity of the IMR, which arises solely from the assumption of joint normality
- STATA: -heckman-, -heckman2-
4.3 $\mathrm{Cov}(x, \varepsilon) \ne 0$

- OLS requires $\mathrm{Cov}(x_i, \varepsilon_i) = 0$; otherwise,
  $$E[\hat\beta_{ols}] = \beta + \frac{\mathrm{Cov}(x, \varepsilon)}{\mathrm{Var}(x)} \ne \beta$$
- the situation can arise for a number of reasons
  - omitted variable bias (unobserved heterogeneity)
  - reverse causation
  - measurement error
- terminology
  - $x$ is exogenous if it is uncorrelated with $\varepsilon$
  - $x$ is endogenous if it is correlated with $\varepsilon$
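4.3.1 Omitted Variable Bias

- a relevant regressor is excluded from the regression model and is correlated with $x$
- example
  $$y_i = \alpha + \beta x_i + \gamma w_i + \varepsilon_i \quad \text{(True Model)}$$
  $$y_i = \alpha + \beta x_i + \tilde\varepsilon_i \quad \text{(Estimated Model)}$$
  where $\tilde\varepsilon_i = \gamma w_i + \varepsilon_i$
- OLS on the estimated model yields
  $$E[\hat\beta_{ols}] = \beta + \frac{\mathrm{Cov}(x, \tilde\varepsilon)}{\mathrm{Var}(x)} = \beta + \frac{\mathrm{Cov}(x, \gamma w) + \mathrm{Cov}(x, \varepsilon)}{\mathrm{Var}(x)} = \beta + \frac{\gamma\,\mathrm{Cov}(x, w) + \mathrm{Cov}(x, \varepsilon)}{\mathrm{Var}(x)}$$
- if $\mathrm{Cov}(x, \varepsilon) = 0$ (i.e., the only source of correlation between $x$ and $\tilde\varepsilon$ is $w$), then
  $$E[\hat\beta_{ols}] = \beta + \gamma\,\frac{\mathrm{Cov}(x, w)}{\mathrm{Var}(x)} \gtrless \beta$$
  depending on $\mathrm{sgn}(\gamma)$ and the direction of the correlation between $x$ and $w$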
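The omitted-variable formula can be verified on a toy data set. In the sketch below the outcome is built without an error term, so $\mathrm{Cov}(x, \varepsilon) = 0$ holds exactly and the short-regression slope should equal $\beta + \gamma\,\mathrm{Cov}(x, w)/\mathrm{Var}(x)$ up to rounding. The data, parameter values, and helper names are hypothetical.

```python
from statistics import fmean


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(y, x):
    """Slope from a simple regression of y on x (with intercept)."""
    return cov(y, x) / cov(x, x)


alpha, beta, gamma = 1.0, 2.0, 3.0
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 0.5, 2.5, 2.0, 4.5, 3.5]  # positively correlated with x
y = [alpha + beta * xi + gamma * wi for xi, wi in zip(x, w)]  # true model, no noise

short_slope = ols_slope(y, x)                        # regression that omits w
predicted = beta + gamma * cov(x, w) / cov(x, x)     # beta plus the bias term
```

Here $\gamma > 0$ and $\mathrm{Cov}(x, w) > 0$, so the short regression overstates $\beta$.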
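notes

- $w$ may represent an observed variable that is excluded by mistake, or an unobserved variable that the analyst does not have data on
- in the multiple regression model, bias spills over across variables
- example
  $$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \gamma w_i + \varepsilon_i \quad \text{(True Model)}$$
  $$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + \tilde\varepsilon_i \quad \text{(Estimated Model)}$$
  where $\tilde\varepsilon_i = \gamma w_i + \varepsilon_i$
- if $\mathrm{Cov}(x_1, \tilde\varepsilon) = 0$, but $\mathrm{Cov}(x_2, \tilde\varepsilon) \ne 0$, then not only is $\hat\beta_2$ biased, but $\hat\beta_1$ is also biased iff $\mathrm{Cov}(x_1, x_2) \ne 0$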
4.3.2 Reverse Causation

- not only does $x$ have an effect on $y$, but $y$ also has an effect on $x$ (i.e., the two variables are jointly determined)
- example: wages of working women and the number of children. More children may reduce a woman's productivity at work, or increase her desire for a more flexible job (sacrificing pay), thus reducing her wage; low-wage women may opt for more children because the opportunity cost of their time is lower
- model
  $$y_i = \alpha + \beta x_i + \varepsilon_i$$
  $$x_i = \theta + \delta y_i + \mu_i$$
  where the parameters represent the structural parameters
- substitution for $y$ in the second equation reveals
  $$x_i = \theta + \delta\alpha + \delta\beta x_i + \delta\varepsilon_i + \mu_i = \frac{1}{1 - \delta\beta}\left(\theta + \delta\alpha + \delta\varepsilon_i + \mu_i\right)$$
  which implies that $\mathrm{Cov}(x, \varepsilon) \ne 0$
- intuitively, an unobserved shock to $y$ (i.e., $\varepsilon$) must be correlated with $x$ since changes in $y$ lead to changes in $x$
4.3.3 Measurement Error

- problem: data are measured imprecisely
- examples
  - recall error
  - coding errors
  - mis-information (e.g., overstate income, understate drug use)
  - rounding errors (e.g., labor supply = 40 hrs/wk, or rounded to nearest 5; income rounded to $1000s)
- two cases: (i) error in the dependent variable, or (ii) error(s) in independent variable(s)
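dependent variable

- true model: $y_i^* = \alpha + \beta x_i^* + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, where $*$ on a variable indicates correctly measured
- given a random sample $\{y_i^*, x_i^*\}_{i=1}^{N}$, OLS is consistent and efficient
- with measurement error, one does not observe $y_i^*$; instead one observes $y_i$, where
  $$\underbrace{y_i}_{\text{observed}} = \underbrace{y_i^*}_{\text{true}} + \underbrace{\mu_i}_{\text{measurement error}}, \qquad \mu_i \sim N(0, \sigma_\mu^2)$$
- reliability ratio: $RR = \mathrm{Var}(y^*)/\mathrm{Var}(y) \in [0, 1]$
- substitution implies that the estimated model is
  $$y_i = \alpha + \beta x_i^* + (\mu_i + \varepsilon_i) = \alpha + \beta x_i^* + \tilde\varepsilon_i$$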
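properties of OLS estimates

- $\hat\beta_{OLS}$ is unbiased and consistent iff $\mathrm{Cov}(x^*, \tilde\varepsilon) = 0$, which is the case if both terms below are zero
  $$\mathrm{Cov}(x^*, \tilde\varepsilon) = \underbrace{\mathrm{Cov}(x^*, \varepsilon)}_{0 \text{ by assumption}} + \underbrace{\mathrm{Cov}(x^*, \mu)}_{0 \text{ if ME} \,\perp\, x^*}$$
- $\hat\alpha_{OLS}$ is unbiased and consistent iff $\hat\beta_{OLS}$ is unbiased and consistent, since $\hat\alpha_{OLS} = \bar y - \hat\beta_{OLS}\,\bar x$, and $E[\tilde\varepsilon] = 0$, which is the case if
  $$E[\tilde\varepsilon] = \underbrace{E[\varepsilon]}_{0 \text{ by assumption}} + \underbrace{E[\mu]}_{0 \text{ if classical ME}}$$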
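- OLS standard errors are correct if $\mu_i$ is normally distributed, which implies $\tilde\varepsilon$ is normally distributed
  - this holds even if $\mathrm{Cov}(\mu, \varepsilon) \ne 0$
- what is $\sigma_{\tilde\varepsilon}^2$?
  $$\mathrm{Var}(\tilde\varepsilon) = \mathrm{Var}(\mu + \varepsilon) = \mathrm{Var}(\mu) + \mathrm{Var}(\varepsilon) + 2\,\mathrm{Cov}(\mu, \varepsilon) = \sigma_\mu^2 + \sigma_\varepsilon^2 + 2\rho\sigma_\mu\sigma_\varepsilon$$
  which is greater than $\mathrm{Var}(\varepsilon)$ if $\rho = 0$ (more generally, whenever $\rho > -\sigma_\mu/(2\sigma_\varepsilon)$)
- if $\mathrm{Var}(\tilde\varepsilon) > \mathrm{Var}(\varepsilon)$, then standard errors are larger

summary: Classical Errors-in-Variables (CEV) model

- assumptions
  - (i) $\mu_i \sim N(0, \sigma_\mu^2)$
  - (ii) $\mathrm{Cov}(\mu, \varepsilon) = 0$
  - (iii) $\mathrm{Cov}(x^*, \mu) = 0$
- implications
  - (i) OLS unbiased, consistent
  - (ii) standard errors are correct
  - (iii) $R^2$ falls and standard errors rise due to extra noise in the data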
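independent variable

- true model: $y_i^* = \alpha + \beta x_i^* + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, where $*$ on a variable indicates correctly measured
- given a random sample $\{y_i^*, x_i^*\}_{i=1}^{N}$, OLS is consistent and efficient
- with measurement error, one does not observe $x_i^*$; instead one observes $x_i$, where
  $$\underbrace{x_i}_{\text{observed}} = \underbrace{x_i^*}_{\text{true}} + \underbrace{\mu_i}_{\text{measurement error}}, \qquad \mu_i \sim N(0, \sigma_\mu^2)$$
- reliability ratio: $RR = \mathrm{Var}(x^*)/\mathrm{Var}(x) \in [0, 1]$
- substitution implies that the estimated model is
  $$y_i^* = \alpha + \beta x_i + (\varepsilon_i - \beta\mu_i) = \alpha + \beta x_i + \tilde\varepsilon_i$$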
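properties of OLS estimates

- $\hat\beta_{OLS}$ is unbiased and consistent iff $\mathrm{Cov}(x, \tilde\varepsilon) = 0$, which is not likely
  $$\mathrm{Cov}(x, \tilde\varepsilon) = \mathrm{Cov}(x, \varepsilon) - \mathrm{Cov}(x, \beta\mu) = \underbrace{\mathrm{Cov}(x^*, \varepsilon)}_{0 \text{ by assumption}} + \underbrace{\mathrm{Cov}(\mu, \varepsilon)}_{?} - \beta\underbrace{\mathrm{Cov}(x, \mu)}_{\ne 0}$$
- $\Rightarrow \hat\beta_{OLS}$ is unbiased and consistent if (i) $\beta = 0$ and $\mathrm{Cov}(\mu, \varepsilon) = 0$, or (ii) $\mathrm{Cov}(\mu, \varepsilon) = \beta\,\mathrm{Cov}(x, \mu)$
- $\hat\alpha_{OLS}$ is unbiased and consistent iff $\hat\beta_{OLS}$ is unbiased and consistent, since $\hat\alpha_{OLS} = \bar y - \hat\beta_{OLS}\,\bar x$, and $E[\tilde\varepsilon] = 0$, which is the case if
  $$E[\tilde\varepsilon] = \underbrace{E[\varepsilon]}_{0 \text{ by assumption}} - \beta\underbrace{E[\mu]}_{0 \text{ if classical ME}}$$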
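summary: Classical Errors-in-Variables (CEV) model

- assumptions
  - (i) $\mu_i \sim N(0, \sigma_\mu^2)$
  - (ii) $\mathrm{Cov}(\mu, \varepsilon) = 0$
  - (iii) $\mathrm{Cov}(x^*, \mu) = 0$
- implications
  - (i) OLS biased, inconsistent
  - (ii) $\hat\beta_{OLS}$ is attenuated toward zero (i.e., biased toward zero, biased down in absolute value, correct sign)
  $$\begin{aligned}
  \mathrm{plim}(\hat\beta_{OLS}) &= \beta + \frac{\mathrm{Cov}(x, \tilde\varepsilon)}{\mathrm{Var}(x)} = \beta + \frac{\mathrm{Cov}(x, \varepsilon - \beta\mu)}{\mathrm{Var}(x)} = \beta + \frac{\mathrm{Cov}(x, \varepsilon) - \beta\,\mathrm{Cov}(x, \mu)}{\mathrm{Var}(x)} \\
  &= \beta + \underbrace{\frac{\mathrm{Cov}(x, \varepsilon)}{\mathrm{Var}(x)}}_{=0} - \beta\underbrace{\frac{\mathrm{Cov}(x^*, \mu)}{\mathrm{Var}(x)}}_{=0} - \beta\underbrace{\frac{\mathrm{Cov}(\mu, \mu)}{\mathrm{Var}(x)}}_{=\sigma_\mu^2/\sigma_x^2} \\
  &= \beta\left[1 - \frac{\sigma_\mu^2}{\sigma_x^2}\right] = \beta\left[\frac{\sigma_x^2 - \sigma_\mu^2}{\sigma_x^2}\right] = \beta\underbrace{\left[\frac{\sigma_{x^*}^2}{\sigma_x^2}\right]}_{\in [0,1]} = \beta \cdot \underbrace{RR}_{\in [0,1]}
  \end{aligned}$$
  which is smaller than $\beta$ in absolute value, but of the same sign as $\beta$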
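A short simulation can illustrate the attenuation result $\mathrm{plim}(\hat\beta_{OLS}) = \beta \cdot RR$. The sketch below generates data under the CEV assumptions and regresses $y$ on the mismeasured $x$. The parameter values, sample size, seed, and function names are assumptions made for the illustration.

```python
import random
from statistics import fmean


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate_attenuation(beta, sd_true, sd_error, n, seed=0):
    """Return (OLS slope on mismeasured x, beta * reliability ratio)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sd_true) for _ in range(n)]
    x_obs = [xt + rng.gauss(0.0, sd_error) for xt in x_true]   # x = x* + mu
    y = [1.0 + beta * xt + rng.gauss(0.0, 1.0) for xt in x_true]
    slope = cov(y, x_obs) / cov(x_obs, x_obs)
    reliability = sd_true ** 2 / (sd_true ** 2 + sd_error ** 2)  # Var(x*) / Var(x)
    return slope, beta * reliability


slope_hat, plim_value = simulate_attenuation(beta=2.0, sd_true=1.0, sd_error=1.0, n=100_000)
```

With equal variances for the true regressor and the measurement error, the reliability ratio is one half, so the estimated slope should sit near half of the true $\beta$.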
  - (iii) in multiple regression
  $$y_i = \alpha + \beta x_i + \sum_{k=1}^{K} \gamma_k x_{ki} + \tilde\varepsilon_i$$
  where $x$ is a mismeasured version of $x^*$ and $x_k$, $k = 1, \ldots, K$, are correctly measured, $\hat\beta_{OLS}$ suffers from attenuation bias, and the $\hat\gamma_k$ are also biased in a complex way unless $x_k$ is uncorrelated with $x$
4.3.4 The Solution: Instrumental Variables

- goal: devise an alternative estimation technique to obtain consistent estimates when $x$ is endogenous
- solution: identify $\beta$ from exogenous variation in $x$
- suppose $x$ can be decomposed into two independent parts, $x = x^{en} + x^{ex}$, where
  $$\mathrm{Cov}(x, \varepsilon) = \mathrm{Cov}(x^{en}, \varepsilon) + \mathrm{Cov}(x^{ex}, \varepsilon)$$
  and $\mathrm{Cov}(x^{en}, \varepsilon) \ne 0$, but $\mathrm{Cov}(x^{ex}, \varepsilon) = 0$ (the superscripts label the endogenous and exogenous parts)
- the idea is to use the variation in $x$ due to $x^{ex}$ to identify $\beta$, and to ignore the variation in $x$ from $x^{en}$, since the impact of this variation on $y$ confounds the effects of $x$ and $\varepsilon$
- to use only the variation arising from $x^{ex}$, additional information is needed
- get this new information by adding data on a new variable, $z$, called an instrument, instrumental variable (IV), or exclusion restriction
- $z$ is an IV for $x$ iff
  - (i) $\mathrm{Cov}(x, z) \ne 0$
  - (ii) $\mathrm{Cov}(\varepsilon, z) = 0$
  - (iii) $E[y \mid x, z] = E[y \mid x]$ (i.e., $z$ has no direct effect on $y$; $z$ is excluded from the model for $y$)
- (i) and (ii) $\Rightarrow$ $z$ is correlated with $x$ only through the exogenous part of $x$
- estimation techniques
  - IV
  - Two-Stage Least Squares (TSLS or 2SLS)
  - MLE
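IV estimator

- model: $y_i = \alpha + \beta x_i + \varepsilon_i$ implies
  $$\mathrm{Cov}(y, z) = \mathrm{Cov}(\alpha, z) + \mathrm{Cov}(\beta x, z) + \mathrm{Cov}(\varepsilon, z) = \beta\,\mathrm{Cov}(x, z)$$
- estimator
  $$\hat\beta_{IV} = \frac{\widehat{\mathrm{Cov}}(y, z)}{\widehat{\mathrm{Cov}}(x, z)}$$
  which is consistent (though generally not unbiased in finite samples)
- formula
  $$\hat\beta_{IV} = \frac{\frac{1}{N-1}\sum_i (y_i - \bar y)(z_i - \bar z)}{\frac{1}{N-1}\sum_i (x_i - \bar x)(z_i - \bar z)}$$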
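The covariance-ratio formula is easy to try on simulated data. In the sketch below a common shock enters both $x$ and $\varepsilon$, so OLS is inconsistent, while $z$ shifts $x$ but is independent of $\varepsilon$. The data-generating process, seed, and names are hypothetical and serve only to illustrate the estimator.

```python
import random
from statistics import fmean


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(y, x):
    return cov(y, x) / cov(x, x)


def iv_slope(y, x, z):
    """Simple IV estimator: Cov(y, z) / Cov(x, z)."""
    return cov(y, z) / cov(x, z)


def simulate(n, beta=2.0, seed=0):
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        z_i = rng.gauss(0.0, 1.0)
        shock = rng.gauss(0.0, 1.0)                 # makes x endogenous
        x_i = z_i + shock + rng.gauss(0.0, 1.0)
        eps_i = shock + rng.gauss(0.0, 1.0)
        y.append(1.0 + beta * x_i + eps_i)
        x.append(x_i)
        z.append(z_i)
    return y, x, z


y, x, z = simulate(50_000)
beta_ols = ols_slope(y, x)     # probability limit is beta + Cov(x, eps)/Var(x) = 2 + 1/3
beta_iv = iv_slope(y, x, z)    # probability limit is beta = 2
```

In this design the OLS slope drifts toward roughly 2.33 while the IV slope stays close to the true value of 2.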
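properties of $\hat\beta_{IV}$

- $\hat\beta_{IV}$ is consistent
  $$\begin{aligned}
  \mathrm{plim}\,\hat\beta_{IV} &= \mathrm{plim}\,\frac{\frac{1}{N-1}\sum_i y_i(z_i - \bar z)}{\frac{1}{N-1}\sum_i x_i(z_i - \bar z)} = \mathrm{plim}\,\frac{\frac{1}{N-1}\sum_i (\alpha + \beta x_i + \varepsilon_i)(z_i - \bar z)}{\frac{1}{N-1}\sum_i x_i(z_i - \bar z)} \\
  &= \mathrm{plim}\,\frac{\frac{1}{N-1}\sum_i \beta x_i(z_i - \bar z)}{\frac{1}{N-1}\sum_i x_i(z_i - \bar z)} = \beta
  \end{aligned}$$
  using $\sum_i \alpha(z_i - \bar z) = 0$ and $\mathrm{Cov}(\varepsilon, z) = 0$
- $\hat\alpha_{IV}$ is consistent, since $\hat\alpha_{IV} = \bar y - \hat\beta_{IV}\,\bar x$
- $\mathrm{Var}(\varepsilon) = \sigma^2$, estimated by
  $$\hat\sigma^2 = \frac{1}{N-2}\sum_i (y_i - \hat\alpha_{IV} - \hat\beta_{IV} x_i)^2$$
- $\mathrm{Var}(\hat\beta_{IV})$
  $$\mathrm{Var}(\hat\beta_{IV}) = \frac{\sigma^2}{N\,\mathrm{Var}(x)\,\rho_{x,z}^2}, \qquad \widehat{\mathrm{Var}}(\hat\beta_{IV}) = \frac{\hat\sigma^2}{\sum_i (x_i - \bar x)^2\,R_{x,z}^2}$$
  where $R_{x,z}^2$ is the sample counterpart of $\rho_{x,z}^2$ (they coincide in simple OLS); the variance is decreasing in $\mathrm{Var}(x)$ and $|\rho_{x,z}|$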
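notes

- $\mathrm{Var}(\hat\beta_{IV}) > \mathrm{Var}(\hat\beta_{OLS})$ if $\rho_{x,z}^2 < 1$
  - recall, $\widehat{\mathrm{Var}}(\hat\beta_{OLS}) = \hat\sigma^2 / \sum_i (x_i - \bar x)^2$
  - it is inefficient to use IV if $x$ is exogenous
- IV is algebraically equivalent to OLS using $x$ as an instrument for itself
  $$\hat\beta_{IV} = \frac{\frac{1}{N-1}\sum_i (y_i - \bar y)(z_i - \bar z)}{\frac{1}{N-1}\sum_i (x_i - \bar x)(z_i - \bar z)} = \frac{\frac{1}{N-1}\sum_i (y_i - \bar y)(x_i - \bar x)}{\frac{1}{N-1}\sum_i (x_i - \bar x)^2} = \hat\beta_{OLS}$$
  and
  $$\hat\alpha_{IV} = \bar y - \hat\beta_{IV}\,\bar x = \bar y - \hat\beta_{OLS}\,\bar x = \hat\alpha_{OLS}$$
  and
  $$\widehat{\mathrm{Var}}(\hat\beta_{IV}) = \frac{\hat\sigma^2}{\sum_i (x_i - \bar x)^2 R_{x,z}^2} = \frac{\hat\sigma^2}{\sum_i (x_i - \bar x)^2 R_{x,x}^2} = \frac{\hat\sigma^2}{\sum_i (x_i - \bar x)^2} = \widehat{\mathrm{Var}}(\hat\beta_{OLS})$$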
- multiple regression with only 1 endogenous var
  - exogenous $x$'s serve as instruments for themselves
  - solution is simple using matrix algebra
- multiple regression with more than 1 endogenous var
  - need a unique instrument for each endogenous var
  - exogenous $x$'s serve as instruments for themselves
  - solution is simple using matrix algebra
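TSLS

- estimation proceeds in 2 steps
- first-stage: $x_i = \delta + \pi z_i + \mu_i$
  - estimable via OLS $\Rightarrow \hat x_i$
  - $\mathrm{Cov}(x, \varepsilon) \ne 0 \Rightarrow \mathrm{Cov}(\mu, \varepsilon) \ne 0$
  - $\hat x_i$ varies across $i$ due to variation in $z_i$ (not $\mu_i$, since $\hat x_i$ does not depend on $\mu_i$)
- second-stage: $y_i = \alpha + \beta\hat x_i + \tilde\varepsilon_i$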
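The two steps can be written out directly for the case of one endogenous regressor and one instrument. In this exactly identified case the second-stage slope coincides with the covariance-ratio IV estimator, which the sketch below checks on a small hypothetical data set. The helper names and the numbers are assumptions for illustration only.

```python
from statistics import fmean


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = fmean(a), fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_fit(y, x):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    slope = cov(y, x) / cov(x, x)
    return fmean(y) - slope * fmean(x), slope


def tsls_slope(y, x, z):
    # First stage: regress x on z and form fitted values x_hat.
    delta_hat, pi_hat = ols_fit(x, z)
    x_hat = [delta_hat + pi_hat * z_i for z_i in z]
    # Second stage: regress y on x_hat.
    _, beta_hat = ols_fit(y, x_hat)
    return beta_hat


z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3]
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.5]

beta_tsls = tsls_slope(y, x, z)
beta_iv = cov(y, z) / cov(x, z)
```

Running the second stage by hand reproduces the point estimate only; the standard errors it reports would treat $\hat x_i$ as data and therefore need adjusting.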
notes

- $\hat\beta_{TSLS}$ is consistent
- standard errors need to be adjusted since $\hat x_i$ is a predicted regressor
- if there are multiple endogenous vars, need a unique IV for each endogenous $x$
- if the second-stage contains other exogenous vars, these vars must be included in the first-stage
- a test of $\pi \ne 0$ is a test for $\mathrm{Cov}(x, z) \ne 0$
- can test endogeneity using a Hausman test comparing $\hat\beta_{TSLS}$ with $\hat\beta_{OLS}$
- if there is more than 1 IV for an endogenous var, then the model is overidentified (as opposed to exactly identified)
  - the test of non-zero covariance between the set of IVs and $x$ is given by a test that the coeffs on all IVs are jointly equal to zero
  - enables other tests for instrument validity
  - GMM estimation is more efficient if $\varepsilon$ is heteroskedastic
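MLE

- estimate the first- and second-stage simultaneously, but the second-stage is replaced with the reduced form (i.e., $y$ is expressed solely as a function of exogenous variables in the model)
- model
  $$x_i = \delta + \pi z_i + \mu_i$$
  $$y_i = \alpha + \beta x_i + \varepsilon_i \quad \text{(structural eqn)}$$
  $$\phantom{y_i} = (\alpha + \beta\delta) + \beta\pi z_i + (\varepsilon_i + \beta\mu_i) = (\alpha + \beta\delta) + \beta\pi z_i + \tilde\varepsilon_i \quad \text{(reduced form)}$$
  where $(\varepsilon, \mu) \sim N_2(0, \Sigma)$ (bivariate normal dbn) and
  $$\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 & \rho\sigma_\varepsilon\sigma_\mu \\ \rho\sigma_\varepsilon\sigma_\mu & \sigma_\mu^2 \end{bmatrix}$$
  is a 2x2 symmetric, positive definite matrix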
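- the joint dbn of the reduced form errors is $(\tilde\varepsilon, \mu) \sim N_2(0, \tilde\Sigma)$, where
  $$\tilde\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 + \beta^2\sigma_\mu^2 + 2\beta\rho\sigma_\varepsilon\sigma_\mu & \rho\sigma_\varepsilon\sigma_\mu + \beta\sigma_\mu^2 \\ \rho\sigma_\varepsilon\sigma_\mu + \beta\sigma_\mu^2 & \sigma_\mu^2 \end{bmatrix}$$
- derive $\ln L(\theta)$, where $\theta = \{\delta, \pi, \alpha, \beta, \sigma_\varepsilon, \sigma_\mu, \rho\}$
  $$\ln L(\theta) = \sum_i \ln[\Pr(y_i, x_i \mid z_i, \theta)] = \sum_i \ln[\Pr(\tilde\varepsilon_i, \mu_i \mid z_i, \theta)] = \sum_i \ln\left[J\,\phi_2\!\left(\frac{\tilde\varepsilon_i}{\sigma_{\tilde\varepsilon}}, \frac{\mu_i}{\sigma_\mu}, \tilde\Sigma\right)\right]$$
  where $J$ is the determinant of the Jacobian and $\phi_2$ is the bivariate std normal pdf
- estimates obtained as
  $$\arg\max_\theta \ \ln L(\theta) = \sum_i \ln\left[J\,\phi_2\!\left(\frac{\tilde\varepsilon_i}{\sigma_{\tilde\varepsilon}}, \frac{\mu_i}{\sigma_\mu}, \tilde\Sigma\right)\right]$$
- a test of $H_0: \pi = 0$ is a test for $\mathrm{Cov}(x, z) \ne 0$
- the test of endogeneity is given by $H_0: \rho = 0$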
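specification tests

- testing endogeneity
  - may be relevant for economic reasons
  - relevant since OLS is more efficient if $x$ is exogenous
- Hausman test
  - if $x$ is exogenous, then $\hat\beta_{IV} \approx \hat\beta_{OLS}$
  - if $x$ is endogenous, then $\hat\beta_{IV} \ne \hat\beta_{OLS}$
  - define the test statistic based on the difference $\hat\beta_{IV} - \hat\beta_{OLS}$
  $$H = \left(\hat\beta_{IV} - \hat\beta_{OLS}\right)'\left(\hat\Sigma_{IV} - \hat\Sigma_{OLS}\right)^{-1}\left(\hat\beta_{IV} - \hat\beta_{OLS}\right) \sim \chi_K^2$$
  where $K$ = # of $x$'s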
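For a single regressor ($K = 1$) the Hausman statistic collapses to a squared difference divided by a difference in variances. The sketch below covers only that scalar special case; the function name and the input numbers are hypothetical, and 3.841 is the 5% critical value of the $\chi_1^2$ distribution.

```python
CHI2_1_CRITICAL_5PCT = 3.841  # 5% critical value of chi-squared with 1 degree of freedom


def hausman_scalar(b_iv, b_ols, var_iv, var_ols):
    """Hausman statistic for one coefficient: (b_iv - b_ols)^2 / (var_iv - var_ols)."""
    var_diff = var_iv - var_ols
    if var_diff <= 0:
        # In finite samples the estimated variance difference can be non-positive,
        # in which case the statistic is not defined.
        raise ValueError("estimated Var(IV) must exceed estimated Var(OLS)")
    return (b_iv - b_ols) ** 2 / var_diff


# Hypothetical estimates, for illustration only.
h_stat = hausman_scalar(b_iv=2.0, b_ols=2.3, var_iv=0.02, var_ols=0.005)
reject_exogeneity = h_stat > CHI2_1_CRITICAL_5PCT
```

With several regressors the same idea requires the matrix form of the statistic shown in the slide.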
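Durbin-Wu-Hausman test

- model
  $$x_i = \delta + \pi z_i + \mu_i$$
  $$y_i = \alpha + \beta x_i + \varepsilon_i$$
- $x$ is endogenous iff $\mathrm{Cov}(\mu, \varepsilon) \ne 0$
- steps:
  - (i) estimate the first-stage via OLS and obtain the residuals $\hat\mu_i$
  - (ii) estimate $y_i = \alpha + \beta x_i + \delta\hat\mu_i + \tilde\varepsilon_i$ via OLS (here $\delta$ denotes the coefficient on $\hat\mu_i$, not the first-stage intercept)
  - (iii) test $H_0: \delta = 0$; rejection implies $x$ is endogenous
- if there are multiple endogenous vars, then conduct the joint test $H_0: \delta_1 = \ldots = \delta_K = 0$ ($K$ = # of endog vars)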
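testing overidentifying restrictions

- if # IVs > # endogenous vars, can test whether $\mathrm{Cov}(z, \varepsilon) = 0$
- steps:
  - (i) regress $y$ on $x$ via TSLS $\Rightarrow \hat\alpha_{TSLS}, \hat\beta_{TSLS} \Rightarrow \hat\varepsilon_i$
  - (ii) regress $\hat\varepsilon_i$ on the $z$'s (all IVs) $\Rightarrow R^2$
  - (iii) test statistic $NR^2 \sim \chi_q^2$, where $q$ is the # of overidentifying restrictions
- intuition: if $\mathrm{Cov}(z, \varepsilon) = 0$, then the explanatory power of the second regression should be small, i.e., $R^2$ close to zero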
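weak IV

- weak IV $\Rightarrow \mathrm{Cov}(x, z) \approx 0$
- can show
  $$\mathrm{plim}\,\hat\beta_{IV} = \beta + \frac{\rho_{z,\varepsilon}}{\rho_{z,x}}\,\frac{\sigma_\varepsilon}{\sigma_x}$$
- if $z$ is a valid IV, then $|\rho_{z,x}| > 0$ and $\rho_{z,\varepsilon} = 0 \Rightarrow \mathrm{plim}\,\hat\beta_{IV} = \beta$
- but, if $\rho_{z,x} \approx 0$ and/or $\rho_{z,\varepsilon} \ne 0$, then $\mathrm{plim}\,\hat\beta_{IV} \ne \beta$
- OLS
  $$\mathrm{plim}\,\hat\beta_{OLS} = \beta + \rho_{x,\varepsilon}\,\frac{\sigma_\varepsilon}{\sigma_x}$$
  and the asymptotic bias of OLS is smaller than that of IV iff
  $$\frac{|\rho_{z,\varepsilon}|}{|\rho_{z,x}|} > |\rho_{x,\varepsilon}|$$
  which becomes more likely as $\rho_{z,x} \to 0$
- STATA: -ivreg2-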
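The two asymptotic bias expressions can be compared numerically. The sketch below codes them directly and evaluates a weak-instrument and a strong-instrument scenario; the correlation values and function names are hypothetical and chosen only to show how a small $\rho_{z,x}$ magnifies even a slight violation of $\rho_{z,\varepsilon} = 0$.

```python
def iv_asymptotic_bias(rho_z_eps, rho_z_x, sd_eps, sd_x):
    """plim(beta_IV) - beta = (rho_{z,eps} / rho_{z,x}) * (sigma_eps / sigma_x)."""
    return (rho_z_eps / rho_z_x) * (sd_eps / sd_x)


def ols_asymptotic_bias(rho_x_eps, sd_eps, sd_x):
    """plim(beta_OLS) - beta = rho_{x,eps} * (sigma_eps / sigma_x)."""
    return rho_x_eps * (sd_eps / sd_x)


ols_bias = ols_asymptotic_bias(rho_x_eps=0.3, sd_eps=1.0, sd_x=1.0)

# Same mild instrument invalidity (rho_{z,eps} = 0.05), different instrument strength.
weak_iv_bias = iv_asymptotic_bias(rho_z_eps=0.05, rho_z_x=0.1, sd_eps=1.0, sd_x=1.0)
strong_iv_bias = iv_asymptotic_bias(rho_z_eps=0.05, rho_z_x=0.8, sd_eps=1.0, sd_x=1.0)

ols_beats_weak_iv = abs(ols_bias) < abs(weak_iv_bias)
ols_beats_strong_iv = abs(ols_bias) < abs(strong_iv_bias)
```

In the weak case the ratio $|\rho_{z,\varepsilon}|/|\rho_{z,x}|$ exceeds $|\rho_{x,\varepsilon}|$, so OLS has the smaller asymptotic bias; with the strong instrument the ranking reverses.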
More informationCorrelated Random Effects Panel Data Models
INTRODUCTION AND LINEAR MODELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. The Linear
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationMaximum Likelihood Estimation
Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for
More informationMultiple Choice Models II
Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical
More informationIntroduction: Overview of Kernel Methods
Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University
More informationPremium Copayments and the Trade-off between Wages and Employer-Provided Health Insurance. February 2011
PRELIMINARY DRAFT DO NOT CITE OR CIRCULATE COMMENTS WELCOMED Premium Copayments and the Trade-off between Wages and Employer-Provided Health Insurance February 2011 By Darren Lubotsky School of Labor &
More informationChapter 10: Basic Linear Unobserved Effects Panel Data. Models:
Chapter 10: Basic Linear Unobserved Effects Panel Data Models: Microeconomic Econometrics I Spring 2010 10.1 Motivation: The Omitted Variables Problem We are interested in the partial effects of the observable
More informationWooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions
Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially
More informationUniversity of Maryland Fraternity & Sorority Life Spring 2015 Academic Report
University of Maryland Fraternity & Sorority Life Academic Report Academic and Population Statistics Population: # of Students: # of New Members: Avg. Size: Avg. GPA: % of the Undergraduate Population
More informationGender Effects in the Alaska Juvenile Justice System
Gender Effects in the Alaska Juvenile Justice System Report to the Justice and Statistics Research Association by André Rosay Justice Center University of Alaska Anchorage JC 0306.05 October 2003 Gender
More informationMortgage Lending Discrimination and Racial Differences in Loan Default
Mortgage Lending Discrimination and Racial Differences in Loan Default 117 Journal of Housing Research Volume 7, Issue 1 117 Fannie Mae Foundation 1996. All Rights Reserved. Mortgage Lending Discrimination
More information1 Teaching notes on GMM 1.
Bent E. Sørensen January 23, 2007 1 Teaching notes on GMM 1. Generalized Method of Moment (GMM) estimation is one of two developments in econometrics in the 80ies that revolutionized empirical work in
More informationLinear Models for Continuous Data
Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear
More informationA Basic Introduction to Missing Data
John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item
More informationEmployer-Provided Health Insurance and Labor Supply of Married Women
Upjohn Institute Working Papers Upjohn Research home page 2011 Employer-Provided Health Insurance and Labor Supply of Married Women Merve Cebi University of Massachusetts - Dartmouth and W.E. Upjohn Institute
More informationOctober 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix
Linear Algebra & Properties of the Covariance Matrix October 3rd, 2012 Estimation of r and C Let rn 1, rn, t..., rn T be the historical return rates on the n th asset. rn 1 rṇ 2 r n =. r T n n = 1, 2,...,
More informationPerformance Related Pay and Labor Productivity
DISCUSSION PAPER SERIES IZA DP No. 2211 Performance Related Pay and Labor Productivity Anne C. Gielen Marcel J.M. Kerkhofs Jan C. van Ours July 2006 Forschungsinstitut zur Zukunft der Arbeit Institute
More informationECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE
ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE YUAN TIAN This synopsis is designed merely for keep a record of the materials covered in lectures. Please refer to your own lecture notes for all proofs.
More informationFinancial Risk Management Exam Sample Questions/Answers
Financial Risk Management Exam Sample Questions/Answers Prepared by Daniel HERLEMONT 1 2 3 4 5 6 Chapter 3 Fundamentals of Statistics FRM-99, Question 4 Random walk assumes that returns from one time period
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationMagne Mogstad and Matthew Wiswall
Discussion Papers No. 586, May 2009 Statistics Norway, Research Department Magne Mogstad and Matthew Wiswall How Linear Models Can Mask Non-Linear Causal Relationships An Application to Family Size and
More informationVI. Real Business Cycles Models
VI. Real Business Cycles Models Introduction Business cycle research studies the causes and consequences of the recurrent expansions and contractions in aggregate economic activity that occur in most industrialized
More informationComparison of Estimation Methods for Complex Survey Data Analysis
Comparison of Estimation Methods for Complex Survey Data Analysis Tihomir Asparouhov 1 Muthen & Muthen Bengt Muthen 2 UCLA 1 Tihomir Asparouhov, Muthen & Muthen, 3463 Stoner Ave. Los Angeles, CA 90066.
More informationSection 1: Simple Linear Regression
Section 1: Simple Linear Regression Carlos M. Carvalho The University of Texas McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction
More informationAverage Redistributional Effects. IFAI/IZA Conference on Labor Market Policy Evaluation
Average Redistributional Effects IFAI/IZA Conference on Labor Market Policy Evaluation Geert Ridder, Department of Economics, University of Southern California. October 10, 2006 1 Motivation Most papers
More informationFinance 400 A. Penati - G. Pennacchi Market Micro-Structure: Notes on the Kyle Model
Finance 400 A. Penati - G. Pennacchi Market Micro-Structure: Notes on the Kyle Model These notes consider the single-period model in Kyle (1985) Continuous Auctions and Insider Trading, Econometrica 15,
More informationComparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors
Comparing Features of Convenient Estimators for Binary Choice Models With Endogenous Regressors Arthur Lewbel, Yingying Dong, and Thomas Tao Yang Boston College, University of California Irvine, and Boston
More informationIdentification and Inference in a Simultaneous Equation Under Alternative Information Sets and Sampling Schemes. Jan F. KIVIET
Division of Economics, EGC School of Humanities and Social Sciences Nanyang Technological University 4 Nanyang Drive Singapore 63733 Identification and Inference in a Simultaneous Equation Under Alternative
More informationWhat s New in Econometrics? Lecture 8 Cluster and Stratified Sampling
What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and
More informationMissing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13
Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional
More informationSections 2.11 and 5.8
Sections 211 and 58 Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I 1/25 Gesell data Let X be the age in in months a child speaks his/her first word and
More informationProbability Calculator
Chapter 95 Introduction Most statisticians have a set of probability tables that they refer to in doing their statistical wor. This procedure provides you with a set of electronic statistical tables that
More informationAsymmetry and the Cost of Capital
Asymmetry and the Cost of Capital Javier García Sánchez, IAE Business School Lorenzo Preve, IAE Business School Virginia Sarria Allende, IAE Business School Abstract The expected cost of capital is a crucial
More informationESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics
ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Quantile Treatment Effects 2. Control Functions
More information