Chapter 3: The Multiple Linear Regression Model

Advanced Econometrics - HEC Lausanne
Christophe Hurlin, University of Orléans
November 23, 2013

Section 1. Introduction

The objectives of this chapter are the following:
1. Define the multiple linear regression model.
2. Introduce the ordinary least squares (OLS) estimator.

The outline of this chapter is the following:
Section 2: The multiple linear regression model
Section 3: The ordinary least squares estimator
Section 4: Statistical properties of the OLS estimator
Subsection 4.1: Finite sample properties
Subsection 4.2: Asymptotic properties

References
Amemiya, T. (1985), Advanced Econometrics, Harvard University Press.
Greene, W. (2007), Econometric Analysis, sixth edition, Pearson - Prentice Hall (recommended).
Pelgrin, F. (2010), Lecture notes, Advanced Econometrics, HEC Lausanne (special thanks).
Ruud, P. (2000), An Introduction to Classical Econometric Theory, Oxford University Press.

Notations: In this chapter, I will (try to...) follow some conventions of notation.
$f_Y(y)$: probability density or mass function
$F_Y(y)$: cumulative distribution function
$\Pr(\cdot)$: probability
$y$: vector
$Y$: matrix
Be careful: in this chapter, I don't distinguish between a random vector (matrix) and a vector (matrix) of deterministic elements.
For more appropriate notations, see: Abadir and Magnus (2002), Notation in econometrics: a proposal for a standard, Econometrics Journal.

Section 2. The Multiple Linear Regression Model

Objectives
1. Define the concept of multiple linear regression model.
2. Semi-parametric and parametric multiple linear regression model.
3. The multiple linear Gaussian model.

Definition (Multiple linear regression model)
The multiple linear regression model is used to study the relationship between a dependent variable and one or more independent variables. The generic form of the linear regression model is
$$y = x_1 \beta_1 + x_2 \beta_2 + \dots + x_K \beta_K + \varepsilon$$
where $y$ is the dependent or explained variable and $x_1, \dots, x_K$ are the independent or explanatory variables.

Notations
1. $y$ is the dependent variable, the regressand or the explained variable.
2. $x_j$ is an explanatory variable, a regressor or a covariate.
3. $\varepsilon$ is the error term or disturbance.
IMPORTANT: do not use the term "residual" for $\varepsilon$.

Notations (cont'd)
The term $\varepsilon$ is a random disturbance, so named because it disturbs an otherwise stable relationship. The disturbance arises for several reasons:
1. Primarily because we cannot hope to capture every influence on an economic variable in a model, no matter how elaborate. The net effect, which can be positive or negative, of these omitted factors is captured in the disturbance.
2. There are many other contributors to the disturbance in an empirical model. Probably the most significant is errors of measurement. It is easy to theorize about the relationships among precisely defined variables; it is quite another to obtain accurate measures of these variables.

Notations (cont'd)
We assume that each observation in a sample $\{y_i, x_{i1}, x_{i2}, \dots, x_{iK}\}$ for $i = 1, \dots, N$ is generated by an underlying process described by
$$y_i = x_{i1} \beta_1 + x_{i2} \beta_2 + \dots + x_{iK} \beta_K + \varepsilon_i$$
Remark: $x_{ik}$ is the value of the $k$-th explanatory variable for the $i$-th unit of the sample, i.e. the subscripts read $x_{\text{unit},\,\text{variable}}$.

Notations (cont'd)
Let the $N \times 1$ column vector $x_k$ be the $N$ observations on variable $x_k$, for $k = 1, \dots, K$.
Let us assemble these data in an $N \times K$ data matrix, $X$.
Let $y$ be the $N \times 1$ column vector of the $N$ observations, $y_1, y_2, \dots, y_N$.
Let $\varepsilon$ be the $N \times 1$ column vector containing the $N$ disturbances.
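
Notations (cont'd)
$$\underset{N \times 1}{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_N \end{pmatrix} \quad \underset{N \times 1}{x_k} = \begin{pmatrix} x_{1k} \\ x_{2k} \\ \vdots \\ x_{ik} \\ \vdots \\ x_{Nk} \end{pmatrix} \quad \underset{N \times 1}{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_N \end{pmatrix} \quad \underset{K \times 1}{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix}$$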

Notations (cont'd)
$$\underset{N \times K}{X} = (x_1 : x_2 : \dots : x_K)$$
or equivalently
$$\underset{N \times K}{X} = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1k} & \dots & x_{1K} \\ x_{21} & x_{22} & \dots & x_{2k} & \dots & x_{2K} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_{i1} & x_{i2} & \dots & x_{ik} & \dots & x_{iK} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_{N1} & x_{N2} & \dots & x_{Nk} & \dots & x_{NK} \end{pmatrix}$$

Fact
In most cases, the first column of $X$ is assumed to be a column of 1s, $x_1 = 1_{N \times 1}$, so that $\beta_1$ is the constant term in the model.
$$\underset{N \times K}{X} = \begin{pmatrix} 1 & x_{12} & \dots & x_{1k} & \dots & x_{1K} \\ 1 & x_{22} & \dots & x_{2k} & \dots & x_{2K} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & x_{i2} & \dots & x_{ik} & \dots & x_{iK} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & x_{N2} & \dots & x_{Nk} & \dots & x_{NK} \end{pmatrix}$$

Remark
More generally, the matrix $X$ may contain both stochastic and non-stochastic elements, such as:
- Constant;
- Time trend;
- Dummy variables (for specific episodes in time);
- Etc.
Therefore, $X$ is generally a mixture of fixed and random variables.

Definition (Simple linear regression model)
The simple linear regression model is a model with only one stochastic regressor: $K = 1$ if there is no constant,
$$y_i = \beta_1 x_i + \varepsilon_i$$
or $K = 2$ if there is a constant,
$$y_i = \beta_1 + \beta_2 x_{i2} + \varepsilon_i$$
for $i = 1, \dots, N$, or in vector form
$$y = \beta_1 + \beta_2 x_2 + \varepsilon$$

Definition (Multiple linear regression model)
The multiple linear regression model can be written
$$\underset{N \times 1}{y} = \underset{N \times K}{X} \, \underset{K \times 1}{\beta} + \underset{N \times 1}{\varepsilon}$$
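
As a rough illustration of the stacked notation, the following Python sketch simulates one data set of the form $y = X\beta + \varepsilon$ with a column of ones as first regressor. The sample size, the parameter values, the normal draws and the use of NumPy are assumptions made for the example only; they are not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

N, K = 50, 3                        # hypothetical sample size and number of regressors
beta = np.array([1.0, 2.0, -0.5])   # hypothetical parameter vector (K x 1)

# First column: ones (constant term); remaining K-1 columns: random regressors.
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
eps = rng.normal(size=N)            # disturbances (N x 1)
y = X @ beta + eps                  # stacked model y = X beta + eps

print(X.shape, beta.shape, y.shape)
```

The shapes printed at the end correspond to the dimensions $N \times K$, $K \times 1$ and $N \times 1$ written under the symbols in the definition.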

One key difference for the specification of the MLRM: parametric/semi-parametric specification.
- Parametric model: the distribution of the error terms is fully characterized, e.g. $\varepsilon \sim \mathcal{N}(0, \Omega)$.
- Semi-parametric specification: only a few moments of the error terms are specified, e.g. $E(\varepsilon) = 0$ and $V(\varepsilon) = E(\varepsilon \varepsilon^\top) = \Omega$.

This difference does not matter for the derivation of the ordinary least squares estimator.
But this difference matters for (among others):
1. The characterization of the statistical properties of the OLS estimator (e.g., efficiency);
2. The choice of alternative estimators (e.g., the maximum likelihood estimator);
3. Etc.

Definition (Semi-parametric multiple linear regression model)
The semi-parametric multiple linear regression model is defined by
$$y = X\beta + \varepsilon$$
where the error term $\varepsilon$ satisfies
$$E(\varepsilon \mid X) = 0_{N \times 1}$$
$$V(\varepsilon \mid X) = \sigma^2 I_N \quad (N \times N)$$
and $I_N$ is the identity matrix of order $N$.

Remarks
1. If the matrix $X$ is non-stochastic (fixed), i.e. there are only fixed regressors, then the conditions on the error term $\varepsilon$ read:
$$E(\varepsilon) = 0 \qquad V(\varepsilon) = \sigma^2 I_N$$
2. If the (conditional) variance covariance matrix of $\varepsilon$ is not of the form $\sigma^2 I_N$, i.e. if
$$V(\varepsilon \mid X) = \Omega$$
the model is called the Multiple Generalized Linear Regression Model.

Remarks (cont'd)
The two conditions on the error term $\varepsilon$
$$E(\varepsilon \mid X) = 0_{N \times 1} \qquad V(\varepsilon \mid X) = \sigma^2 I_N$$
are equivalent to
$$E(y \mid X) = X\beta \qquad V(y \mid X) = \sigma^2 I_N$$
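
The equivalence can be checked in one line each, since $X\beta$ is a constant once we condition on $X$. The short derivation below is a sketch that uses only the model $y = X\beta + \varepsilon$:

```latex
\begin{aligned}
E(y \mid X) &= E(X\beta + \varepsilon \mid X) = X\beta + E(\varepsilon \mid X) = X\beta, \\
V(y \mid X) &= V(X\beta + \varepsilon \mid X) = V(\varepsilon \mid X) = \sigma^2 I_N .
\end{aligned}
```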

Definition (The multiple linear Gaussian model)
The (parametric) multiple linear Gaussian model is defined by
$$y = X\beta + \varepsilon$$
where the error term $\varepsilon$ is normally distributed
$$\varepsilon \sim \mathcal{N}(0, \sigma^2 I_N)$$
As a consequence, the vector $y$ has a conditional normal distribution with
$$y \mid X \sim \mathcal{N}(X\beta, \sigma^2 I_N)$$

Remarks
1. The multiple linear Gaussian model is (by definition) a parametric model.
2. If the matrix $X$ is non-stochastic (fixed), i.e. there are only fixed regressors, then the vector $y$ has a marginal normal distribution:
$$y \sim \mathcal{N}(X\beta, \sigma^2 I_N)$$

The classical linear regression model consists of a set of assumptions that describes how the data set is produced by a data generating process (DGP):
Assumption 1: Linearity
Assumption 2: Full rank condition or identification
Assumption 3: Exogeneity
Assumption 4: Spherical error terms
Assumption 5: Data generation
Assumption 6: Normal distribution

Definition (Assumption 1: Linearity)
The model is linear with respect to the parameters $\beta_1, \dots, \beta_K$.

Remarks
The model specifies a linear relationship between the dependent variable and the regressors. For instance, the models
$$y = \beta_0 + \beta_1 x + u$$
$$y = \beta_0 + \beta_1 \cos(x) + v$$
$$y = \beta_0 + \beta_1 \frac{1}{x} + w$$
are all linear (with respect to $\beta$). In contrast, the model
$$y = \beta_0 + \beta_1 x^{\beta_2} + \varepsilon$$
is nonlinear.

Remark
The model can be linear after some transformations. Starting from $y = A x^{\beta} \exp(\varepsilon)$, one has a log-linear specification:
$$\ln(y) = \ln(A) + \beta \ln(x) + \varepsilon$$

Definition (Log-linear model)
The log-linear model is
$$\ln(y_i) = \beta_1 \ln(x_{i1}) + \beta_2 \ln(x_{i2}) + \dots + \beta_K \ln(x_{iK}) + \varepsilon_i$$
This equation is also known as the constant elasticity form since, in this equation, the elasticity of $y$ with respect to changes in $x_k$ does not vary with $x_{ik}$:
$$\beta_k = \frac{\partial \ln(y_i)}{\partial \ln(x_{ik})} = \frac{\partial y_i}{\partial x_{ik}} \frac{x_{ik}}{y_i}$$
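
The constant elasticity property can be checked numerically. The sketch below takes the deterministic part of the model, $y = A x^{\beta}$, and approximates $(\partial y / \partial x)(x / y)$ by a central finite difference at two different values of $x$. The values of `A` and `beta`, the step `h` and the helper names are hypothetical choices for the example.

```python
A, beta = 2.0, 0.7  # hypothetical scale and elasticity


def y(x):
    """Deterministic part of the log-linear model: y = A * x**beta."""
    return A * x ** beta


def elasticity(x, h=1e-6):
    """Approximate (dy/dx) * (x / y) with a central finite difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return dydx * x / y(x)


e_low = elasticity(1.0)    # elasticity evaluated at x = 1
e_high = elasticity(10.0)  # elasticity evaluated at x = 10
print(e_low, e_high)       # both should be close to beta
```

Both evaluations should return approximately the same number, the exponent `beta`, whatever the point at which the elasticity is computed.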

Definition (Assumption 2: Full column rank)
$X$ is an $N \times K$ matrix with rank $K$.

Interpretation
1. There is no exact linear relationship among any of the independent variables in the model.
2. The columns of $X$ are linearly independent.

Example
Suppose that a cross-section model satisfies:
$$y_i = \beta_0 + \beta_1 \, \text{non labor income}_i + \beta_2 \, \text{salary}_i + \beta_3 \, \text{total income}_i + \varepsilon_i$$
The identification condition does not hold since total income is exactly equal to salary plus non labor income (exact linear dependency in the model).
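
The failure of the rank condition in this example can be seen numerically. The following sketch builds a small design matrix with made-up income figures (the numbers are purely illustrative) and computes its rank with NumPy; the explicit tolerance `tol=1e-8` is a choice made for the example.

```python
import numpy as np

# Hypothetical data for six individuals (illustrative numbers only).
non_labor = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 4.0])
salary = np.array([20.0, 25.0, 30.0, 22.0, 28.0, 35.0])
total = non_labor + salary  # exact linear dependency

# Constant, non labor income, salary, total income: K = 4 columns.
X = np.column_stack([np.ones(6), non_labor, salary, total])
rank_full = np.linalg.matrix_rank(X, tol=1e-8)

# Dropping the redundant regressor leaves K = 3 linearly independent columns.
X_reduced = X[:, :3]
rank_reduced = np.linalg.matrix_rank(X_reduced, tol=1e-8)

print(rank_full, X.shape[1])            # rank is smaller than the number of columns
print(rank_reduced, X_reduced.shape[1])  # rank equals the number of columns
```

With the total income column included, the rank is strictly smaller than the number of columns, so $X^\top X$ is singular; removing one of the three income variables restores full column rank.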

Remarks
1. Perfect multi-collinearity is generally not difficult to spot and is signalled by most statistical software.
2. Imperfect multi-collinearity is a more serious issue (see further).

Definition (Identification)
The multiple linear regression model is identifiable if and only if one of the following equivalent assertions holds:
(i) $\operatorname{rank}(X) = K$
(ii) The matrix $X^\top X$ is invertible
(iii) The columns of $X$ form a basis of $L(X)$
(iv) $X\beta_1 = X\beta_2 \Longrightarrow \beta_1 = \beta_2 \quad \forall (\beta_1, \beta_2) \in \mathbb{R}^K \times \mathbb{R}^K$
(v) $X\beta = 0 \Longrightarrow \beta = 0 \quad \forall \beta \in \mathbb{R}^K$
(vi) $\ker(X) = \{0\}$

Definition (Assumption 3: Strict exogeneity of the regressors)
The regressors are exogenous in the sense that:
$$E(\varepsilon \mid X) = 0_{N \times 1}$$
or equivalently, for all the units $i \in \{1, \dots, N\}$,
$$E(\varepsilon_i \mid X) = 0$$
which implies in particular
$$E(\varepsilon_i \mid x_{jk}) = 0$$
for any explanatory variable $k \in \{1, \dots, K\}$ and any unit $j \in \{1, \dots, N\}$.

Comments
1. The expected value of the error term at observation $i$ (in the sample) is not a function of the independent variables observed at any observation (including the $i$-th observation). The independent variables are not predictors of the error terms.
2. The strict exogeneity condition can be rewritten as:
$$E(y \mid X) = X\beta$$
3. If the regressors are fixed, this condition can be rewritten as:
$$E(\varepsilon) = 0_{N \times 1}$$

Implications
The (strict) exogeneity condition $E(\varepsilon \mid X) = 0_{N \times 1}$ has two implications:
1. The zero conditional mean of $\varepsilon$ implies that the unconditional mean of $\varepsilon$ is also zero (the reverse is not true):
$$E(\varepsilon) = E_X\left(E(\varepsilon \mid X)\right) = E_X(0) = 0$$
2. The zero conditional mean of $\varepsilon$ implies that (the reverse is not true):
$$E(\varepsilon_i x_{jk}) = 0 \quad \forall i, j, k$$
or
$$\operatorname{Cov}(\varepsilon_i, X) = 0 \quad \forall i$$
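
The second implication follows from the same law of iterated expectations as the first one, because $x_{jk}$ is known once we condition on $X$. A short sketch of the argument, assuming the relevant moments exist:

```latex
\begin{aligned}
E(\varepsilon_i x_{jk}) &= E_X\!\left[E(\varepsilon_i x_{jk} \mid X)\right]
  = E_X\!\left[x_{jk}\, E(\varepsilon_i \mid X)\right] = E_X(0) = 0, \\
\operatorname{Cov}(\varepsilon_i, x_{jk}) &= E(\varepsilon_i x_{jk}) - E(\varepsilon_i)\, E(x_{jk})
  = 0 - 0 \cdot E(x_{jk}) = 0 .
\end{aligned}
```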

Definition (Assumption 4: Spherical disturbances)
The error terms are such that:
$$V(\varepsilon_i \mid X) = E(\varepsilon_i^2 \mid X) = \sigma^2 \quad \text{for all } i \in \{1, \dots, N\}$$
and
$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j \mid X) = E(\varepsilon_i \varepsilon_j \mid X) = 0 \quad \text{for all } i \neq j$$
The condition of constant variances is called homoscedasticity. The uncorrelatedness across observations is called nonautocorrelation.

Comments
1. Spherical disturbances = homoscedasticity + nonautocorrelation.
2. If the errors are not spherical, we call them nonspherical disturbances.
3. The assumption of homoscedasticity is a strong one: this is the exception rather than the rule!
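
Comments
Let us consider the (conditional) variance covariance matrix of the error terms:
$$\underset{N \times N}{V(\varepsilon \mid X)} = \underset{N \times N}{E(\varepsilon \varepsilon^\top \mid X)} = \begin{pmatrix} E(\varepsilon_1^2 \mid X) & E(\varepsilon_1 \varepsilon_2 \mid X) & \dots & E(\varepsilon_1 \varepsilon_j \mid X) & \dots & E(\varepsilon_1 \varepsilon_N \mid X) \\ E(\varepsilon_2 \varepsilon_1 \mid X) & E(\varepsilon_2^2 \mid X) & \dots & E(\varepsilon_2 \varepsilon_j \mid X) & \dots & E(\varepsilon_2 \varepsilon_N \mid X) \\ \vdots & \vdots & & \vdots & & \vdots \\ E(\varepsilon_i \varepsilon_1 \mid X) & \dots & \dots & E(\varepsilon_i \varepsilon_j \mid X) & \dots & E(\varepsilon_i \varepsilon_N \mid X) \\ \vdots & \vdots & & \vdots & & \vdots \\ E(\varepsilon_N \varepsilon_1 \mid X) & \dots & \dots & E(\varepsilon_N \varepsilon_j \mid X) & \dots & E(\varepsilon_N^2 \mid X) \end{pmatrix}$$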

Comments
The two assumptions (homoscedasticity and nonautocorrelation) imply that:
$$\underset{N \times N}{V(\varepsilon \mid X)} = \underset{N \times N}{E(\varepsilon \varepsilon^\top \mid X)} = \sigma^2 I_N = \begin{pmatrix} \sigma^2 & 0 & \dots & 0 \\ 0 & \sigma^2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sigma^2 \end{pmatrix}$$

Definition (Assumption 5: Data generation)
The data in $(x_{i1}, x_{i2}, \dots, x_{iK})$ may be any mixture of constants and random variables.

Comments
1. Analysis will be done conditionally on the observed $X$, so whether the elements in $X$ are fixed constants or random draws from a stochastic process will not influence the results.
2. In the case of stochastic regressors, the unconditional statistical properties of the estimator $\hat{\beta}$ are obtained in two steps: (1) using the result conditioned on $X$ and (2) finding the unconditional result by averaging (i.e., integrating over) the conditional distributions.

Comments
Assumptions regarding $(x_{i1}, x_{i2}, \dots, x_{iK}, y_i)$ for $i = 1, \dots, N$ are also required. This is a statement about how the sample is drawn.
In the sequel, we assume that $(x_{i1}, x_{i2}, \dots, x_{iK}, y_i)$ for $i = 1, \dots, N$ are independently and identically distributed (i.i.d.).
The observations are drawn by simple random sampling from a large population.

Definition (Assumption 6: Normal distribution)
The disturbances are normally distributed:
$$\varepsilon_i \mid X \sim \mathcal{N}(0, \sigma^2)$$
or equivalently
$$\varepsilon \mid X \sim \mathcal{N}(0_{N \times 1}, \sigma^2 I_N)$$

Comments
1. Once again, this is a convenience that we will dispense with after some analysis of its implications.
2. Normality is not necessary to obtain many of the results presented below.
3. Assumption 6 implies assumptions 3 (exogeneity) and 4 (spherical disturbances).

Summary
The main assumptions of the multiple linear regression model:
A1 (linearity): the model is linear in $\beta$
A2 (identification): $X$ is an $N \times K$ matrix with rank $K$
A3 (exogeneity): $E(\varepsilon \mid X) = 0_{N \times 1}$
A4 (spherical error terms): $V(\varepsilon \mid X) = \sigma^2 I_N$
A5 (data generation): $X$ may be fixed or random
A6 (normal distribution): $\varepsilon \mid X \sim \mathcal{N}(0_{N \times 1}, \sigma^2 I_N)$

Key Concepts
1. Simple linear regression model
2. Multiple linear regression model
3. Semi-parametric multiple linear regression model
4. Multiple linear Gaussian model
5. Assumptions of the multiple linear regression model
6. Linearity (A1), Identification (A2), Exogeneity (A3), Spherical error terms (A4), Data generation (A5) and Normal distribution (A6)

Section 3. The ordinary least squares estimator

Introduction
1. The multiple linear regression model assumes that the following specification is true in the population:
$$y = X\beta + \varepsilon$$
where other unobserved factors determining $y$ are captured by the error term $\varepsilon$.
2. Consider a sample $\{x_{i1}, x_{i2}, \dots, x_{iK}, y_i\}_{i=1}^{N}$ of i.i.d. random variables (note the change of notation here) and only one realization of this sample (your data set).
3. How to estimate the vector of parameters $\beta$?

Introduction (cont'd)
1. If we assume that assumptions A1-A6 hold, we have a multiple linear Gaussian model (parametric model), and a solution is to use the MLE. The ML estimator of $\beta$ coincides with the ordinary least squares (OLS) estimator (cf. chapter 2).
2. If we assume that only assumptions A1-A5 hold, we have a semi-parametric multiple linear regression model and the MLE is infeasible.
3. In this case, the natural solution is to use the ordinary least squares (OLS) estimator.

Intuition
Let us consider the simple linear regression model and, for simplicity, denote $x_i = x_{i2}$:
$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$$
The general idea of OLS consists in minimizing the distance between the points $(x_i, y_i)$ and the regression line
$$\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i$$
or the points $(x_i, \hat{y}_i)$ for all $i = 1, \dots, N$.

Intuition (cont'd)
Estimates of $\beta_1$ and $\beta_2$ are chosen by minimizing the sum of the squared residuals (SSR):
$$\sum_{i=1}^{N} \hat{\varepsilon}_i^2$$
This SSR can be written as:
$$\sum_{i=1}^{N} \hat{\varepsilon}_i^2 = \sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right)^2$$
Therefore, $\hat{\beta}_1$ and $\hat{\beta}_2$ are the solutions of the minimization problem
$$\left(\hat{\beta}_1, \hat{\beta}_2\right) = \arg\min_{(\beta_1, \beta_2)} \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_i)^2$$

Definition (OLS - simple linear regression model)
In the simple linear regression model $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$, the OLS estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ are the solutions of the minimization problem
$$\left(\hat{\beta}_1, \hat{\beta}_2\right) = \arg\min_{(\beta_1, \beta_2)} \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_i)^2$$
The solutions are:
$$\hat{\beta}_1 = \bar{y}_N - \hat{\beta}_2 \bar{x}_N$$
$$\hat{\beta}_2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x}_N)(y_i - \bar{y}_N)}{\sum_{i=1}^{N} (x_i - \bar{x}_N)^2}$$
where $\bar{y}_N = N^{-1} \sum_{i=1}^{N} y_i$ and $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ respectively denote the sample mean of the dependent variable $y$ and of the regressor $x$.
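
The two closed-form solutions translate directly into code. The sketch below applies them to a tiny made-up data set of five observations; the data and the function name `ols_simple` are hypothetical and serve only to illustrate the formulas.

```python
def ols_simple(x, y):
    """Return (b1, b2), the OLS intercept and slope of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sample covariance of (x, y) divided by sample variance of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    b1 = y_bar - b2 * x_bar
    return b1, b2


# Hypothetical data set (N = 5).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b1, b2 = ols_simple(x, y)
print(round(b1, 4), round(b2, 4))  # 0.05 1.99
```

For these numbers, $\bar{x}_N = 3$, $\bar{y}_N = 6.02$, the numerator of $\hat{\beta}_2$ is 19.9 and the denominator is 10, which gives $\hat{\beta}_2 = 1.99$ and $\hat{\beta}_1 = 6.02 - 1.99 \times 3 = 0.05$.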

Remark
The OLS estimator is a linear estimator (cf. chapter 1) since it can be expressed as a linear function of the observations $y_i$:
$$\hat{\beta}_2 = \sum_{i=1}^{N} \omega_i y_i$$
with
$$\omega_i = \frac{x_i - \bar{x}_N}{\sum_{i=1}^{N} (x_i - \bar{x}_N)^2}$$
This is immediate in the case where $\bar{y}_N = 0$; it also holds in general because $\sum_{i=1}^{N} (x_i - \bar{x}_N) \, \bar{y}_N = 0$.

Definition (Fitted value)
The predicted or fitted value for observation $i$ is:
$$\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i$$
with a sample mean equal to the sample average of the observations:
$$\bar{\hat{y}}_N = \frac{1}{N} \sum_{i=1}^{N} \hat{y}_i = \bar{y}_N = \frac{1}{N} \sum_{i=1}^{N} y_i$$

Definition (Fitted residual)
The residual for observation $i$ is:
$$\hat{\varepsilon}_i = y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i$$
with a sample mean equal to zero by construction:
$$\bar{\hat{\varepsilon}}_N = \frac{1}{N} \sum_{i=1}^{N} \hat{\varepsilon}_i = 0$$

Remarks
1. The fit of the regression is good if the sum $\sum_{i=1}^{N} \hat{\varepsilon}_i^2$ (or SSR) is small, i.e., the unexplained part of the variance of $y$ is small.
2. The coefficient of determination or $R^2$ is given by:
$$R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y}_N)^2}{\sum_{i=1}^{N} (y_i - \bar{y}_N)^2} = 1 - \frac{\sum_{i=1}^{N} \hat{\varepsilon}_i^2}{\sum_{i=1}^{N} (y_i - \bar{y}_N)^2}$$
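
The fitted values, the residuals and the two expressions of $R^2$ can be computed in a few lines. The sketch below uses a made-up sample of five observations (hypothetical data, plain Python) and checks that both expressions of $R^2$ agree when the model includes a constant.

```python
# Hypothetical data set (N = 5).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
N = len(x)

x_bar = sum(x) / N
y_bar = sum(y) / N

# OLS slope and intercept of the simple linear regression model.
b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b1 = y_bar - b2 * x_bar

y_hat = [b1 + b2 * xi for xi in x]             # fitted values
resid = [yi - yh for yi, yh in zip(y, y_hat)]  # residuals

ssr = sum(e ** 2 for e in resid)                 # sum of squared residuals
sst = sum((yi - y_bar) ** 2 for yi in y)         # total sum of squares
ess = sum((yh - y_bar) ** 2 for yh in y_hat)     # explained sum of squares

r2_explained = ess / sst       # first expression of R^2
r2_residual = 1.0 - ssr / sst  # second expression of R^2
print(round(r2_explained, 4), round(r2_residual, 4))
```

In this example the residuals sum to zero and the mean of the fitted values equals $\bar{y}_N$, in line with the two preceding definitions.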

Orthogonality conditions
Under assumption A3 (strict exogeneity), we have $E(\varepsilon_i \mid x_i) = 0$. This condition implies that:
$$E(\varepsilon_i) = 0 \qquad E(\varepsilon_i x_i) = 0$$
Using the sample analogs of these moment conditions (cf. chapter 6, GMM), one has:
$$\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0$$
$$\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) x_i = 0$$
This is a system of two equations with two unknowns, $\hat{\beta}_1$ and $\hat{\beta}_2$.

Definition (Orthogonality conditions)
The ordinary least squares estimator can be defined from the two sample analogs of the following moment conditions:
$$E(\varepsilon_i) = 0 \qquad E(\varepsilon_i x_i) = 0$$
The corresponding system of equations is just-identified.
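
Solving the two sample moment conditions explicitly gives back the closed-form solutions of the minimization problem. A short sketch of the algebra, where the last equality uses $\sum_{i}(y_i - \bar{y}_N) = 0$ and $\sum_{i}(x_i - \bar{x}_N) = 0$:

```latex
\begin{aligned}
\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat\beta_1 - \hat\beta_2 x_i\right) = 0
  &\;\Longrightarrow\; \hat\beta_1 = \bar y_N - \hat\beta_2 \bar x_N, \\
\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat\beta_1 - \hat\beta_2 x_i\right) x_i = 0
  &\;\Longrightarrow\; \sum_{i=1}^{N} x_i \left[(y_i - \bar y_N) - \hat\beta_2 (x_i - \bar x_N)\right] = 0 \\
  &\;\Longrightarrow\; \hat\beta_2
   = \frac{\sum_{i=1}^{N} x_i (y_i - \bar y_N)}{\sum_{i=1}^{N} x_i (x_i - \bar x_N)}
   = \frac{\sum_{i=1}^{N} (x_i - \bar x_N)(y_i - \bar y_N)}{\sum_{i=1}^{N} (x_i - \bar x_N)^2}.
\end{aligned}
```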

OLS and multiple linear regression model
Now consider the multiple linear regression model
$$y = X\beta + \varepsilon$$
or
$$y_i = \sum_{k=1}^{K} \beta_k x_{ik} + \varepsilon_i$$
Objective: find an estimator (estimate) of $\beta_1, \beta_2, \dots, \beta_K$ and $\sigma^2$ under the assumptions A1-A5.

OLS and multiple linear regression model
Different methods:
1. Minimize the sum of squared residuals (SSR).
2. Solve the same minimization problem with matrix notation.
3. Use moment conditions.
4. Geometrical interpretation.

1. Minimize the sum of squared residuals (SSR):
As in the simple linear regression,
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} \varepsilon_i^2 = \arg\min_{\beta} \sum_{i=1}^{N} \left(y_i - \sum_{k=1}^{K} \beta_k x_{ik}\right)^2$$
One can derive the first order conditions with respect to $\beta_k$ for $k = 1, \dots, K$ and solve a system of $K$ equations with $K$ unknowns.

2. Using matrix notations:
Definition (OLS and multiple linear regression model)
In the multiple linear regression model $y_i = x_i^\top \beta + \varepsilon_i$, with $x_i = (x_{i1}, \dots, x_{iK})^\top$, the OLS estimator $\hat{\beta}$ is the solution of
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} \left(y_i - x_i^\top \beta\right)^2$$
The OLS estimator of $\beta$ is:
$$\hat{\beta} = \left(\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\sum_{i=1}^{N} x_i y_i\right)$$

2. Using matrix notations:
Definition (Normal equations)
Under suitable regularity conditions, in the multiple linear regression model $y_i = x_i^\top \beta + \varepsilon_i$, with $x_i = (x_{i1} : \dots : x_{iK})^\top$, the normal equations are
$$\sum_{i=1}^{N} x_i \left(y_i - x_i^\top \hat{\beta}\right) = 0_{K \times 1}$$

2. Using matrix notations:
Definition (OLS and multiple linear regression model)
In the multiple linear regression model $y = X\beta + \varepsilon$, the OLS estimator $\hat{\beta}$ is the solution of the minimization problem
$$\hat{\beta} = \arg\min_{\beta} \varepsilon^\top \varepsilon = \arg\min_{\beta} (y - X\beta)^\top (y - X\beta)$$
The OLS estimator of $\beta$ is:
$$\hat{\beta} = \left(X^\top X\right)^{-1} X^\top y$$
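
A minimal numerical sketch of the matrix formula is given below, on simulated data. The sample size, the parameter values, the noise level and the seed are assumptions made for the illustration. The normal equations $X^\top X \hat{\beta} = X^\top y$ are solved with `numpy.linalg.solve` rather than by forming the inverse explicitly, and the result is compared with NumPy's least squares routine `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only

N = 200
beta_true = np.array([1.0, 2.0, -0.5])  # hypothetical parameter vector

# Design matrix with a constant and two simulated regressors.
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ beta_true + rng.normal(scale=0.1, size=N)

# OLS via the normal equations: solve (X'X) b = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same estimate from NumPy's least squares routine, used as a cross-check.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# The normal equations X'(y - X beta_hat) = 0 hold up to rounding error.
normal_eq = X.T @ (y - X @ beta_hat)
print(beta_hat, normal_eq)
```

Solving the linear system is numerically preferable to computing $(X^\top X)^{-1}$ and multiplying, although both correspond to the same formula.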

2. Using matrix notations:
Definition
The ordinary least squares estimator $\hat{\beta}$ of $\beta$ minimizes the following criterion
$$s(\beta) = \|y - X\beta\|^2_{I_N} = (y - X\beta)^\top (y - X\beta)$$

2. Using matrix notations:
The FOC (normal equations) are defined by:
$$\left.\frac{\partial s(\beta)}{\partial \beta}\right|_{\hat{\beta}} = -2 \underset{K \times N}{X^\top} \underset{N \times 1}{\left(y - X\hat{\beta}\right)} = 0_{K \times 1}$$
The second-order conditions hold:
$$\left.\frac{\partial^2 s(\beta)}{\partial \beta \, \partial \beta^\top}\right|_{\hat{\beta}} = 2 \underset{K \times K}{X^\top X} \text{ is positive definite}$$
since $X^\top X$ is a positive definite matrix when $X$ has full column rank (assumption A2). We have a minimum.
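
The first-order condition can be obtained by expanding the criterion and differentiating term by term. A sketch of the steps, where the last equivalence uses the invertibility of $X^\top X$ (assumption A2):

```latex
\begin{aligned}
s(\beta) &= (y - X\beta)^\top (y - X\beta)
          = y^\top y - 2\beta^\top X^\top y + \beta^\top X^\top X \beta, \\
\frac{\partial s(\beta)}{\partial \beta} &= -2X^\top y + 2X^\top X\beta = -2X^\top (y - X\beta), \\
\left.\frac{\partial s(\beta)}{\partial \beta}\right|_{\hat\beta} = 0_{K \times 1}
  &\;\Longleftrightarrow\; X^\top X \hat\beta = X^\top y
  \;\Longleftrightarrow\; \hat\beta = \left(X^\top X\right)^{-1} X^\top y .
\end{aligned}
```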

2. Using matrix notations:
Definition (Normal equations)
Under suitable regularity conditions, in the multiple linear regression model $y = X\beta + \varepsilon$, the normal equations are given by:
$$\underset{K \times N}{X^\top} \underset{N \times 1}{\left(y - X\hat{\beta}\right)} = 0_{K \times 1}$$

Definition (Unbiased variance estimator)
In the multiple linear regression model $y = X\beta + \varepsilon$, the unbiased estimator of $\sigma^2$ is given by:
$$\hat{\sigma}^2 = \frac{1}{N - K} \sum_{i=1}^{N} \hat{\varepsilon}_i^2 = \frac{SSR}{N - K}$$

2. Using matrix notations:
The estimator $\hat{\sigma}^2$ can also be written as:
$$\hat{\sigma}^2 = \frac{1}{N - K} \sum_{i=1}^{N} \left(y_i - x_i^\top \hat{\beta}\right)^2$$
$$\hat{\sigma}^2 = \frac{\left(y - X\hat{\beta}\right)^\top \left(y - X\hat{\beta}\right)}{N - K}$$
$$\hat{\sigma}^2 = \frac{\left\|y - X\hat{\beta}\right\|^2_{I_N}}{N - K}$$
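
As an illustration of the degrees-of-freedom correction, the sketch below computes $\hat{\sigma}^2 = SSR / (N - K)$ with NumPy on a made-up sample of five observations and two regressors (a constant and one variable). The data are hypothetical.

```python
import numpy as np

# Hypothetical data set: N = 5 observations, K = 2 regressors (constant and x).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])
N, K = X.shape

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # OLS estimate
resid = y - X @ beta_hat                      # residual vector
ssr = resid @ resid                           # sum of squared residuals

sigma2_hat = ssr / (N - K)  # divides by N - K, not by N
print(ssr, sigma2_hat)
```

Here the SSR equals 0.107, so the estimate is $0.107 / 3 \approx 0.0357$, whereas dividing by $N = 5$ would give the smaller value 0.0214.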

3. Using moment conditions:
Under assumption A3 (strict exogeneity), we have $E(\varepsilon \mid X) = 0$. This condition implies:
$$E(\varepsilon_i x_i) = 0_{K \times 1}$$
with $x_i = (x_{i1} : \dots : x_{iK})^\top$. Using the sample analogs, one has:
$$\frac{1}{N} \sum_{i=1}^{N} x_i \left(y_i - x_i^\top \hat{\beta}\right) = 0_{K \times 1}$$
We have $K$ (normal) equations with $K$ unknown parameters $\hat{\beta}_1, \dots, \hat{\beta}_K$. The system is just-identified.
81 3. The ordinary least squares estimator
4. Geometric interpretation:
1. The ordinary least squares estimation method consists in determining the adjusted vector $\widehat{y}$ that is the closest to $y$ (in a certain space...), in the sense that the squared norm of the difference between $y$ and $\widehat{y}$ is minimized.
2. Finding $\widehat{y}$ is equivalent to finding an estimator of $\beta$.
82 3. The ordinary least squares estimator
4. Geometric interpretation:
Definition (Geometric interpretation): The adjusted vector $\widehat{y}$ is the (orthogonal) projection of $y$ onto the column space of $X$. The vector of fitted error terms (residuals) $\widehat{\varepsilon}$ is the projection of $y$ onto the orthogonal complement of the column space of $X$. The vectors $\widehat{y}$ and $\widehat{\varepsilon}$ are orthogonal.
83 3. The ordinary least squares estimator
4. Geometric interpretation:
[Figure: geometric interpretation of the OLS projection. Source: F. Pelgrin (2010), Lecture notes, Advanced Econometrics]
84 3. The ordinary least squares estimator
4. Geometric interpretation:
Definition (Projection matrices): The vectors $\widehat{y}$ and $\widehat{\varepsilon}$ are defined to be:
$$\widehat{y} = P\,y \qquad \widehat{\varepsilon} = M\,y$$
where $P$ and $M$ denote the two following projection matrices:
$$P = X (X^\top X)^{-1} X^\top$$
$$M = I_N - P = I_N - X (X^\top X)^{-1} X^\top$$
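The orthogonality of $\widehat{y}$ and $\widehat{\varepsilon}$ follows directly from the algebra of $P$ and $M$. The short derivation below is a sketch using only the two definitions above; it is a standard argument and is not taken from the slides.

```latex
\begin{aligned}
P^\top &= P, \qquad P P = X (X^\top X)^{-1} X^\top X (X^\top X)^{-1} X^\top = P, \\
M^\top &= M, \qquad M M = (I_N - P)(I_N - P) = I_N - 2P + P P = I_N - P = M, \\
P M &= P (I_N - P) = P - P P = 0
\;\Longrightarrow\;
\widehat{y}^\top \widehat{\varepsilon} = y^\top P^\top M\, y = y^\top P M\, y = 0 .
\end{aligned}
```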
85 3. The ordinary least squares estimator
Other geometric interpretations: Suppose that there is a constant term in the model.
1. The least squares residuals sum to zero: $\sum_{i=1}^{N} \widehat{\varepsilon}_i = 0$
2. The regression hyperplane passes through the point of means of the data $(\bar{x}_N, \bar{y}_N)$.
3. The mean of the fitted (adjusted) values of $y$ equals the mean of the actual values of $y$: $\bar{\widehat{y}}_N = \bar{y}_N$
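These three properties are easy to check numerically. The sketch below uses a small made-up data set (the values of `x` and `y` are hypothetical, not taken from the course files) and the closed-form solution of the normal equations for a model with a constant and a single regressor.

```python
# Illustrative data (hypothetical values, not from the course files).
x = [1.0, 2.0, 4.0, 7.0, 8.0]
y = [2.1, 2.9, 5.2, 7.8, 9.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS for y_i = b0 + b1 * x_i + e_i: closed form of the normal equations (K = 2).
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

sum_resid = sum(residuals)        # property 1: zero up to rounding error
line_at_mean = b0 + b1 * x_bar    # property 2: equals y_bar
mean_fitted = sum(fitted) / n     # property 3: equals y_bar

print(sum_resid, line_at_mean - y_bar, mean_fitted - y_bar)
```

All three printed numbers should be zero up to floating-point rounding. Dropping the constant term (forcing `b0 = 0`) generally breaks the three properties.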
86 3. The ordinary least squares estimator
Definition (Coefficient of determination): The coefficient of determination of the multiple linear regression model (with a constant term) is the ratio of the (empirical) variance explained by the model to the total (empirical) variance of $y$:
$$R^2 = \frac{\sum_{i=1}^{N} (\widehat{y}_i - \bar{y}_N)^2}{\sum_{i=1}^{N} (y_i - \bar{y}_N)^2} = 1 - \frac{\sum_{i=1}^{N} \widehat{\varepsilon}_i^2}{\sum_{i=1}^{N} (y_i - \bar{y}_N)^2}$$
87 3. The ordinary least squares estimator
Remark
1. The coefficient of determination measures the proportion of the total variance (or variability) in $y$ that is accounted for by variation in the regressors (or the model).
2. Problem: the $R^2$ automatically and spuriously increases (more precisely, it never decreases) when extra explanatory variables are added to the model.
88 3. The ordinary least squares estimator
Definition (Adjusted R-squared): The adjusted R-squared coefficient is defined to be:
$$\bar{R}^2 = 1 - \frac{N-1}{N-p-1}\left(1 - R^2\right)$$
where $p$ denotes the number of regressors (not counting the constant term, i.e., $p = K - 1$ if there is a constant or $p = K$ otherwise).
89 3. The ordinary least squares estimator
Remark: One can show that
1. $\bar{R}^2 \le R^2$ (with strict inequality when $p \ge 1$ and $R^2 < 1$)
2. if $N$ is large, $\bar{R}^2 \simeq R^2$
3. The adjusted R-squared $\bar{R}^2$ can be negative.
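The two definitions translate into a few lines of code. The helper names `r_squared` and `adjusted_r_squared` below are hypothetical, and the numbers in the example are made up to illustrate that the adjusted coefficient can be negative when the fit is weak and $N$ is small.

```python
def r_squared(y, fitted):
    """R^2 = 1 - SSR / SST, for a model that includes a constant term."""
    y_bar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ssr / sst


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 with p regressors (the constant term is not counted)."""
    return 1.0 - (n - 1) / (n - p - 1) * (1.0 - r2)


# Hypothetical example: weak fit, few observations, three regressors.
r2 = 0.10
adj = adjusted_r_squared(r2, n=10, p=3)
print(r2, adj)  # the adjusted value is negative here

# With many observations the correction becomes negligible.
print(adjusted_r_squared(0.5, n=100000, p=3))
```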
90 3. The ordinary least squares estimator
Key Concepts
1. OLS estimator and estimate
2. Fitted or predicted value
3. Residual or fitted residual
4. Orthogonality conditions
5. Normal equations
6. Geometric interpretations of the OLS
7. Coefficient of determination and adjusted R-squared
91 Section 4 Statistical properties of the OLS estimator
92 4. Statistical properties of the OLS estimator
In order to study the statistical properties of the OLS estimator, we have to distinguish (cf. chapter 1):
1. The finite sample properties
2. The large sample or asymptotic properties
93 4. Statistical properties of the OLS estimator
We also have to distinguish the properties according to the assumptions made on the linear regression model:
1. Semi-parametric linear regression model (the exact distribution of $\varepsilon$ is unknown) versus parametric linear regression model (and especially the Gaussian linear regression model, assumption A6).
2. $X$ is a matrix of random regressors versus $X$ is a matrix of fixed regressors.
94 4. Statistical properties of the OLS estimator
Fact (Assumptions): In the rest of this section, we assume that assumptions A1-A5 hold.
A1: linearity. The model is linear in $\beta$
A2: identification. $X$ is an $N \times K$ matrix with rank $K$
A3: exogeneity. $E(\varepsilon \mid X) = 0_{N \times 1}$
A4: spherical error terms. $V(\varepsilon \mid X) = \sigma^2 I_N$
A5: data generation. $X$ may be fixed or random
95 Subsection 4.1. Finite sample properties of the OLS estimator
96 4.1. Finite sample properties
Objectives: The objectives of this subsection are the following:
1. Compute the first two moments of the (unknown) finite sample distribution of the OLS estimators $\widehat{\beta}$ and $\widehat{\sigma}^2$.
2. Determine the finite sample distribution of the OLS estimators $\widehat{\beta}$ and $\widehat{\sigma}^2$ under particular assumptions (A6).
3. Determine if the OLS estimators are "good": efficient estimator versus BLUE.
4. Introduce the Gauss-Markov theorem.
97 4.1. Finite sample properties First moments of the OLS estimators
98 4.1. Finite sample properties
Moments: In a first step, we will derive the first moments of the OLS estimators.
1. Step 1: compute $E(\widehat{\beta})$ and $V(\widehat{\beta})$
2. Step 2: compute $E(\widehat{\sigma}^2)$ and $V(\widehat{\sigma}^2)$
99 4.1. Finite sample properties
Definition (Unbiased estimator): In the multiple linear regression model $y = X\beta_0 + \varepsilon$, under assumption A3 (strict exogeneity), the OLS estimator $\widehat{\beta}$ is unbiased:
$$E(\widehat{\beta}) = \beta_0$$
where $\beta_0$ denotes the true value of the vector of parameters. This result holds whether or not the matrix $X$ is considered as random.
100 4.1. Finite sample properties
Proof. Case 1: fixed regressors (cf. chapter 1)
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y = \beta_0 + (X^\top X)^{-1} X^\top \varepsilon$$
So, if $X$ is a matrix of fixed regressors:
$$E(\widehat{\beta}) = \beta_0 + (X^\top X)^{-1} X^\top E(\varepsilon)$$
Under assumption A3 (exogeneity), $E(\varepsilon \mid X) = E(\varepsilon) = 0$. Then, we get:
$$E(\widehat{\beta}) = \beta_0$$
The OLS estimator is unbiased.
101 4.1. Finite sample properties
Proof (cont'd). Case 2: random regressors
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y = \beta_0 + (X^\top X)^{-1} X^\top \varepsilon$$
If $X$ includes some random elements:
$$E(\widehat{\beta} \mid X) = \beta_0 + (X^\top X)^{-1} X^\top E(\varepsilon \mid X)$$
Under assumption A3 (exogeneity), $E(\varepsilon \mid X) = 0$. Then, we get:
$$E(\widehat{\beta} \mid X) = \beta_0$$
102 4.1. Finite sample properties
Proof (cont'd). Case 2: random regressors
The OLS estimator $\widehat{\beta}$ is conditionally unbiased:
$$E(\widehat{\beta} \mid X) = \beta_0$$
Besides, we have:
$$E(\widehat{\beta}) = E_X\left(E(\widehat{\beta} \mid X)\right) = E_X(\beta_0) = \beta_0$$
where $E_X$ denotes the expectation with respect to the distribution of $X$. So, the OLS estimator $\widehat{\beta}$ is unbiased:
$$E(\widehat{\beta}) = \beta_0$$
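Unbiasedness can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed settings: a hypothetical fixed design, the true model $y_i = 1 + 2x_i + \varepsilon_i$, Gaussian errors with $\sigma = 1$, and 5000 replications. None of these values come from the slides.

```python
import random

random.seed(0)

# Assumed design (hypothetical): fixed regressor, true model y = 1 + 2 x + eps.
beta0_true, beta1_true, sigma = 1.0, 2.0, 1.0
x = [i / 10 for i in range(20)]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def ols_slope(y):
    """Slope of the OLS fit of y on a constant and the fixed regressor x."""
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx


replications = 5000
slopes = []
for _ in range(replications):
    # Draw a new error vector; the regressors stay fixed across replications.
    y = [beta0_true + beta1_true * xi + random.gauss(0.0, sigma) for xi in x]
    slopes.append(ols_slope(y))

mean_slope = sum(slopes) / replications
print(mean_slope)  # should be close to the true slope, 2
```

Individual estimates vary from sample to sample, but their average over replications settles near the true value, which is what $E(\widehat{\beta}) = \beta_0$ states.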
103 4.1. Finite sample properties
Definition (Variance of the OLS estimator, non-stochastic regressors): In the multiple linear regression model $y = X\beta + \varepsilon$, if the matrix $X$ is non-stochastic, the unconditional variance covariance matrix of the OLS estimator $\widehat{\beta}$ is:
$$V(\widehat{\beta}) = \sigma^2 (X^\top X)^{-1}$$
104 4.1. Finite sample properties
Proof
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y = \beta_0 + (X^\top X)^{-1} X^\top \varepsilon$$
So, if $X$ is a matrix of fixed regressors:
$$V(\widehat{\beta}) = E\left[(\widehat{\beta} - \beta_0)(\widehat{\beta} - \beta_0)^\top\right] = E\left[(X^\top X)^{-1} X^\top \varepsilon \varepsilon^\top X (X^\top X)^{-1}\right] = (X^\top X)^{-1} X^\top E(\varepsilon \varepsilon^\top)\, X (X^\top X)^{-1}$$
105 4.1. Finite sample properties
Proof (cont'd): Under assumption A4 (spherical disturbances), we have:
$$V(\varepsilon) = E(\varepsilon \varepsilon^\top) = \sigma^2 I_N$$
The variance covariance matrix of the OLS estimator is then:
$$V(\widehat{\beta}) = (X^\top X)^{-1} X^\top E(\varepsilon \varepsilon^\top)\, X (X^\top X)^{-1} = (X^\top X)^{-1} X^\top \sigma^2 I_N X (X^\top X)^{-1} = \sigma^2 (X^\top X)^{-1} X^\top X (X^\top X)^{-1} = \sigma^2 (X^\top X)^{-1}$$
106 4.1. Finite sample properties
Definition (Variance of the OLS estimator, stochastic regressors): In the multiple linear regression model $y = X\beta_0 + \varepsilon$, if the matrix $X$ is stochastic, the conditional variance covariance matrix of the OLS estimator $\widehat{\beta}$ is:
$$V(\widehat{\beta} \mid X) = \sigma^2 (X^\top X)^{-1}$$
The unconditional variance covariance matrix is equal to:
$$V(\widehat{\beta}) = \sigma^2 E_X\left[(X^\top X)^{-1}\right]$$
where $E_X$ denotes the expectation with respect to the distribution of $X$.
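107 4.1. Finite sample properties
Proof
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y = \beta_0 + (X^\top X)^{-1} X^\top \varepsilon$$
So, if $X$ is a stochastic matrix:
$$V(\widehat{\beta} \mid X) = E\left[(\widehat{\beta} - \beta_0)(\widehat{\beta} - \beta_0)^\top \mid X\right] = E\left[(X^\top X)^{-1} X^\top \varepsilon \varepsilon^\top X (X^\top X)^{-1} \mid X\right] = (X^\top X)^{-1} X^\top E(\varepsilon \varepsilon^\top \mid X)\, X (X^\top X)^{-1}$$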
108 4.1. Finite sample properties
Proof (cont'd): Under assumption A4 (spherical disturbances), we have:
$$V(\varepsilon \mid X) = E(\varepsilon \varepsilon^\top \mid X) = \sigma^2 I_N$$
The conditional variance covariance matrix of the OLS estimator is then:
$$V(\widehat{\beta} \mid X) = (X^\top X)^{-1} X^\top E(\varepsilon \varepsilon^\top \mid X)\, X (X^\top X)^{-1} = (X^\top X)^{-1} X^\top \sigma^2 I_N X (X^\top X)^{-1} = \sigma^2 (X^\top X)^{-1}$$
109 4.1. Finite sample properties
Proof (cont'd): We have:
$$V(\widehat{\beta} \mid X) = \sigma^2 (X^\top X)^{-1}$$
Since $E(\widehat{\beta} \mid X) = \beta_0$ does not depend on $X$, the law of total variance reduces to its first term, and the (unconditional) variance covariance matrix of the OLS estimator is:
$$V(\widehat{\beta}) = E_X\left(V(\widehat{\beta} \mid X)\right) = \sigma^2 E_X\left[(X^\top X)^{-1}\right]$$
where $E_X$ denotes the expectation with respect to the distribution of $X$.
110 4.1. Finite sample properties
Summary
Case 1: $X$ stochastic
  Mean: $E(\widehat{\beta}) = \beta_0$
  Variance: $V(\widehat{\beta}) = \sigma^2 E_X\left[(X^\top X)^{-1}\right]$
  Cond. mean: $E(\widehat{\beta} \mid X) = \beta_0$
  Cond. var: $V(\widehat{\beta} \mid X) = \sigma^2 (X^\top X)^{-1}$
Case 2: $X$ nonstochastic
  Mean: $E(\widehat{\beta}) = \beta_0$
  Variance: $V(\widehat{\beta}) = \sigma^2 (X^\top X)^{-1}$
111 4.1. Finite sample properties
Question: How to estimate the variance covariance matrix of the OLS estimator?
$$V(\widehat{\beta}_{OLS}) = \sigma^2 (X^\top X)^{-1} \quad \text{if } X \text{ is nonstochastic}$$
$$V(\widehat{\beta}_{OLS}) = \sigma^2 E_X\left[(X^\top X)^{-1}\right] \quad \text{if } X \text{ is stochastic}$$
112 4.1. Finite sample properties
Question (cont'd)
Definition (Variance estimator): An unbiased estimator of the variance covariance matrix of the OLS estimator is given by:
$$\widehat{V}(\widehat{\beta}_{OLS}) = \widehat{\sigma}^2 (X^\top X)^{-1}$$
where $\widehat{\sigma}^2 = (N-K)^{-1}\, \widehat{\varepsilon}^\top \widehat{\varepsilon}$ is an unbiased estimator of $\sigma^2$. This result holds whether $X$ is stochastic or nonstochastic.
113 4.1. Finite sample properties
Summary
Case 1: $X$ stochastic
  Variance: $V(\widehat{\beta}) = \sigma^2 E_X\left[(X^\top X)^{-1}\right]$
  Estimator: $\widehat{V}(\widehat{\beta}_{OLS}) = \widehat{\sigma}^2 (X^\top X)^{-1}$
Case 2: $X$ nonstochastic
  Variance: $V(\widehat{\beta}) = \sigma^2 (X^\top X)^{-1}$
  Estimator: $\widehat{V}(\widehat{\beta}_{OLS}) = \widehat{\sigma}^2 (X^\top X)^{-1}$
114 4.1. Finite sample properties
Definition (Estimator of the variance of disturbances): Under assumptions A1-A5, in the multiple linear regression model $y = X\beta + \varepsilon$, the estimator $\widehat{\sigma}^2$ is unbiased:
$$E(\widehat{\sigma}^2) = \sigma^2$$
where
$$\widehat{\sigma}^2 = \frac{1}{N-K} \sum_{i=1}^{N} \widehat{\varepsilon}_i^2 = \frac{\widehat{\varepsilon}^\top \widehat{\varepsilon}}{N-K}$$
This result holds whether or not the matrix $X$ is considered as random.
115 4.1. Finite sample properties
Proof: We assume that $X$ is stochastic. Let $M$ denote the projection matrix ("residual maker") defined by:
$$M = I_N - X (X^\top X)^{-1} X^\top$$
with
$$\underbrace{\widehat{\varepsilon}}_{(N,1)} = \underbrace{M}_{(N,N)}\,\underbrace{y}_{(N,1)}$$
The $N \times N$ matrix $M$ satisfies the following properties:
1. If $X$ is regressed on $X$, a perfect fit will result and the residuals will be zero, so $MX = 0$.
2. The matrix $M$ is symmetric, $M^\top = M$, and idempotent, $MM = M$.
116 4.1. Finite sample properties
Proof (cont'd): The residuals are defined to be:
$$\widehat{\varepsilon} = M y$$
Since $y = X\beta + \varepsilon$, we have
$$\widehat{\varepsilon} = M(X\beta + \varepsilon) = MX\beta + M\varepsilon$$
Since $MX = 0$, we have
$$\widehat{\varepsilon} = M\varepsilon$$
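117 4.1. Finite sample properties
Proof (cont'd): The estimator $\widehat{\sigma}^2$ is based on the sum of squared residuals (SSR). Since $M$ is symmetric and idempotent, $\widehat{\varepsilon}^\top \widehat{\varepsilon} = \varepsilon^\top M^\top M \varepsilon = \varepsilon^\top M \varepsilon$, so:
$$\widehat{\sigma}^2 = \frac{\widehat{\varepsilon}^\top \widehat{\varepsilon}}{N-K} = \frac{\varepsilon^\top M \varepsilon}{N-K}$$
The expected value of the SSR is
$$E(\widehat{\varepsilon}^\top \widehat{\varepsilon} \mid X) = E(\varepsilon^\top M \varepsilon \mid X)$$
The quantity $\varepsilon^\top M \varepsilon$ is a $1 \times 1$ scalar, so it is equal to its trace:
$$\mathrm{tr}\left(E(\varepsilon^\top M \varepsilon \mid X)\right) = \mathrm{tr}\left(E(\varepsilon \varepsilon^\top M \mid X)\right) = \mathrm{tr}\left(E(M \varepsilon \varepsilon^\top \mid X)\right)$$
since $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.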
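118 4.1. Finite sample properties
Proof (cont'd): Since $M = I_N - X (X^\top X)^{-1} X^\top$ depends only on $X$, we have:
$$E(\widehat{\varepsilon}^\top \widehat{\varepsilon} \mid X) = \mathrm{tr}\left(E(M \varepsilon \varepsilon^\top \mid X)\right) = \mathrm{tr}\left(M\, E(\varepsilon \varepsilon^\top \mid X)\right)$$
Under assumptions A3 and A4, we have
$$E(\varepsilon \varepsilon^\top \mid X) = \sigma^2 I_N$$
As a consequence
$$E(\widehat{\varepsilon}^\top \widehat{\varepsilon} \mid X) = \mathrm{tr}\left(\sigma^2 M I_N\right) = \sigma^2\, \mathrm{tr}(M)$$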
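119 4.1. Finite sample properties
Proof (cont'd)
$$\begin{aligned} E(\widehat{\varepsilon}^\top \widehat{\varepsilon} \mid X) &= \sigma^2\, \mathrm{tr}(M) \\ &= \sigma^2\, \mathrm{tr}\left(I_N - X (X^\top X)^{-1} X^\top\right) \\ &= \sigma^2\, \mathrm{tr}(I_N) - \sigma^2\, \mathrm{tr}\left(X (X^\top X)^{-1} X^\top\right) \\ &= \sigma^2\, \mathrm{tr}(I_N) - \sigma^2\, \mathrm{tr}\left((X^\top X)^{-1} X^\top X\right) \\ &= \sigma^2\, \mathrm{tr}(I_N) - \sigma^2\, \mathrm{tr}(I_K) \\ &= \sigma^2 (N-K) \end{aligned}$$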
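The identity $\mathrm{tr}(M) = N - K$ can be checked numerically in the simplest case. The sketch below assumes a model with a constant and one regressor ($K = 2$), for which the diagonal elements of $P = X(X^\top X)^{-1}X^\top$ have the well-known closed form $1/N + (x_i - \bar{x})^2 / S_{xx}$. The data values are hypothetical.

```python
x = [0.5, 1.0, 2.5, 4.0, 4.5, 7.0]   # hypothetical regressor values
n, k = len(x), 2                      # K = 2: constant term + one regressor

x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Diagonal of P = X (X'X)^{-1} X' in this special case (constant + one regressor).
p_diag = [1.0 / n + (xi - x_bar) ** 2 / s_xx for xi in x]
# Diagonal of M = I_N - P.
m_diag = [1.0 - h for h in p_diag]

trace_p = sum(p_diag)
trace_m = sum(m_diag)
print(trace_p, trace_m)  # approximately K and N - K
```

Only the diagonal is needed because the trace ignores off-diagonal terms; this is also why the degrees-of-freedom correction in $\widehat{\sigma}^2$ is $N - K$ rather than $N$.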
120 4.1. Finite sample properties
Proof (cont'd): By definition of $\widehat{\sigma}^2$, we have:
$$E(\widehat{\sigma}^2 \mid X) = \frac{E(\widehat{\varepsilon}^\top \widehat{\varepsilon} \mid X)}{N-K} = \frac{\sigma^2 (N-K)}{N-K} = \sigma^2$$
So, the estimator $\widehat{\sigma}^2$ is conditionally unbiased. Besides:
$$E(\widehat{\sigma}^2) = E_X\left(E(\widehat{\sigma}^2 \mid X)\right) = E_X(\sigma^2) = \sigma^2$$
The estimator $\widehat{\sigma}^2$ is unbiased:
$$E(\widehat{\sigma}^2) = \sigma^2$$
121 4.1. Finite sample properties
Remark: Following the same principle, we can compute the variance of the estimator $\widehat{\sigma}^2$:
$$\widehat{\sigma}^2 = \frac{\widehat{\varepsilon}^\top \widehat{\varepsilon}}{N-K} = \frac{\varepsilon^\top M \varepsilon}{N-K}$$
As a consequence, we have:
$$V(\widehat{\sigma}^2 \mid X) = \frac{1}{(N-K)^2}\, V(\varepsilon^\top M \varepsilon \mid X) \qquad V(\widehat{\sigma}^2) = E_X\left(V(\widehat{\sigma}^2 \mid X)\right)$$
But, it takes... at least ten slides...
122 4.1. Finite sample properties
Definition (Variance of the estimator $\widehat{\sigma}^2$): In the multiple linear regression model $y = X\beta_0 + \varepsilon$ with normally distributed errors (assumption A6), the variance of the estimator $\widehat{\sigma}^2$ is
$$V(\widehat{\sigma}^2) = \frac{2\sigma^4}{N-K}$$
where $\sigma^2$ denotes the true value of the variance of the error terms. This result holds whether or not the matrix $X$ is considered as random.
123 4.1. Finite sample properties
Summary
Case 1: $X$ stochastic
  Mean: $E(\widehat{\sigma}^2) = \sigma^2$
  Variance: $V(\widehat{\sigma}^2) = \dfrac{2\sigma^4}{N-K}$
  Cond. mean: $E(\widehat{\sigma}^2 \mid X) = \sigma^2$
  Cond. var: $V(\widehat{\sigma}^2 \mid X) = \dfrac{2\sigma^4}{N-K}$
Case 2: $X$ nonstochastic
  Mean: $E(\widehat{\sigma}^2) = \sigma^2$
  Variance: $V(\widehat{\sigma}^2) = \dfrac{2\sigma^4}{N-K}$
124 4.1. Finite sample properties Finite sample distributions
125 4.1. Finite sample properties
Summary
1. Under assumptions A1-A5, we can derive the first two moments of the (unknown) finite sample distribution of the OLS estimator $\widehat{\beta}$, i.e. $E(\widehat{\beta})$ and $V(\widehat{\beta})$.
2. Are we able to characterize the finite sample distribution (or exact sampling distribution) of $\widehat{\beta}$?
3. For that, we need to make an assumption on the distribution of $\varepsilon$ and use a parametric specification.
126 4.1. Finite sample properties
Finite sample distribution
Fact (Finite sample distribution, I): In a parametric multiple linear regression model with stochastic regressors, the conditional finite sample distribution of $\widehat{\beta}$ is known. The unconditional finite sample distribution is generally unknown:
$$\widehat{\beta} \mid X \sim D \qquad \widehat{\beta} \sim\ ??$$
where $D$ is a multivariate distribution.
127 4.1. Finite sample properties
Finite sample distribution
Fact (Finite sample distribution, II): In a parametric multiple linear regression model with nonstochastic regressors, the marginal (unconditional) finite sample distribution of $\widehat{\beta}$ is known:
$$\widehat{\beta} \sim D$$
128 4.1. Finite sample properties
Finite sample distribution
Fact (Finite sample distribution, III): In a semi-parametric multiple linear regression model, with fixed or random regressors, the finite sample distribution of $\widehat{\beta}$ is unknown:
$$\widehat{\beta} \mid X \sim\ ?? \qquad \widehat{\beta} \sim\ ??$$
129 4.1. Finite sample properties
Example (Linear Gaussian regression model): Under assumption A6 (normality, linear Gaussian regression model),
$$\varepsilon \sim N(0, \sigma^2 I_N)$$
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y = \beta + (X^\top X)^{-1} X^\top \varepsilon$$
If $X$ is a stochastic matrix, the conditional distribution of $\widehat{\beta}$ is
$$\widehat{\beta} \mid X \sim N\left(\beta, \sigma^2 (X^\top X)^{-1}\right)$$
If $X$ is a nonstochastic matrix, the marginal distribution of $\widehat{\beta}$ is
$$\widehat{\beta} \sim N\left(\beta, \sigma^2 (X^\top X)^{-1}\right)$$
130 4.1. Finite sample properties
Theorem (Linear Gaussian regression model): Under assumption A6 (normality), the estimators $\widehat{\beta}$ and $\widehat{\sigma}^2$ have a finite sample distribution given by:
$$\widehat{\beta} \sim N\left(\beta, \sigma^2 (X^\top X)^{-1}\right)$$
$$\frac{\widehat{\sigma}^2}{\sigma^2}(N-K) \sim \chi^2(N-K)$$
Moreover, $\widehat{\beta}$ and $\widehat{\sigma}^2$ are independent. This result holds whether or not the matrix $X$ is considered as random. In the random case, the distribution of $\widehat{\beta}$ is conditional on $X$, with $V(\widehat{\beta} \mid X) = \sigma^2 (X^\top X)^{-1}$.
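The chi-squared result gives a quick route to the first two moments of $\widehat{\sigma}^2$ in the Gaussian model. The lines below are a sketch that relies on normality (assumption A6) and on the standard facts $E[\chi^2(\nu)] = \nu$ and $V[\chi^2(\nu)] = 2\nu$.

```latex
\begin{aligned}
\frac{(N-K)\,\widehat{\sigma}^2}{\sigma^2} \,\Big|\, X &\sim \chi^2(N-K), \\
E(\widehat{\sigma}^2 \mid X) &= \frac{\sigma^2}{N-K}\,(N-K) = \sigma^2, \\
V(\widehat{\sigma}^2 \mid X) &= \frac{\sigma^4}{(N-K)^2}\; 2\,(N-K) = \frac{2\sigma^4}{N-K}.
\end{aligned}
```

Neither moment depends on $X$, so the same values hold unconditionally.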
131 4.1. Finite sample properties Efficient estimator or BLUE?
132 4.1. Finite sample properties
Question: OLS estimator = "good" estimator of $\beta$? The question is whether this estimator is preferred to other unbiased estimators:
$$V(\widehat{\beta}_{OLS}) \lessgtr V(\widehat{\beta}_{other})$$
In general, to answer this question we use the FDCR or Cramer-Rao bound and study the efficiency of the estimator (cf. chapters 1 and 2).
Problem: the computation of the FDCR bound requires an assumption on the distribution of $\varepsilon$.
133 4.1. Finite sample properties
Two cases:
1. In a parametric model, it is possible to show that the OLS estimator is efficient: its variance reaches the FDCR bound. The OLS estimator is the Best Unbiased Estimator (BUE).
2. In a semi-parametric model:
   1. We do not know the conditional distribution of $y$ given $X$, nor the likelihood of the sample.
   2. The FDCR bound is unknown, so it is impossible to show the efficiency of the OLS estimator.
   3. We therefore introduce another concept: the Best Linear Unbiased Estimator (BLUE).
134 4.1. Finite sample properties
Problem (Efficiency of the OLS estimator): In a semi-parametric multiple linear regression model (with no assumption on the distribution of $\varepsilon$), it is impossible to compute the FDCR or Cramer-Rao bound and to show the efficiency of the OLS estimator. The efficiency of the OLS estimator can only be shown in a parametric multiple linear regression model.
135 4.1. Finite sample properties
Theorem (Efficiency, Gaussian model): In a Gaussian multiple linear regression model $y = X\beta_0 + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2 I_N)$ (assumption A6), the OLS estimator $\widehat{\beta}$ is efficient. Its variance attains the FDCR or Cramer-Rao bound:
$$V(\widehat{\beta}) = I_N^{-1}(\beta_0)$$
This result holds whether or not the matrix $X$ is considered as random.
136 4.1. Finite sample properties
Proof: Let us consider the case where $X$ is nonstochastic. Under assumption A6, the distribution is known, since $y \sim N(X\beta, \sigma^2 I_N)$. The log-likelihood of the sample $\{y_i, x_i\}_{i=1}^{N}$, with $x_i = (x_{i1} : \ldots : x_{iK})^\top$, is then equal to:
$$\ell_N(\beta; y, x) = -\frac{N}{2} \ln \sigma^2 - \frac{N}{2} \ln(2\pi) - \frac{(y - X\beta)^\top (y - X\beta)}{2\sigma^2}$$
If we denote $\theta = (\beta^\top : \sigma^2)^\top$, we have (cf. chapter 2):
$$I_N(\theta_0) = E_\theta\left(-\frac{\partial^2 \ell_N(\theta; y, x)}{\partial \theta\, \partial \theta^\top}\right) = \begin{pmatrix} \dfrac{1}{\sigma^2} X^\top X & 0_{12} \\ 0_{21} & \dfrac{N}{2\sigma^4} \end{pmatrix}$$
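137 4.1. Finite sample properties
Proof (cont'd)
$$I_N(\theta_0) = \begin{pmatrix} \dfrac{1}{\sigma^2} X^\top X & 0_{12} \\ 0_{21} & \dfrac{N}{2\sigma^4} \end{pmatrix}$$
The inverse of this block matrix is given by
$$I_N^{-1}(\theta_0) = \begin{pmatrix} \sigma^2 (X^\top X)^{-1} & 0_{12} \\ 0_{21} & \dfrac{2\sigma^4}{N} \end{pmatrix} = \begin{pmatrix} I_N^{-1}(\beta_0) & 0_{12} \\ 0_{21} & I_N^{-1}(\sigma^2) \end{pmatrix}$$
Then we have:
$$V(\widehat{\beta}) = I_N^{-1}(\beta_0) = \sigma^2 (X^\top X)^{-1}$$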
138 4.1. Finite sample properties
Problem
1. In a semi-parametric model (with no assumption on the distribution of $\varepsilon$), it is impossible to compute the FDCR bound and to show the efficiency of the OLS estimator.
2. The solution consists in introducing the concept of best linear unbiased estimator (BLUE): the Gauss-Markov theorem...
139 4.1. Finite sample properties
Theorem (Gauss-Markov theorem): In the linear regression model under assumptions A1-A5, the least squares estimator $\widehat{\beta}$ is the best linear unbiased estimator (BLUE) of $\beta_0$, whether $X$ is stochastic or nonstochastic.
140 4.1. Finite sample properties
Comment: The estimator $\widehat{\beta}_k$ for $k = 1, \ldots, K$ is the BLUE of $\beta_k$.
Best = smallest variance
Linear (in $y$ or $y_i$): $\widehat{\beta}_k = \sum_{i=1}^{N} \omega_{ki}\, y_i$
Unbiased: $E(\widehat{\beta}_k) = \beta_k$
Estimator: $\widehat{\beta}_k = f(y_1, \ldots, y_N)$
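The slides state the Gauss-Markov theorem without proof. A standard argument, given here as a sketch, compares OLS with an arbitrary linear unbiased estimator; the symbols $\widetilde{\beta}$, $C$ and $D$ are introduced for this illustration and do not appear in the slides.

```latex
\begin{aligned}
&\text{Let } \widetilde{\beta} = C y \text{ with } C \text{ a } K \times N \text{ matrix depending only on } X. \\
&E(\widetilde{\beta} \mid X) = C X \beta_0 = \beta_0 \ \text{ for all } \beta_0
   \;\Longleftrightarrow\; C X = I_K . \\
&\text{Write } C = (X^\top X)^{-1} X^\top + D, \ \text{ so that } \ C X = I_K \;\Longleftrightarrow\; D X = 0 . \\
&V(\widetilde{\beta} \mid X) = \sigma^2 C C^\top
   = \sigma^2 \left[ (X^\top X)^{-1} + D D^\top \right]
   \quad \text{(the cross terms vanish because } D X = 0\text{)} . \\
&V(\widetilde{\beta} \mid X) - V(\widehat{\beta} \mid X) = \sigma^2 D D^\top
   \ \text{ is positive semi-definite.}
\end{aligned}
```

Equality holds only when $D = 0$, that is, when $\widetilde{\beta}$ is the OLS estimator itself.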
141 4.1. Finite sample properties
Remark: The fact that the OLS estimator is the BLUE means that there is no linear unbiased estimator with a smaller variance than $\widehat{\beta}$.
BUT nothing rules out that a nonlinear unbiased estimator may have a smaller variance than $\widehat{\beta}$.
142 4.1. Finite sample properties
Summary
Semi-parametric model
  Case 1: $X$ stochastic: $\widehat{\beta}$ is the BLUE (Gauss-Markov)
  Case 2: $X$ nonstochastic: $\widehat{\beta}$ is the BLUE (Gauss-Markov)
Parametric model
  Case 1: $X$ stochastic: $\widehat{\beta}$ is efficient and BUE (FDCR bound)
  Case 2: $X$ nonstochastic: $\widehat{\beta}$ is efficient and BUE (FDCR bound)
BUE: Best Unbiased Estimator
143 4.1. Finite sample properties Other remarks
144 4.1. Finite sample properties
Remarks
1. Including irrelevant variables (i.e., coefficient = 0) does not affect the unbiasedness of the estimator.
2. Irrelevant variables decrease the precision of the OLS estimator, but do not result in bias.
3. Omitting relevant regressors obviously does affect the estimator, since the zero conditional mean assumption does not hold anymore.
145 4.1. Finite sample properties
Remarks
1. Omitting relevant variables causes bias (under some conditions on these variables).
   1. The direction of the bias can (sometimes) be characterized.
   2. The size of the bias is more complicated to derive.
2. More generally, correlation between a single regressor and the error term (endogeneity) usually results in all OLS estimators being biased.
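The omitted-variable case can be made concrete with the classical partitioned-regression argument. In the sketch below, the partition $X = (X_1 : X_2)$, the coefficient blocks $\beta_1, \beta_2$ and the short-regression estimator $b_1$ are notation assumed for this illustration only.

```latex
\begin{aligned}
&\text{True model: } y = X_1 \beta_1 + X_2 \beta_2 + \varepsilon, \qquad E(\varepsilon \mid X_1, X_2) = 0 . \\
&\text{Regression of } y \text{ on } X_1 \text{ only: } \quad
  b_1 = (X_1^\top X_1)^{-1} X_1^\top y
      = \beta_1 + (X_1^\top X_1)^{-1} X_1^\top X_2\, \beta_2 + (X_1^\top X_1)^{-1} X_1^\top \varepsilon . \\
&E(b_1 \mid X_1, X_2) = \beta_1 + (X_1^\top X_1)^{-1} X_1^\top X_2\, \beta_2 .
\end{aligned}
```

The bias term vanishes when $\beta_2 = 0$ (the omitted variables are irrelevant) or when $X_1^\top X_2 = 0$ (they are orthogonal to the included regressors), which is the kind of condition the remark refers to.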
146 4.1. Finite sample properties
Key Concepts
1. OLS unbiased estimator under assumption A3 (exogeneity)
2. Mean and variance of the OLS estimator under assumptions A3-A4 (exogeneity, spherical disturbances)
3. Finite sample distribution
4. Estimator of the variance of the error terms
5. BLUE estimator and efficient estimator (FDCR bound)
6. Gauss-Markov theorem
147 Subsection 4.2. Asymptotic properties of the OLS estimator
148 4.2. Asymptotic properties
Objectives: The objectives of this subsection are the following:
1. Study the consistency of the OLS estimator.
2. Derive the asymptotic distribution of the OLS estimator $\widehat{\beta}$.
3. Estimate the asymptotic variance covariance matrix of $\widehat{\beta}$.
149 4.2. Asymptotic properties
Theorem (Consistency): Let us consider the multiple linear regression model $y_i = x_i^\top \beta_0 + \varepsilon_i$, with $x_i = (x_{i1} : \ldots : x_{iK})^\top$. Under assumptions A1-A5, the OLS estimators $\widehat{\beta}$ and $\widehat{\sigma}^2$ are (weakly) consistent:
$$\widehat{\beta} \xrightarrow{p} \beta_0 \qquad \widehat{\sigma}^2 \xrightarrow{p} \sigma^2$$
150 4.2. Asymptotic Properties
Proof (cf. chapter 1): We have:
$$\widehat{\beta} = \left(\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\sum_{i=1}^{N} x_i y_i\right) = \beta_0 + \left(\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\sum_{i=1}^{N} x_i \varepsilon_i\right)$$
By multiplying and dividing by $N$, we get:
$$\widehat{\beta} = \beta_0 + \left(\frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\frac{1}{N}\sum_{i=1}^{N} x_i \varepsilon_i\right)$$
151 4.2. Asymptotic Properties
Proof (cf. chapter 1)
$$\widehat{\beta} = \beta_0 + \left(\frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\frac{1}{N}\sum_{i=1}^{N} x_i \varepsilon_i\right)$$
1. By using the (weak) law of large numbers (Khinchine's theorem), we have:
$$\frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top \xrightarrow{p} E(x_i x_i^\top) \qquad \frac{1}{N}\sum_{i=1}^{N} x_i \varepsilon_i \xrightarrow{p} E(x_i \varepsilon_i)$$
2. By using Slutsky's theorem:
$$\widehat{\beta} \xrightarrow{p} \beta_0 + E^{-1}(x_i x_i^\top)\, E(x_i \varepsilon_i)$$
152 4.2. Asymptotic Properties
Proof (cf. chapter 1)
$$\widehat{\beta} \xrightarrow{p} \beta_0 + E^{-1}(x_i x_i^\top)\, E(x_i \varepsilon_i)$$
Under assumption A3 (exogeneity):
$$E(\varepsilon_i \mid x_i) = 0 \;\Longrightarrow\; E(\varepsilon_i x_i) = 0_{K \times 1} \ \text{ and } \ E(\varepsilon_i) = 0$$
We have
$$\widehat{\beta} \xrightarrow{p} \beta_0$$
The OLS estimator $\widehat{\beta}$ is (weakly) consistent.
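Consistency can be illustrated by simulation. The sketch below is an illustration under assumed settings: random regressors drawn uniformly on $(0, 1)$, a true slope of 2, and uniform (hence non-Gaussian) errors that are independent of the regressor. The function name `simulated_slope` is hypothetical.

```python
import random

random.seed(1)


def simulated_slope(n):
    """OLS slope from one simulated sample of size n (true slope = 2)."""
    x = [random.uniform(0.0, 1.0) for _ in range(n)]
    # Uniform errors on (-1, 1): mean zero, independent of x, not Gaussian.
    y = [1.0 + 2.0 * xi + random.uniform(-1.0, 1.0) for xi in x]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / s_xx


# Absolute estimation error for increasing sample sizes.
for n in (50, 500, 5000, 50000):
    print(n, abs(simulated_slope(n) - 2.0))
```

The printed errors are random and need not decrease monotonically in a single run, but they shrink on average at rate $1/\sqrt{N}$, in line with the theorem.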
153 4.2. Asymptotic properties
Theorem (Asymptotic distribution): Consider the multiple linear regression model $y_i = x_i^\top \beta_0 + \varepsilon_i$, with $x_i = (x_{i1} : \ldots : x_{iK})^\top$. Under assumptions A1-A5, the OLS estimator $\widehat{\beta}$ is asymptotically normally distributed:
$$\sqrt{N}\left(\widehat{\beta} - \beta_0\right) \xrightarrow{d} N\left(0, \sigma^2 E_X^{-1}(x_i x_i^\top)\right)$$
or equivalently
$$\widehat{\beta} \overset{asy}{\sim} N\left(\beta_0, \frac{\sigma^2}{N} E_X^{-1}(x_i x_i^\top)\right)$$
154 4.2. Asymptotic properties
Comments
1. Whatever the specification used (parametric or semi-parametric), it is always possible to characterize the asymptotic distribution of the OLS estimator.
2. Under assumptions A1-A5, the OLS estimator $\widehat{\beta}$ is asymptotically normally distributed, even if $\varepsilon$ is not normally distributed.
3. This result (asymptotic normality) is valid whether $X$ is stochastic or nonstochastic.
155 4.2. Asymptotic properties
Proof (cf. chapter 1): This proof has been shown in chapter 1 (Estimation theory).
1. Let us rewrite the OLS estimator as:
$$\widehat{\beta} = \beta_0 + \left(\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\sum_{i=1}^{N} x_i \varepsilon_i\right)$$
2. Normalize the vector $\widehat{\beta} - \beta_0$:
$$\sqrt{N}\left(\widehat{\beta} - \beta_0\right) = \left(\frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \left(\sqrt{N}\, \frac{1}{N}\sum_{i=1}^{N} x_i \varepsilon_i\right) = A_N^{-1} b_N$$
$$A_N = \frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top \qquad b_N = \sqrt{N}\, \frac{1}{N}\sum_{i=1}^{N} x_i \varepsilon_i$$
156 4.2. Asymptotic properties
Proof (cf. chapter 1)
3. Using the WLLN (Khinchine's theorem) and the CMP (continuous mapping theorem). Remark: if $x_i$ is nonstochastic, this part is useless.
$$A_N^{-1} = \left(\frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top\right)^{-1} \xrightarrow{p} E_X^{-1}(x_i x_i^\top)$$
4. Using the CLT:
$$b_N = \sqrt{N}\left(\frac{1}{N}\sum_{i=1}^{N} x_i \varepsilon_i - E(x_i \varepsilon_i)\right) \xrightarrow{d} N\left(0, V(x_i \varepsilon_i)\right)$$
Under assumption A3, $E(\varepsilon_i \mid x_i) = 0 \Longrightarrow E(x_i \varepsilon_i) = 0$ and
$$V(x_i \varepsilon_i) = E(x_i \varepsilon_i \varepsilon_i x_i^\top) = E_X\left(E(x_i \varepsilon_i \varepsilon_i x_i^\top \mid x_i)\right) = E_X\left(x_i V(\varepsilon_i \mid x_i)\, x_i^\top\right) = \sigma^2 E_X(x_i x_i^\top)$$
157 4.2. Asymptotic properties
Proof (cf. chapter 1): So we have
$$\sqrt{N}\left(\widehat{\beta} - \beta_0\right) = A_N^{-1} b_N$$
with
$$A_N^{-1} \xrightarrow{p} A^{-1} \qquad A^{-1} = E_X^{-1}(x_i x_i^\top)$$
$$b_N \xrightarrow{d} N(E_b, V_b) \qquad E_b = 0 \qquad V_b = \sigma^2 E_X(x_i x_i^\top) = \sigma^2 A$$
158 4.2. Asymptotic properties
Proof (cf. chapter 1)
$$A_N^{-1} \xrightarrow{p} A^{-1} \qquad A^{-1} = E_X^{-1}(x_i x_i^\top)$$
$$b_N \xrightarrow{d} N(E_b, V_b) \qquad E_b = 0 \qquad V_b = \sigma^2 E_X(x_i x_i^\top) = \sigma^2 A$$
By using Slutsky's theorem (for a convergence in distribution), we have:
$$A_N^{-1} b_N \xrightarrow{d} N(E_c, V_c)$$
$$E_c = A^{-1} E_b = A^{-1} \times 0 = 0$$
$$V_c = A^{-1} V_b A^{-1} = A^{-1} \sigma^2 A A^{-1} = \sigma^2 A^{-1}$$
159 4.2. Asymptotic properties
Proof (cf. chapter 1)
$$\sqrt{N}\left(\widehat{\beta} - \beta_0\right) = A_N^{-1} b_N \xrightarrow{d} N\left(0, \sigma^2 A^{-1}\right)$$
with $A^{-1} = E_X^{-1}(x_i x_i^\top)$ (be careful: it is the inverse of the expectation, not the expectation of the inverse). Finally, we have:
$$\sqrt{N}\left(\widehat{\beta} - \beta_0\right) \xrightarrow{d} N\left(0, \sigma^2 E_X^{-1}(x_i x_i^\top)\right)$$
160 4.2. Asymptotic properties
Remark
1. In some textbooks (Greene, 2007 for instance), the asymptotic variance covariance matrix of the OLS estimator is expressed as a function of the plim of $N^{-1} X^\top X$, denoted by $Q$:
$$Q = \mathrm{plim}\, \frac{1}{N} X^\top X$$
2. Both notations are equivalent since:
$$\frac{1}{N} X^\top X = \frac{1}{N}\sum_{i=1}^{N} x_i x_i^\top$$
By using the WLLN (Khinchine's theorem), we have:
$$\frac{1}{N} X^\top X \xrightarrow{p} Q = E_X(x_i x_i^\top)$$
161 4.2. Asymptotic properties
Theorem (Asymptotic distribution): Consider the multiple linear regression model $y = X\beta_0 + \varepsilon$. Under assumptions A1-A5, the OLS estimator $\widehat{\beta}$ is asymptotically normally distributed:
$$\sqrt{N}\left(\widehat{\beta} - \beta_0\right) \xrightarrow{d} N\left(0, \sigma^2 Q^{-1}\right)$$
where
$$Q = \mathrm{plim}\, \frac{1}{N} X^\top X = E_X(x_i x_i^\top)$$
or equivalently
$$\widehat{\beta} \overset{asy}{\sim} N\left(\beta_0, \frac{\sigma^2}{N} Q^{-1}\right)$$
162 4.2. Asymptotic properties
Definition (Asymptotic variance covariance matrix): The asymptotic variance covariance matrix of the OLS estimator $\widehat{\beta}$ is equal to
$$V_{asy}(\widehat{\beta}) = \frac{\sigma^2}{N} E_X^{-1}(x_i x_i^\top)$$
or equivalently
$$V_{asy}(\widehat{\beta}) = \frac{\sigma^2}{N} Q^{-1}$$
where $Q = \mathrm{plim}\, N^{-1} X^\top X$.
163 4.2. Asymptotic properties
Remark
Finite sample variance:
$$V(\widehat{\beta}) = \sigma^2 E_X\left[(X^\top X)^{-1}\right]$$
Asymptotic variance:
$$V_{asy}(\widehat{\beta}) = \frac{\sigma^2}{N} E_X^{-1}(x_i x_i^\top)$$
164 4.2. Asymptotic properties
Remark: How to estimate the asymptotic variance covariance matrix of the OLS estimator?
$$V_{asy}(\widehat{\beta}) = \frac{\sigma^2}{N} E_X^{-1}(x_i x_i^\top)$$
165 4.2. Asymptotic properties
Definition (Estimator of the asymptotic variance matrix): A consistent estimator of the asymptotic variance covariance matrix is given by:
$$\widehat{V}_{asy}(\widehat{\beta}) = \widehat{\sigma}^2 (X^\top X)^{-1}$$
where $\widehat{\sigma}^2$ is a consistent estimator of $\sigma^2$:
$$\widehat{\sigma}^2 = \frac{\widehat{\varepsilon}^\top \widehat{\varepsilon}}{N-K}$$
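166 4.2. Asymptotic properties
Proof
We know that:
$$V_{asy}\left(\hat{\beta}\right) = \frac{\sigma^2}{N} \mathbb{E}_X^{-1}\left(x_i x_i^\top\right)$$
$$\hat{\sigma}^2 \xrightarrow{p} \sigma^2$$
$$\frac{X^\top X}{N} = \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \xrightarrow{p} \mathbb{E}_X\left(x_i x_i^\top\right) = Q$$
By using the CMP (continuous mapping theorem) and the Slutsky theorem, we get:
$$\hat{\sigma}^2 \left(\frac{X^\top X}{N}\right)^{-1} \xrightarrow{p} \sigma^2\, \mathbb{E}_X^{-1}\left(x_i x_i^\top\right)$$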
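167 4.2. Asymptotic properties
Proof (cont'd)
As a consequence, since
$$V_{asy}\left(\hat{\beta}\right) = \frac{\sigma^2}{N} \mathbb{E}_X^{-1}\left(x_i x_i^\top\right)$$
and
$$\hat{\sigma}^2 \left(\frac{X^\top X}{N}\right)^{-1} \xrightarrow{p} \sigma^2\, \mathbb{E}_X^{-1}\left(x_i x_i^\top\right)$$
we have
$$\frac{\hat{\sigma}^2}{N} \left(\frac{X^\top X}{N}\right)^{-1} \xrightarrow{p} \frac{\sigma^2}{N} \mathbb{E}_X^{-1}\left(x_i x_i^\top\right)$$
that is,
$$\hat{\sigma}^2 \left(X^\top X\right)^{-1} \xrightarrow{p} V_{asy}\left(\hat{\beta}\right)$$
or equivalently
$$\hat{V}_{asy}\left(\hat{\beta}\right) \xrightarrow{p} V_{asy}\left(\hat{\beta}\right)$$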
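168 4.2. Asymptotic properties
Remark
Under assumptions A1-A5, for a stochastic matrix $X$, we have:

Finite sample
  Variance: $V\left(\hat{\beta}\right) = \sigma^2\, \mathbb{E}_X\left[\left(X^\top X\right)^{-1}\right]$
  Estimator: $\hat{V}\left(\hat{\beta}\right) = \hat{\sigma}^2 \left(X^\top X\right)^{-1}$

Asymptotic
  Variance: $V_{asy}\left(\hat{\beta}\right) = \sigma^2 N^{-1} Q^{-1}$ with $Q = \operatorname{plim} \frac{1}{N} X^\top X$
  Estimator: $\hat{V}_{asy}\left(\hat{\beta}\right) = \hat{\sigma}^2 \left(X^\top X\right)^{-1}$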
169 4.2. Asymptotic properties
Example (CAPM)
The empirical analogue of the CAPM is given by:
$$\tilde{r}_{it} = \alpha_i + \beta_i \tilde{r}_{mt} + \varepsilon_t$$
where
$$\tilde{r}_{it} = r_{it} - r_{ft} \quad \text{(excess return of security } i \text{ at time } t\text{)}$$
$$\tilde{r}_{mt} = r_{mt} - r_{ft} \quad \text{(market excess return at time } t\text{)}$$
and $\varepsilon_t$ is an i.i.d. error term with:
$$\mathbb{E}(\varepsilon_t) = 0 \qquad V(\varepsilon_t) = \sigma^2 \qquad \mathbb{E}(\varepsilon_t \mid \tilde{r}_{mt}) = 0$$
Data file: capm.xls for Microsoft, SP500 and Tbill (closing prices) from 11/1/1993 to 04/03/2003
170 4.2. Asymptotic properties
Example (CAPM, cont'd)
Question: write a Matlab code in order to find (1) the estimates of $\alpha$ and $\beta$ for Microsoft, (2) the corresponding standard errors, (3) the SE of the regression, (4) the SSR, (5) the coefficient of determination and (6) the adjusted $R^2$. Compare your results to the following table (Eviews).
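The six quantities in the question can be sketched as follows. This is a Python illustration rather than the Matlab code the exercise asks for, and it runs on simulated excess returns because the contents of capm.xls are not reproduced here. The function name `capm_regression`, the simulated parameter values and the reading of SSR as the sum of squared residuals are assumptions. The excess returns $\tilde{r}_{it}$ and $\tilde{r}_{mt}$ are taken as already computed.

```python
import math
import random

def capm_regression(r_i, r_m):
    """OLS of excess returns r_i on a constant and market excess returns r_m."""
    n, k = len(r_i), 2
    mean_m = sum(r_m) / n
    mean_i = sum(r_i) / n
    sxx = sum((m - mean_m) ** 2 for m in r_m)
    sxy = sum((m - mean_m) * (r - mean_i) for m, r in zip(r_m, r_i))
    beta = sxy / sxx                      # (1) slope estimate
    alpha = mean_i - beta * mean_m        # (1) intercept estimate
    resid = [r - alpha - beta * m for m, r in zip(r_m, r_i)]
    ssr = sum(e * e for e in resid)       # (4) sum of squared residuals
    sst = sum((r - mean_i) ** 2 for r in r_i)
    s2 = ssr / (n - k)                    # sigma2_hat = e'e / (N - K)
    # (2) square roots of the diagonal of sigma2_hat * (X'X)^{-1}, X = [1, r_m]
    se_alpha = math.sqrt(s2 * sum(m * m for m in r_m) / (n * sxx))
    se_beta = math.sqrt(s2 / sxx)
    r2 = 1.0 - ssr / sst                  # (5) coefficient of determination
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)  # (6) adjusted R^2
    return {"alpha": alpha, "beta": beta, "se_alpha": se_alpha,
            "se_beta": se_beta, "se_regression": math.sqrt(s2),  # (3)
            "ssr": ssr, "r2": r2, "adj_r2": adj_r2}

# Simulated excess returns (hypothetical values, not the Microsoft data)
rng = random.Random(123)
market = [rng.gauss(0.0005, 0.01) for _ in range(2000)]
stock = [0.0 + 1.2 * m + rng.gauss(0.0, 0.02) for m in market]

for name, value in capm_regression(stock, market).items():
    print(f"{name:>14}: {value:.5f}")
```

With the actual data, `stock` and `market` would be replaced by the excess return series built from the Microsoft, SP500 and Tbill columns.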
171 4.2. Asymptotic properties
[Table: Eviews regression output for the CAPM example; not recovered in this transcription]
173 4.2. Asymptotic properties
Key Concepts
1. The OLS estimator is weakly consistent
2. The OLS estimator is asymptotically normally distributed
3. Asymptotic variance covariance matrix
4. Estimator of the asymptotic variance
174 End of Chapter 3 Christophe Hurlin (University of Orléans)
