2 Testing under Normality Assumption


 Cameron Howard Harrison
1 Hypothesis testing

A statistical test is a method of making a decision about one hypothesis (the null hypothesis) in comparison with another one (the alternative) using a sample of observations of known size. A statistical test is not a proof per se. Accepting the null hypothesis ($H_0$) doesn't mean it is true, but just that the available observations are not incompatible with this hypothesis, and that there is not enough evidence to favour the alternative hypothesis over the null hypothesis.

There are 3 steps:

1. First specify a Null Hypothesis, usually denoted $H_0$, which describes a model of interest. Usually, we express $H_0$ as a restricted version of a more general model.

2. Then, construct a test statistic, which is a random variable (because it is a function of other random variables) with two features: (a) it has a known distribution under the Null Hypothesis (usually normal, chi-square, t or F). Its distribution is known either because we assume enough about the distribution of the model disturbances to get small-sample distributions, or we assume enough to get asymptotic distributions. (b) this known distribution may depend on data, but not on parameters (this is called pivotality: a test statistic is pivotal if it satisfies this condition).

3. Check whether or not the sample value of the test statistic is very far out in its sampling distribution.

When we perform a test, we may end up rejecting the null hypothesis even though it is true. In this case we are committing the so-called Type I error. The probability of a Type I error is the significance level (or size) of the test. It is also possible that we fail to reject the null hypothesis even though it is false. In this case we are committing the so-called Type II error. The power of the test is one minus the probability of a Type II error.

2 Testing under Normality Assumption

2.1 Properties of OLS estimators

Suppose $Y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$,
and $X$ is full rank with rank $K$. Then,

(i) $\hat{\beta} = (X'X)^{-1}X'Y \sim N(\beta, \sigma^2 (X'X)^{-1})$

(ii) $e'e/\sigma^2 \sim \chi^2(N-K)$

(iii) $s^2 = e'e/(N-K)$ is an unbiased estimator of $\sigma^2$ and is independent of $\hat{\beta}$, where $e = Y - X\hat{\beta}$.

Proof. (i) The fact that the disturbances are independent mean-zero normals, $\varepsilon \sim N(0, \sigma^2 I)$, implies $E[X'\varepsilon] = 0_K$ and $E[\varepsilon\varepsilon'] = \sigma^2 I$, so the OLS estimator is still BLUE: $E[\hat{\beta}] = \beta$, $V[\hat{\beta}] = \sigma^2 (X'X)^{-1}$. Write out $\hat{\beta}$ as a function of $\varepsilon$ as follows: $\hat{\beta} = (X'X)^{-1}X'Y = \beta + (X'X)^{-1}X'\varepsilon$, which is a linear combination of a normally distributed vector. Since, for any vector $x \sim N(\mu, \Sigma)$, $a + Ax \sim N(a + A\mu, A\Sigma A')$ (see Kennedy's "All About Variances" appendix), we have

\[ \hat{\beta} \sim N\left( \beta + 0_K, \; (X'X)^{-1}X' \, \sigma^2 I \, X(X'X)^{-1} \right), \quad \text{or} \quad \hat{\beta} \sim N\left( \beta, \sigma^2 (X'X)^{-1} \right). \]

(ii) The residual vector can be written as

\[ e = Y - X\hat{\beta} = Y - P_X Y = (I - P_X)Y = M_X Y = M_X X\beta + M_X \varepsilon = M_X \varepsilon, \]

where $M_X$ is the residual projection matrix that creates the residuals from a regression of something on $X$:

\[ M_X = I - X(X'X)^{-1}X' = I - P_X. \]

The matrix $M_X$ is a projection matrix, that is, $M_X$ is symmetric (i.e. $M_X' = M_X$), idempotent (i.e. $M_X^2 = M_X$), and $\mathrm{rank}(M_X) = \mathrm{rank}(I) - \mathrm{rank}(P_X) = N - \mathrm{tr}(P_X) = N - K$.
We can then write

\[ \frac{e'e}{\sigma^2} = \frac{\varepsilon'}{\sigma} M_X \frac{\varepsilon}{\sigma}. \]

Since $\varepsilon/\sigma \sim N(0, I)$, it follows that

\[ \frac{e'e}{\sigma^2} = \frac{\varepsilon'}{\sigma} M_X \frac{\varepsilon}{\sigma} \sim \chi^2(\mathrm{rank}(M_X)) = \chi^2(N-K). \]

(iii) From (ii) we have $E\left[ e'e/\sigma^2 \right] = N - K$, so that

\[ E[s^2] = E\left[ \frac{e'e}{N-K} \right] = \sigma^2. \]

To prove that $s^2$ and $\hat{\beta}$ are independent, it is sufficient to show that the normal random variables $e$ and $\hat{\beta}$ are uncorrelated:

\[ \mathrm{cov}(e, \hat{\beta}) = \mathrm{cov}\left( M_X \varepsilon, (X'X)^{-1}X'Y \right) = M_X \, \mathrm{cov}(\varepsilon, Y) \, X(X'X)^{-1} = \sigma^2 M_X X (X'X)^{-1} = 0 \]

because $M_X X = 0$.

2.2 Tests of equalities

We consider 3 types of tests of equalities: single linear, multiple linear, and general nonlinear. Tests of equalities are fully specified when you specify the Null hypothesis: the Null is either true or not true, and you don't care how exactly it isn't true, just that it isn't true.

2.2.1 Single linear tests: z-test and t-test

A single linear test could be written as $R\beta - r = 0$, where $R$ is $1 \times K$ and $r$ is a scalar. The discrepancy vector, $d$, is the sample value of the null hypothesis evaluated at $\hat{\beta}$, the sample estimate of $\beta$. Using the terminology above, for a linear hypothesis, $d = R\hat{\beta} - r$. Even if the hypothesis is true, we would not expect $d$ to be exactly zero, because $\hat{\beta}$ is not exactly equal to $\beta$. However, if the hypothesis is true, we would expect $d$ to be close to 0. In contrast, if the hypothesis is false, we'd have no real prior about where we'd see $d$.

1. The z-test is performed with the z-statistic defined by

\[ T_z = \frac{1}{\sigma} \left( R(X'X)^{-1}R' \right)^{-1/2} d \sim N(0, 1). \]

It is called a z-test because it follows a standard normal distribution, and standard normal variables are often denoted $z$.
2. The t-test is performed by using $s$ in place of $\sigma$ in the above formula when $\sigma^2$ is unknown. This corresponds to multiplying $T_z$ by $\sigma/s$:

\[ T_z^s = \frac{\sigma}{s} T_z \sim \frac{\sigma}{s} N(0, 1). \]

So we need to figure out the distribution of $\sigma/s$. Recall that

\[ (N-K)\frac{s^2}{\sigma^2} = \frac{e'e}{\sigma^2} \sim \chi^2_{N-K}, \quad \text{so} \quad \frac{s}{\sigma} = \sqrt{\frac{s^2}{\sigma^2}} = \sqrt{\frac{\chi^2_{N-K}}{N-K}}, \]

the square root of a chi-square divided by its own degrees of freedom. Returning to the expression, we have

\[ T_z^s = \frac{\sigma}{s} T_z \sim \frac{N(0,1)}{\sqrt{\chi^2_{N-K}/(N-K)}}. \]

The distribution of a normal divided by the square root of a chi-square divided by its own degrees of freedom is called a Student's t distribution, denoted $t_{N-K}$, where $N-K$ is the number of degrees of freedom in the denominator. The test statistic is then called a t test statistic:

\[ T_t = \frac{1}{s} \left( R(X'X)^{-1}R' \right)^{-1/2} d \sim \frac{N(0,1)}{\sqrt{\chi^2_{N-K}/(N-K)}} = t_{N-K}, \]

where $t_{N-K}$ means a t distribution with $N-K$ degrees of freedom, that is, a standard normal divided by the square root of a chi-square divided by its own degrees of freedom.

3. The z-test and the t-test are related in practice. The z-test requires knowledge of $\sigma^2$, whereas the t-test uses an estimate of it. However, when the sample is very large, the estimate of $\sigma^2$ is very close to its true value, so the estimate is almost the same as prior knowledge. This means that the z-test and t-test statistics have nearly the same distribution when the sample is large.

4. Examples: (a) An exclusion restriction, e.g., that the second variable does not belong in the model, would have $R = [\,0 \ \ 1 \ \ 0 \ \cdots \ 0\,]$, $r = 0$.
(b) A symmetry restriction, e.g., that the second and third variables had identical effects, would have $R = [\,0 \ \ 1 \ \ {-1} \ \ 0 \ \cdots \ 0\,]$, $r = 0$. (c) A value restriction, e.g., that the second variable's coefficient is 1, would have $R = [\,0 \ \ 1 \ \ 0 \ \cdots \ 0\,]$, $r = 1$.

2.2.2 Multiple Linear Tests: the finite-sample F-test

A multiple linear test could be written as $R\beta - r = 0$, where $R$ is a $J \times K$ matrix of rank $J$ and $r$ is a $J$-vector.

1. Since

\[ T_z = \frac{1}{\sigma} \left( R(X'X)^{-1}R' \right)^{-1/2} d \sim N(0, I_J), \]

we have

\[ \frac{1}{\sigma^2} d' \left( R(X'X)^{-1}R' \right)^{-1} d = T_z' T_z \sim \chi^2_J. \]

This provides a test statistic distributed as a $\chi^2_J$, called a Wald test. This test requires knowledge of $\sigma^2$.

2. If we substitute $s^2$ for $\sigma^2$, then we have

\[ \frac{1}{s^2} d' \left( R(X'X)^{-1}R' \right)^{-1} d = \frac{\sigma^2}{s^2} \chi^2_J = \frac{\chi^2_J}{\chi^2_{N-K}/(N-K)} \]

by the same reasoning as for the single linear t-test. If we divide the numerator by $J$, we get a ratio of chi-squares, each divided by its own degrees of freedom, which follows the so-called F distribution, with degrees of freedom given by its numerator and denominator degrees of freedom. This test uses the estimate $s^2$. The resulting test statistic is then called an F-test statistic:

\[ F = \frac{1}{J s^2} d' \left( R(X'X)^{-1}R' \right)^{-1} d = \frac{\chi^2_J / J}{\chi^2_{N-K}/(N-K)} \sim F_{J, N-K}. \]

3. Examples:
(a) A set of exclusion restrictions, e.g., that the second and third variables do not belong in the model, would have

\[ R = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \end{bmatrix}, \quad r = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]

(b) A set of symmetry restrictions, that the first, second and third variables all have the same coefficients, would have

\[ R = \begin{bmatrix} 1 & -1 & 0 & 0 & \cdots & 0 \\ 1 & 0 & -1 & 0 & \cdots & 0 \end{bmatrix}, \quad r = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]

(c) Given that we write the restriction as $R\beta - r = 0$ for both single and multiple linear hypotheses, you can think of the single hypothesis as a special case of the multiple hypothesis.

3 Testing without the Normality Assumption

Without the Normality assumption, we can still get the approximate distributions of test statistics in large samples. In this case, the law of large numbers, the central limit theorem and Slutsky's lemma can be invoked.

3.1 Properties of OLS estimators

Suppose $Y = X\beta + \varepsilon$, $E[X'\varepsilon] = 0_K$, $E[\varepsilon\varepsilon'] = \sigma^2 I$. Let $\hat{\beta} = (X'X)^{-1}X'Y$ and $e = Y - X\hat{\beta}$ be the OLS estimator and residual. Then, the following properties follow from the law of large numbers, the central limit theorem and Slutsky's lemma:

(i) $\hat{\beta} \overset{approx}{\sim} N(\beta, \sigma^2 (X'X)^{-1})$

(ii) $s^2 = \frac{e'e}{N-K}$ is a consistent estimator of $\sigma^2$.

Proof. (i) $\hat{\beta} = \beta + (X'X)^{-1}X'\varepsilon$, so

\[ \sqrt{N}(\hat{\beta} - \beta) = \left( \frac{X'X}{N} \right)^{-1} \frac{X'\varepsilon}{\sqrt{N}}. \]
By the central limit theorem,

\[ \frac{X'\varepsilon}{\sqrt{N}} = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} X_i' \varepsilon_i \overset{approx}{\sim} N\left( 0, \; \sigma^2 \frac{X'X}{N} \right), \]

where $X_i$ is the $i$-th row of $X$.

(ii)

\[ s^2 = \frac{e'e}{N-K} = \frac{\varepsilon' M_X \varepsilon}{N-K} = \frac{\varepsilon'\varepsilon}{N-K} - \frac{\varepsilon' X (X'X)^{-1} X'\varepsilon}{N-K} = \frac{N}{N-K} \left[ \frac{\varepsilon'\varepsilon}{N} - \frac{\varepsilon'X}{N} \left( \frac{X'X}{N} \right)^{-1} \frac{X'\varepsilon}{N} \right]. \]

By the law of large numbers,

\[ \frac{\varepsilon'\varepsilon}{N} = \frac{1}{N} \sum_{i=1}^{N} \varepsilon_i^2 \overset{P}{\to} E[\varepsilon^2] = \sigma^2, \quad \text{and} \quad \frac{X'\varepsilon}{N} = \frac{1}{N} \sum_{i=1}^{N} X_i' \varepsilon_i \overset{P}{\to} E[X'\varepsilon] = 0_K, \quad \text{as } N \to \infty. \]

3.2 Wald Tests

3.2.1 Linear hypothesis

Consider a multiple linear hypothesis with a linear model and possibly non-normal (but finite-variance) $\varepsilon$:

\[ Y = X\beta + \varepsilon, \quad E[X'\varepsilon] = 0_K, \quad E[\varepsilon\varepsilon'] = \sigma^2 I, \quad H_0: R\beta - r = 0. \]

Here, $\varepsilon$ may be non-normal (for example uniform) as long as $\varepsilon$ has finite variance. Since $\hat{\beta} \overset{approx}{\sim} N(\beta, \sigma^2 (X'X)^{-1})$, the Wald vector defined by

\[ T_{Wv} = \frac{1}{\sigma} \left( R(X'X)^{-1}R' \right)^{-1/2} d \]

is asymptotically approximately a vector of standard normals, $N(0_J, I_J)$. Hence its inner product, called a Wald test, is asymptotically approximately a chi-square:

\[ T_W = T_{Wv}' T_{Wv} = \frac{1}{\sigma^2} d' \left( R(X'X)^{-1}R' \right)^{-1} d \overset{approx}{\sim} \chi^2_J. \]

Moreover, because $s^2$ is asymptotically equal to $\sigma^2$, the approximation is not affected by replacing $\sigma^2$ with $s^2$ when $\sigma^2$ is unknown. Thus, we have

\[ T_W^s = T_{Wv}^{s\,\prime} T_{Wv}^s = \frac{1}{s^2} d' \left( R(X'X)^{-1}R' \right)^{-1} d \overset{approx}{\sim} \chi^2_J. \]

The Wald statistic approximately follows the chi-square distribution as the sample size gets really large, even if one uses $s^2$.
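The large-sample Wald test with $s^2$ can be sketched numerically. The following is a minimal Python illustration (the data-generating process, sample size and variable names are all made up for the example); it tests a single symmetry restriction with deliberately non-normal errors:

```python
import numpy as np

# Minimal sketch of the large-sample Wald test for H0: R beta - r = 0,
# using s^2 in place of the unknown sigma^2. Data and names are illustrative.
rng = np.random.default_rng(0)
N, K = 500, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta = np.array([1.0, 2.0, 2.0])           # H0 below holds by construction
eps = rng.uniform(-1.0, 1.0, size=N)       # non-normal, finite-variance errors
Y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
b_hat = XtX_inv @ X.T @ Y                  # OLS estimator
e = Y - X @ b_hat                          # residuals
s2 = (e @ e) / (N - K)                     # s^2, consistent for sigma^2

R = np.array([[0.0, 1.0, -1.0]])           # symmetry restriction: beta_2 = beta_3
r = np.array([0.0])
d = R @ b_hat - r                          # discrepancy vector
T_W = (d @ np.linalg.inv(R @ XtX_inv @ R.T) @ d) / s2   # approx chi2(J) under H0
print(T_W)
```

Under $H_0$, the sample value of $T_W$ is compared with a $\chi^2_J$ critical value (for $J = 1$ and a 5% level, 3.84).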
3.2.2 Nonlinear hypothesis

A multiple nonlinear test could be written as $c(\beta) = 0$, where $c$ is a $J$-vector function of $\beta$. Consider the model in which we have a set of $J$ nonlinear restrictions $c(\beta) = 0$ that we wish to test:

\[ Y = X\beta + \varepsilon, \quad E[X'\varepsilon] = 0_K, \quad E[\varepsilon\varepsilon'] = \sigma^2 I, \quad H_0: c(\beta) = 0. \]

The discrepancy vector $d$ gives the distance between the sample value of the hypothesis and its hypothesized value of 0: $d = c(\hat{\beta})$. Since we have not assumed normality of $\varepsilon$, all we have for the finite-sample distribution of $\hat{\beta}$ is a mean and variance:

\[ E[\hat{\beta}] = \beta, \quad V[\hat{\beta}] = \sigma^2 (X'X)^{-1}. \]

Application of the delta method allows us to calculate an approximate asymptotic distribution of $c(\hat{\beta})$. By first-order Taylor approximation, we have, under $H_0$,

\[ c(\hat{\beta}) \approx c(\beta) + \nabla_\beta c(\beta) (\hat{\beta} - \beta) = \nabla_\beta c(\beta) (\hat{\beta} - \beta), \]

where $\nabla_\beta c(\beta)$ is the $J \times K$ matrix of derivatives of the vector function $c(\beta)$ with respect to the row vector $\beta'$ (each row of $\nabla_\beta c(\beta)$ gives the derivatives of an element of $c(\beta)$ with respect to $\beta$). Since $\hat{\beta} - \beta \overset{approx}{\sim} N(0, \sigma^2 (X'X)^{-1})$, then

\[ c(\hat{\beta}) \overset{approx}{\sim} N\left( 0_J, \; \sigma^2 \, \nabla_\beta c(\beta) (X'X)^{-1} (\nabla_\beta c(\beta))' \right). \]

Since $\hat{\beta}$ goes to $\beta$ asymptotically, we can replace $\nabla_\beta c(\beta)$ with $\nabla_\beta c(\hat{\beta})$:

\[ c(\hat{\beta}) \overset{approx}{\sim} N\left( 0_J, \; \sigma^2 \, \nabla_\beta c(\hat{\beta}) (X'X)^{-1} (\nabla_\beta c(\hat{\beta}))' \right). \]
Now, we use this information to create the Wald statistic. Premultiplying the sample value of the hypothesis by the minus-one-half matrix of its variance gives the Wald vector, distributed as a vector of standard normals:

\[ T_{Wv} = \frac{1}{\sigma} \left( \nabla_\beta c(\hat{\beta}) (X'X)^{-1} (\nabla_\beta c(\hat{\beta}))' \right)^{-1/2} c(\hat{\beta}) \overset{approx}{\sim} N(0_J, I_J). \]

Finally, we take the inner product of this to create the Wald statistic:

\[ T_W = \frac{1}{\sigma^2} c(\hat{\beta})' \left( \nabla_\beta c(\hat{\beta}) (X'X)^{-1} (\nabla_\beta c(\hat{\beta}))' \right)^{-1} c(\hat{\beta}) \overset{approx}{\sim} \chi^2_J. \]

Since this is an approximate asymptotic result, it also works with $s$ instead of $\sigma$:

\[ T_W^s = \frac{1}{s^2} c(\hat{\beta})' \left( \nabla_\beta c(\hat{\beta}) (X'X)^{-1} (\nabla_\beta c(\hat{\beta}))' \right)^{-1} c(\hat{\beta}) \overset{approx}{\sim} \chi^2_J. \]

4 Testing the exogeneity of regressors: the Hausman test

Suppose we have the model $Y = X\beta + \varepsilon$, $E[\varepsilon\varepsilon'] = \sigma^2 I$. This model can be estimated by instrumental variables using the 2SLS method. However, if the regressors $X$ are exogenous, then the 2SLS estimator is less efficient (i.e. has larger variance) than OLS. Therefore, it is important to test for exogeneity first, in order to avoid using an IV estimator that is (i) more computationally intensive (two stages are more difficult than one) and (ii) less efficient. An exogeneity test for the regressors $X$ could be written as

\[ H_0: E[X'\varepsilon] = 0_K. \]

Instrumental variable estimation requires finding an instrument matrix $Z$ such that $E(Z'X) \neq 0$ and $E(Z'\varepsilon) = 0$, where $\mathrm{rank}(Z) = J > K$. Let $\hat{\beta}_{OLS}$ be the ordinary least squares estimator and $\hat{\beta}_{2SLS}$ the two-stage least squares instrumental variables estimator of $\beta$:

\[ \hat{\beta}_{OLS} = (X'X)^{-1}X'Y, \quad \hat{\beta}_{2SLS} = (X'P_Z X)^{-1} X'P_Z Y, \]
where $P_Z = Z(Z'Z)^{-1}Z'$. If the regressors $X$ are endogenous, then the OLS estimates should differ from the endogeneity-corrected 2SLS estimates (as long as the instruments are exogenous). An exogeneity test can therefore be based on the difference between $\hat{\beta}_{2SLS}$ and $\hat{\beta}_{OLS}$. The test of this hypothesis is called a Hausman test. By the central limit theorem, we have, under $H_0$,

\[ \hat{\beta}_{OLS} - \beta \overset{approx}{\sim} N\left( 0, \sigma^2 (X'X)^{-1} \right), \quad \hat{\beta}_{2SLS} - \beta \overset{approx}{\sim} N\left( 0, \sigma^2 (X'P_Z X)^{-1} \right). \]

Hence,

\[ \hat{\beta}_{2SLS} - \hat{\beta}_{OLS} \overset{approx}{\sim} N\left( 0, \; \sigma^2 \left[ (X'P_Z X)^{-1} - (X'X)^{-1} \right] \right). \]

Now, we use this information to create a Wald statistic. Premultiplying the difference of the two estimators by the minus-one-half matrix of its variance gives the Wald vector, distributed as a vector of standard normals:

\[ T_w = \frac{1}{\sigma} \left[ (X'P_Z X)^{-1} - (X'X)^{-1} \right]^{-1/2} (\hat{\beta}_{2SLS} - \hat{\beta}_{OLS}) \overset{approx}{\sim} N(0, I_K). \]

Hence its inner product is asymptotically approximately a chi-square:

\[ T_w' T_w = \frac{1}{\sigma^2} (\hat{\beta}_{2SLS} - \hat{\beta}_{OLS})' \left[ (X'P_Z X)^{-1} - (X'X)^{-1} \right]^{-1} (\hat{\beta}_{2SLS} - \hat{\beta}_{OLS}) \overset{approx}{\sim} \chi^2(K). \]

Since this is an approximate asymptotic result, it also works with $s^2$ instead of $\sigma^2$. So the Hausman-Wald test statistic for the exogeneity of regressors is

\[ H_w = \frac{1}{s^2} (\hat{\beta}_{2SLS} - \hat{\beta}_{OLS})' \left[ (X'P_Z X)^{-1} - (X'X)^{-1} \right]^{-1} (\hat{\beta}_{2SLS} - \hat{\beta}_{OLS}) \overset{approx}{\sim} \chi^2(K). \]

Note: we are assuming here that the matrix $V = (X'P_Z X)^{-1} - (X'X)^{-1}$ is positive definite. The more general approach is to take a generalized inverse (instead of the inverse) in the formula for $H_w$. In that case the statistic $H_w$ has an asymptotic approximate chi-squared distribution with degrees of freedom equal to the rank of the matrix $V$.

5 Testing the validity of instruments: Overidentification test

This is a test that will tell you whether the instruments $Z$ are uncorrelated with the error term $\varepsilon$, an essential condition for the validity of instrumental variables. When this condition is not satisfied, the IV estimator could be inconsistent.
The degree of overidentification of an overidentified linear regression model is defined to be $J - K$, where $J = \mathrm{rank}(Z)$ is the number of instruments and $K = \mathrm{rank}(X)$ the number of regressors. The model we wish to test is

\[ Y = X\beta + \varepsilon, \quad E[\varepsilon\varepsilon'] = \sigma^2 I, \quad H_0: E[Z'\varepsilon] = 0. \]

Denote by $e = Y - X\hat{\beta}_{2SLS}$ the residual of the 2SLS estimation of the model. The test statistic is based on the vector

\[ \frac{1}{\sigma} P_Z e = \left[ P_Z - P_Z X (X'P_Z X)^{-1} X'P_Z \right] \frac{\varepsilon}{\sigma}, \]

which is the projection of the residual vector $e$ onto the vector space of the instruments $Z$. Orthogonality between $\varepsilon$ and $Z$ therefore implies that $P_Z e$ should be close to 0. An appropriate test statistic can therefore be based on the squared Euclidean distance between $P_Z e$ and 0, that is, $(P_Z e)'(P_Z e) = e'P_Z e$. The rank of $P_Z$ is $J$ and the rank of $P_Z X (X'P_Z X)^{-1} X'P_Z$ is $K$, so the rank of the (idempotent, symmetric) matrix $P_Z - P_Z X (X'P_Z X)^{-1} X'P_Z$ is $J - K$. Since $\varepsilon/\sigma$ has mean 0 and variance matrix $I$,

\[ \frac{1}{\sigma^2} e'P_Z e = \left( \frac{1}{\sigma} P_Z e \right)' \left( \frac{1}{\sigma} P_Z e \right) \]

is the sum of squares of $J - K$ things that have mean 0 and variance 1. By the central limit theorem, each of these things is approximately asymptotically normal $N(0, 1)$. Hence $\frac{1}{\sigma^2} e'P_Z e$ is approximately asymptotically chi-square with $J - K$ degrees of freedom. Since this is an approximate asymptotic result, it also works with $\hat{\sigma}^2 = e'e/N$, a consistent estimator of $\sigma^2$, in place of $\sigma^2$. So the test statistic for the validity of instruments is given by

\[ Q = \frac{1}{\hat{\sigma}^2} e'P_Z e \overset{approx}{\sim} \chi^2(J - K). \]

The same test statistic can be obtained by considering the synthetic regression $e = Z\gamma + u$. The estimate of $\gamma$ is $\hat{\gamma} = (Z'Z)^{-1}Z'e$, and the predicted value of $e$ from the synthetic regression is $\hat{e} = Z\hat{\gamma} = Z(Z'Z)^{-1}Z'e$. The sum of squares of this
value (the explained sum of squares) is by definition

\[ ESS = \hat{e}'\hat{e} = e'Z(Z'Z)^{-1}Z'Z(Z'Z)^{-1}Z'e = e'Z(Z'Z)^{-1}Z'e = e'P_Z e, \]

while the total sum of squares is $TSS = e'e$. By definition, the $R^2$ of this synthetic model is

\[ R^2 = \frac{ESS}{TSS} = \frac{e'P_Z e}{e'e}. \]

So

\[ Q = \frac{e'P_Z e}{\hat{\sigma}^2} = N \frac{e'P_Z e}{e'e} = N R^2, \]

where $R^2$ is from the synthetic regression of the 2SLS residuals $e$ on all the exogenous variables (the instruments) in the model.

6 Testing heteroskedasticity: the White test

The model we wish to test is

\[ Y = X\beta + \varepsilon, \quad E[X'\varepsilon] = 0_K, \quad H_0: E[\varepsilon\varepsilon'] = \sigma^2 I. \]

The alternative hypothesis of the White test is that the error variance is affected by any of the regressors or their squares (and possibly their cross products). It therefore tests whether or not any heteroskedasticity present causes the variance matrix of the OLS estimator to differ from its usual formula. The steps to build the test are as follows:

1. Run the OLS regression and take the vector of residuals $e = Y - X\hat{\beta}$.

2. Run the synthetic regression of the squared residuals on the regressors and their squares,

\[ e^2 = X\gamma + X^2\theta + u, \]

where $e^2$ denotes the vector of squared residuals and $X^2$ is the $N \times (K-1)$ matrix of the squares of the regressors (excluding the column of constants). Now, take the coefficient of determination of this synthetic regression, $R^2$.

3. The White test statistic is defined by $W = N R^2$ and is approximately asymptotically distributed as a chi-square with degrees of freedom equal to the number of non-constant regressors in the synthetic regression.
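The three steps of the White test can be sketched as follows. This is a minimal Python illustration (the data-generating process, sample size and names are made up for the example, and the errors are homoskedastic, so $H_0$ holds by construction):

```python
import numpy as np

# Sketch of the White test: OLS residuals, then a synthetic regression of
# the squared residuals on the regressors and their squares, then W = N*R^2.
rng = np.random.default_rng(1)
N = 400
x = rng.normal(size=(N, 2))                     # two non-constant regressors
X = np.column_stack([np.ones(N), x])            # K = 3 including the constant
eps = rng.normal(size=N)                        # homoskedastic errors: H0 true
Y = X @ np.array([1.0, 0.5, -0.5]) + eps

# Step 1: OLS residuals
b_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b_hat

# Step 2: synthetic regression of e^2 on the regressors and their squares
Z = np.column_stack([np.ones(N), x, x**2])
g_hat = np.linalg.lstsq(Z, e**2, rcond=None)[0]
u = e**2 - Z @ g_hat
R2 = 1.0 - (u @ u) / np.sum((e**2 - np.mean(e**2))**2)

# Step 3: W = N * R^2, compared with a chi-square critical value whose
# degrees of freedom equal the non-constant columns of Z (4 here)
W = N * R2
print(W)
```

Rejecting when $W$ exceeds the chi-square critical value signals heteroskedasticity related to the regressors or their squares.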
MA 4 LINEAR ALGEBRA C, Solutions to Second Midterm Exam Prof. Nikola Popovic, November 9, 6, 9:3am  :5am Problem (5 points). Let the matrix A be given by 5 6 5 4 5 (a) Find the inverse A of A, if it exists.
More informationDiagonal, Symmetric and Triangular Matrices
Contents 1 Diagonal, Symmetric Triangular Matrices 2 Diagonal Matrices 2.1 Products, Powers Inverses of Diagonal Matrices 2.1.1 Theorem (Powers of Matrices) 2.2 Multiplying Matrices on the Left Right by
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +
More informationOnline Appendices to the Corporate Propensity to Save
Online Appendices to the Corporate Propensity to Save Appendix A: Monte Carlo Experiments In order to allay skepticism of empirical results that have been produced by unusual estimators on fairly small
More informationChapter 4: Vector Autoregressive Models
Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...
More informationDETERMINANTS. b 2. x 2
DETERMINANTS 1 Systems of two equations in two unknowns A system of two equations in two unknowns has the form a 11 x 1 + a 12 x 2 = b 1 a 21 x 1 + a 22 x 2 = b 2 This can be written more concisely in
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationLinear Dependence Tests
Linear Dependence Tests The book omits a few key tests for checking the linear dependence of vectors. These short notes discuss these tests, as well as the reasoning behind them. Our first test checks
More informationMAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A =
MAT 200, Midterm Exam Solution. (0 points total) a. (5 points) Compute the determinant of the matrix 2 2 0 A = 0 3 0 3 0 Answer: det A = 3. The most efficient way is to develop the determinant along the
More information17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function
17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function, : V V R, which is symmetric, that is u, v = v, u. bilinear, that is linear (in both factors):
More information7 Hypothesis testing  one sample tests
7 Hypothesis testing  one sample tests 7.1 Introduction Definition 7.1 A hypothesis is a statement about a population parameter. Example A hypothesis might be that the mean age of students taking MAS113X
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationHeteroskedasticity and Weighted Least Squares
Econ 507. Econometric Analysis. Spring 2009 April 14, 2009 The Classical Linear Model: 1 Linearity: Y = Xβ + u. 2 Strict exogeneity: E(u) = 0 3 No Multicollinearity: ρ(x) = K. 4 No heteroskedasticity/
More informationEconometrics Simple Linear Regression
Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight
More informationIntroduction to Matrix Algebra I
Appendix A Introduction to Matrix Algebra I Today we will begin the course with a discussion of matrix algebra. Why are we studying this? We will use matrix algebra to derive the linear regression model
More informationStatistics 112 Regression Cheatsheet Section 1B  Ryan Rosario
Statistics 112 Regression Cheatsheet Section 1B  Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything
More informationThese axioms must hold for all vectors ū, v, and w in V and all scalars c and d.
DEFINITION: A vector space is a nonempty set V of objects, called vectors, on which are defined two operations, called addition and multiplication by scalars (real numbers), subject to the following axioms
More information2.1: MATRIX OPERATIONS
.: MATRIX OPERATIONS What are diagonal entries and the main diagonal of a matrix? What is a diagonal matrix? When are matrices equal? Scalar Multiplication 45 Matrix Addition Theorem (pg 0) Let A, B, and
More informationThe Method of Lagrange Multipliers
The Method of Lagrange Multipliers S. Sawyer October 25, 2002 1. Lagrange s Theorem. Suppose that we want to maximize (or imize a function of n variables f(x = f(x 1, x 2,..., x n for x = (x 1, x 2,...,
More informationSolutions to Math 51 First Exam January 29, 2015
Solutions to Math 5 First Exam January 29, 25. ( points) (a) Complete the following sentence: A set of vectors {v,..., v k } is defined to be linearly dependent if (2 points) there exist c,... c k R, not
More informationLecture 14: Section 3.3
Lecture 14: Section 3.3 Shuanglin Shao October 23, 2013 Definition. Two nonzero vectors u and v in R n are said to be orthogonal (or perpendicular) if u v = 0. We will also agree that the zero vector in
More informationOutline. Correlation & Regression, III. Review. Relationship between r and regression
Outline Correlation & Regression, III 9.07 4/6/004 Relationship between correlation and regression, along with notes on the correlation coefficient Effect size, and the meaning of r Other kinds of correlation
More informationLinear Algebra Notes
Linear Algebra Notes Chapter 19 KERNEL AND IMAGE OF A MATRIX Take an n m matrix a 11 a 12 a 1m a 21 a 22 a 2m a n1 a n2 a nm and think of it as a function A : R m R n The kernel of A is defined as Note
More informationUCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates
UCLA STAT 13 Statistical Methods  Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally
More informationScientific Computing: An Introductory Survey
Scientific Computing: An Introductory Survey Chapter 3 Linear Least Squares Prof. Michael T. Heath Department of Computer Science University of Illinois at UrbanaChampaign Copyright c 2002. Reproduction
More informationCITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION
No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August
More informationSystems of Linear Equations
Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and
More informationExamination 110 Probability and Statistics Examination
Examination 0 Probability and Statistics Examination Sample Examination Questions The Probability and Statistics Examination consists of 5 multiplechoice test questions. The test is a threehour examination
More informationHow to Conduct a Hypothesis Test
How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some
More informationSYSTEMS OF EQUATIONS
SYSTEMS OF EQUATIONS 1. Examples of systems of equations Here are some examples of systems of equations. Each system has a number of equations and a number (not necessarily the same) of variables for which
More informationSubspaces of R n LECTURE 7. 1. Subspaces
LECTURE 7 Subspaces of R n Subspaces Definition 7 A subset W of R n is said to be closed under vector addition if for all u, v W, u + v is also in W If rv is in W for all vectors v W and all scalars r
More informationTesting for serial correlation in linear paneldata models
The Stata Journal (2003) 3, Number 2, pp. 168 177 Testing for serial correlation in linear paneldata models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear paneldata
More informationTypes of Specification Errors 1. Omitted Variables. 2. Including an irrelevant variable. 3. Incorrect functional form. 2
Notes on Model Specification To go with Gujarati, Basic Econometrics, Chapter 13 Copyright  Jonathan Nagler; April 4, 1999 Attributes of a Good Econometric Model A.C. Harvey: (in Gujarati, p. 4534) ffl
More informationLinear Algebra Notes for Marsden and Tromba Vector Calculus
Linear Algebra Notes for Marsden and Tromba Vector Calculus ndimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of
More informationModule 5 Hypotheses Tests: Comparing Two Groups
Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this
More informationMATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix.
MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. Nullspace Let A = (a ij ) be an m n matrix. Definition. The nullspace of the matrix A, denoted N(A), is the set of all ndimensional column
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationClustering in the Linear Model
Short Guides to Microeconometrics Fall 2014 Kurt Schmidheiny Universität Basel Clustering in the Linear Model 2 1 Introduction Clustering in the Linear Model This handout extends the handout on The Multiple
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measuresoffit in multiple regression Assumptions
More information