# Hypothesis Testing in the Classical Regression Model


## The Normal Distribution and the Sampling Distributions

It is often appropriate to assume that the elements of the disturbance vector $\varepsilon$ within the regression equation $y = X\beta + \varepsilon$ are distributed independently and identically according to a normal law. Under this assumption, the sampling distributions of the estimates may be derived and various hypotheses relating to the underlying parameters may be tested.

To denote that $x$ is a normally distributed random variable with a mean of $E(x) = \mu$ and a dispersion matrix of $D(x) = \Sigma$, we shall write $x \sim N(\mu, \Sigma)$. A vector $z \sim N(0, I)$ with a mean of zero and a dispersion matrix of $D(z) = I$ is described as a standard normal vector. Any normal vector $x \sim N(\mu, \Sigma)$ can be standardised:

(1) If $T$ is a transformation such that $T\Sigma T' = I$ and $T'T = \Sigma^{-1}$, then $T(x - \mu) \sim N(0, I)$.

Associated with the normal distribution are a variety of so-called sampling distributions, which occur frequently in problems of statistical inference. Amongst these are the chi-square distribution, the $F$ distribution and the $t$ distribution.

If $z \sim N(0, I)$ is a standard normal vector of $n$ elements, then the sum of squares of its elements has a chi-square distribution of $n$ degrees of freedom; and this is denoted by $z'z \sim \chi^2(n)$. With the help of the standardising transformation, it can be shown that

(2) If $x \sim N(\mu, \Sigma)$ is a vector of order $n$, then $(x - \mu)'\Sigma^{-1}(x - \mu) \sim \chi^2(n)$.

The sum of any two independent chi-square variates is itself a chi-square variate whose degrees of freedom equal the sum of the degrees of freedom of its constituents. Thus,

(3) If $u \sim \chi^2(m)$ and $v \sim \chi^2(n)$ are independent chi-square variates of $m$ and $n$ degrees of freedom respectively, then $(u + v) \sim \chi^2(m + n)$ is a chi-square variate of $m + n$ degrees of freedom.
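As a minimal numerical sketch of the standardising transformation under (1), the following constructs such a $T$ from the eigendecomposition of a hypothetical dispersion matrix `Sigma` (numpy assumed; the matrix is simulated for illustration) and confirms the two defining properties:

```python
import numpy as np

# Hypothetical positive-definite dispersion matrix, simulated for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Sigma = A @ A.T + 3 * np.eye(3)

# Build T from the eigendecomposition Sigma = V diag(lam) V', so that
# T = diag(lam^{-1/2}) V' satisfies T Sigma T' = I and T'T = Sigma^{-1}.
lam, V = np.linalg.eigh(Sigma)
T = np.diag(lam ** -0.5) @ V.T

print(np.allclose(T @ Sigma @ T.T, np.eye(3)))     # T Sigma T' = I
print(np.allclose(T.T @ T, np.linalg.inv(Sigma)))  # T'T = Sigma^{-1}
```

Any factorisation with these two properties serves; the eigendecomposition is used here only because it is convenient.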

D.S.G. POLLOCK: ECONOMETRICS

The ratio of two independent chi-square variates, each divided by its respective degrees of freedom, has an $F$ distribution which is completely characterised by these degrees of freedom. Thus,

(4) If $u \sim \chi^2(m)$ and $v \sim \chi^2(n)$ are independent chi-square variates, then the variate $F = (u/m)/(v/n)$ has an $F$ distribution of $m$ and $n$ degrees of freedom; and this is denoted by writing $F \sim F(m, n)$.

The sampling distribution which is most frequently used is the $t$ distribution. A $t$ variate is the ratio of a standard normal variate and the square root of an independent chi-square variate divided by its degrees of freedom. Thus,

(5) If $z \sim N(0, 1)$ and $v \sim \chi^2(n)$ are independent variates, then $t = z/\sqrt{v/n}$ has a $t$ distribution of $n$ degrees of freedom; and this is denoted by writing $t \sim t(n)$.

It is clear that $t^2 \sim F(1, n)$.

## Hypotheses Concerning the Coefficients

A linear function of a normally distributed vector is itself normally distributed. The ordinary least-squares estimate $\hat\beta = (X'X)^{-1}X'y$ of the parameter vector $\beta$ in the regression model $N(y; X\beta, \sigma^2 I)$ is a linear function of $y$, which has an expected value of $E(\hat\beta) = \beta$ and a dispersion matrix of $D(\hat\beta) = \sigma^2(X'X)^{-1}$. Thus, it follows that, if $y \sim N(X\beta, \sigma^2 I)$ is normally distributed, then

$$(6)\quad \hat\beta \sim N_k\{\beta, \sigma^2(X'X)^{-1}\}.$$

Likewise, the marginal distributions of $\hat\beta_1$, $\hat\beta_2$ within $\hat\beta = [\hat\beta_1', \hat\beta_2']'$ are given by

$$(7)\quad \hat\beta_1 \sim N_{k_1}\big(\beta_1,\ \sigma^2\{X_1'(I - P_2)X_1\}^{-1}\big),$$
$$(8)\quad \hat\beta_2 \sim N_{k_2}\big(\beta_2,\ \sigma^2\{X_2'(I - P_1)X_2\}^{-1}\big).$$

From the results under (2) to (6), it follows that

$$(9)\quad \sigma^{-2}(\hat\beta - \beta)'X'X(\hat\beta - \beta) \sim \chi^2(k).$$

Similarly, it follows from (7) and (8) that

$$(10)\quad \sigma^{-2}(\hat\beta_1 - \beta_1)'X_1'(I - P_2)X_1(\hat\beta_1 - \beta_1) \sim \chi^2(k_1),$$
$$(11)\quad \sigma^{-2}(\hat\beta_2 - \beta_2)'X_2'(I - P_1)X_2(\hat\beta_2 - \beta_2) \sim \chi^2(k_2).$$

The distribution of the residual vector $e = y - X\hat\beta$ is degenerate, in the sense that the mapping $e = (I - P)\varepsilon$ from the disturbance vector $\varepsilon$ to the residual

vector $e$ entails a singular transformation. Nevertheless, it is possible to obtain a factorisation of the transformation in the form of $I - P = CC'$, where $C$ is a matrix of order $T \times (T - k)$ comprising $T - k$ orthonormal columns, which are orthogonal to the columns of $X$ such that $C'X = 0$.

*Figure 1. The critical region, at the 10% significance level, of an $F(5, 60)$ statistic.*

Since $C'C = I_{T-k}$, it follows that, on premultiplying $y \sim N_T(X\beta, \sigma^2 I)$ by $C'$, we get $C'y \sim N_{T-k}(0, \sigma^2 I)$. Hence

$$(12)\quad \sigma^{-2}y'CC'y = \sigma^{-2}(y - X\hat\beta)'(y - X\hat\beta) \sim \chi^2(T - k).$$

The vectors $X\hat\beta = Py$ and $y - X\hat\beta = (I - P)y$ have a zero-valued covariance matrix. If two normally distributed random vectors have a zero covariance matrix, then they are statistically independent. Therefore, it follows that

$$(13)\quad \sigma^{-2}(\hat\beta - \beta)'X'X(\hat\beta - \beta) \sim \chi^2(k) \quad\text{and}\quad \sigma^{-2}(y - X\hat\beta)'(y - X\hat\beta) \sim \chi^2(T - k)$$

are mutually independent chi-square variates. From this, it can be deduced that

$$(14)\quad F = \left\{\frac{(\hat\beta - \beta)'X'X(\hat\beta - \beta)}{k}\right\}\bigg/\left\{\frac{(y - X\hat\beta)'(y - X\hat\beta)}{T - k}\right\} = \frac{1}{k\hat\sigma^2}(\hat\beta - \beta)'X'X(\hat\beta - \beta) \sim F(k, T - k).$$

To test an hypothesis specifying that $\beta = \beta^\star$, the hypothesised parameter vector can be inserted in the above statistic and the resulting value can be compared

with the critical values of an $F$ distribution of $k$ and $T - k$ degrees of freedom. If a critical value is exceeded, then the hypothesis is liable to be rejected. The test is readily intelligible, since it is based on a measure of the distance between the hypothesised value $X\beta^\star$ of the systematic component of the regression and the value $X\hat\beta$ that is suggested by the data. If the two values are remote from each other, then we may suspect that the hypothesis is at fault.

It is usual to suppose that a subset of the elements of the parameter vector $\beta$ are zeros. This represents an instance of a class of hypotheses that specify values for a subvector $\beta_2$ within the partitioned model $y = X_1\beta_1 + X_2\beta_2 + \varepsilon$, without asserting anything about the values of the remaining elements in the subvector $\beta_1$. The appropriate statistic for testing the hypothesis that $\beta_2 = \beta_2^\star$ is

$$(15)\quad F = \frac{1}{k_2\hat\sigma^2}(\hat\beta_2 - \beta_2^\star)'X_2'(I - P_1)X_2(\hat\beta_2 - \beta_2^\star).$$

This will have an $F(k_2, T - k)$ distribution if the hypothesis is true.

A limiting case of the $F$ statistic concerns the test of an hypothesis affecting a single element $\beta_i$ within the vector $\beta$. By specialising the expression under (15), a statistic may be derived in the form of

$$(16)\quad F = \frac{(\hat\beta_i - \beta_i^\star)^2}{\hat\sigma^2 w_{ii}},$$

wherein $w_{ii}$ stands for the $i$th diagonal element of $(X'X)^{-1}$. If the hypothesis is true, then this will be distributed according to the $F(1, T - k)$ law. However, the usual way of assessing such an hypothesis is to relate the value of the statistic

$$(17)\quad t = \frac{\hat\beta_i - \beta_i^\star}{\sqrt{\hat\sigma^2 w_{ii}}}$$

to the tables of the $t(T - k)$ distribution. The advantage of the $t$ statistic is that it shows the direction in which the estimate of $\beta_i$ deviates from the hypothesised value as well as the size of the deviation.

## Cochran's Theorem and the Decomposition of a Chi-Square Variate

The standard test of an hypothesis regarding the vector $\beta$ in the model $N(y; X\beta, \sigma^2 I)$ entails a multi-dimensional version of Pythagoras' Theorem.
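Before proceeding, the relation between the statistics of (16) and (17) can be checked numerically. The sketch below (numpy assumed; the data and the coefficient index are simulated and hypothetical) confirms that the square of the $t$ statistic reproduces the $F$ form:

```python
import numpy as np

# Simulated regression data; the hypothesis tested is beta_i = 0 for i = 1.
rng = np.random.default_rng(3)
T_, k = 60, 3
X = rng.standard_normal((T_, k))
y = X @ np.array([0.8, 0.0, -1.2]) + rng.standard_normal(T_)

XtX_inv = np.linalg.inv(X.T @ X)
bhat = XtX_inv @ X.T @ y
e = y - X @ bhat
s2hat = (e @ e) / (T_ - k)          # estimate of sigma^2, as in (22)

i = 1
w_ii = XtX_inv[i, i]                # i-th diagonal element of (X'X)^{-1}
t = bhat[i] / np.sqrt(s2hat * w_ii) # statistic of (17) with beta_i* = 0
F = bhat[i] ** 2 / (s2hat * w_ii)   # statistic of (16)
print(np.isclose(t ** 2, F))
```

The $t$ form is preferred in practice precisely because, unlike $F = t^2$, it preserves the sign of the deviation.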
Consider the decomposition of the vector $y$ into the systematic component and the residual vector. This gives

$$(18)\quad y = X\hat\beta + (y - X\hat\beta) \quad\text{and}\quad y - X\beta = (X\hat\beta - X\beta) + (y - X\hat\beta),$$

where the second equation comes from subtracting the unknown mean vector $X\beta$ from both sides of the first.

*Figure 2. The vector $Py = X\hat\beta$ is formed by the orthogonal projection of the vector $y$ onto the subspace spanned by the columns of the matrix $X$.*

These equations can also be expressed in terms of the projector $P = X(X'X)^{-1}X'$, which gives $Py = X\hat\beta$ and $(I - P)y = y - X\hat\beta = e$. Also, the definition $\varepsilon = y - X\beta$ can be used within the second of the equations. Thus,

$$(19)\quad y = Py + (I - P)y \quad\text{and}\quad \varepsilon = P\varepsilon + (I - P)\varepsilon.$$

The reason for adopting this notation is that it enables us to envisage more clearly the Pythagorean relationship between the vectors. Thus, from the fact that $P = P' = P^2$ and that $P(I - P) = 0$, it can be established that

$$(20)\quad \varepsilon'\varepsilon = \varepsilon'P\varepsilon + \varepsilon'(I - P)\varepsilon \quad\text{or, equivalently,}\quad \varepsilon'\varepsilon = (X\hat\beta - X\beta)'(X\hat\beta - X\beta) + (y - X\hat\beta)'(y - X\hat\beta).$$

The terms in these expressions represent squared lengths; and the vectors themselves form the sides of a right-angled triangle, with $P\varepsilon$ as the base, $(I - P)\varepsilon$ as the vertical side and $\varepsilon$ as the hypotenuse. These relationships are represented by Figure 2, where $\gamma = X\beta$ and where $\varepsilon = y - \gamma$.

The usual test of an hypothesis regarding the elements of the vector $\beta$ is based on the foregoing relationships. Imagine that the hypothesis postulates

that the true value of the parameter vector is $\beta^\star$. To test this proposition, the value of $X\beta^\star$ is compared with the estimated mean vector $X\hat\beta$. The test is a matter of assessing the proximity of the two vectors, which is measured by the square of the distance that separates them. This would be given by

$$(21)\quad \varepsilon'P\varepsilon = (X\hat\beta - X\beta^\star)'(X\hat\beta - X\beta^\star).$$

If the hypothesis is untrue and if $X\beta^\star$ is remote from the true value of $X\beta$, then the distance is liable to be excessive.

The distance can only be assessed in comparison with the variance $\sigma^2$ of the disturbance term, or with an estimate thereof. Usually, one has to make do with the estimate of $\sigma^2$ which is provided by

$$(22)\quad \hat\sigma^2 = \frac{(y - X\hat\beta)'(y - X\hat\beta)}{T - k} = \frac{\varepsilon'(I - P)\varepsilon}{T - k}.$$

The numerator of this estimate is simply the squared length of the vector $e = (I - P)y = (I - P)\varepsilon$, which constitutes the vertical side of the right-angled triangle.

Simple arguments, which have been given in the previous section, serve to demonstrate that

$$(23)\quad \text{(a)}\quad \sigma^{-2}\varepsilon'\varepsilon = \sigma^{-2}(y - X\beta)'(y - X\beta) \sim \chi^2(T),$$
$$\text{(b)}\quad \sigma^{-2}\varepsilon'P\varepsilon = \sigma^{-2}(\hat\beta - \beta)'X'X(\hat\beta - \beta) \sim \chi^2(k),$$
$$\text{(c)}\quad \sigma^{-2}\varepsilon'(I - P)\varepsilon = \sigma^{-2}(y - X\hat\beta)'(y - X\hat\beta) \sim \chi^2(T - k),$$

where (b) and (c) represent statistically independent random variables whose sum is the random variable of (a). These quadratic forms, divided by their respective degrees of freedom, find their way into the $F$ statistic of (14), which is

$$(24)\quad F = \left\{\frac{\varepsilon'P\varepsilon}{k}\right\}\bigg/\left\{\frac{\varepsilon'(I - P)\varepsilon}{T - k}\right\} \sim F(k, T - k).$$

This result depends upon Cochran's Theorem concerning the decomposition of a chi-square random variate. The following is a statement of the theorem, which is attuned to the present requirements:

(25) Let $\varepsilon \sim N(0, \sigma^2 I_T)$ be a random vector of $T$ independently and identically distributed elements. Also, let $P = X(X'X)^{-1}X'$ be a symmetric idempotent matrix, such that $P = P' = P^2$, which is constructed from a matrix $X$ of order $T \times k$ with $\mathrm{Rank}(X) = k$. Then $\sigma^{-2}\{\varepsilon'P\varepsilon + \varepsilon'(I - P)\varepsilon\} = \sigma^{-2}\varepsilon'\varepsilon \sim \chi^2(T)$, which is a chi-square variate of $T$ degrees of freedom, represents the sum of two independent chi-square variates $\sigma^{-2}\varepsilon'P\varepsilon \sim \chi^2(k)$ and $\sigma^{-2}\varepsilon'(I - P)\varepsilon \sim \chi^2(T - k)$ of $k$ and $T - k$ degrees of freedom respectively.

To prove this result, we begin by finding an alternative expression for the projector $P = X(X'X)^{-1}X'$. First, consider the fact that $X'X$ is a symmetric positive-definite matrix. It follows that there exists a matrix transformation $T$ such that $T(X'X)T' = I$ and $T'T = (X'X)^{-1}$. Therefore, $P = XT'TX' = C_1C_1'$, where $C_1 = XT'$ is a $T \times k$ matrix comprising $k$ orthonormal columns, such that $C_1'C_1 = I_k$ is the identity matrix of order $k$.

Now define $C_2$ to be a complementary matrix of $T - k$ orthonormal columns. Then $C = [C_1, C_2]$ is an orthonormal matrix of order $T$ such that

$$(26)\quad CC' = C_1C_1' + C_2C_2' = I_T \quad\text{and}\quad C'C = \begin{bmatrix} C_1'C_1 & C_1'C_2 \\ C_2'C_1 & C_2'C_2 \end{bmatrix} = \begin{bmatrix} I_k & 0 \\ 0 & I_{T-k} \end{bmatrix}.$$

The first of these results allows us to set $I - P = I - C_1C_1' = C_2C_2'$.

Now, if $\varepsilon \sim N(0, \sigma^2 I_T)$ and if $C$ is an orthonormal matrix such that $C'C = I_T$, then it follows that $C'\varepsilon \sim N(0, \sigma^2 I_T)$. In effect, if $\varepsilon$ is a normally distributed random vector with a density function which is centred on zero and which has spherical contours, and if $C$ is the matrix of a rotation, then nothing is altered by applying the rotation to the random vector. On partitioning $C'\varepsilon$, we find that

$$(27)\quad \begin{bmatrix} C_1'\varepsilon \\ C_2'\varepsilon \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix},\ \sigma^2\begin{bmatrix} I_k & 0 \\ 0 & I_{T-k} \end{bmatrix}\right),$$

which is to say that $C_1'\varepsilon \sim N(0, \sigma^2 I_k)$ and $C_2'\varepsilon \sim N(0, \sigma^2 I_{T-k})$ are independently distributed normal vectors. It follows that

$$(28)\quad \sigma^{-2}\varepsilon'C_1C_1'\varepsilon = \sigma^{-2}\varepsilon'P\varepsilon \sim \chi^2(k) \quad\text{and}\quad \sigma^{-2}\varepsilon'C_2C_2'\varepsilon = \sigma^{-2}\varepsilon'(I - P)\varepsilon \sim \chi^2(T - k)$$

are independent chi-square variates. Since $C_1C_1' + C_2C_2' = I_T$, the sum of these two variates is

$$(29)\quad \sigma^{-2}\varepsilon'C_1C_1'\varepsilon + \sigma^{-2}\varepsilon'C_2C_2'\varepsilon = \sigma^{-2}\varepsilon'\varepsilon \sim \chi^2(T);$$

and thus the theorem is proved.

The statistic under (14) can now be expressed in the form of

$$(30)\quad F = \left\{\frac{\varepsilon'P\varepsilon}{k}\right\}\bigg/\left\{\frac{\varepsilon'(I - P)\varepsilon}{T - k}\right\}.$$

This is manifestly the ratio of two chi-square variates divided by their respective degrees of freedom; and so it has an $F$ distribution with these degrees of freedom. This result provides the means for testing the hypothesis concerning the parameter vector $\beta$.

## Hypotheses Concerning Subsets of the Regression Coefficients

Consider a set of linear restrictions on the vector $\beta$ of a classical linear regression model $N(y; X\beta, \sigma^2 I)$, which take the form of

$$(31)\quad R\beta = r,$$

where $R$ is a matrix of order $j \times k$ and of rank $j$, which is to say that the $j$ restrictions are independent of each other and are fewer in number than the parameters within $\beta$. Given that the ordinary least-squares estimator of $\beta$ is a normally distributed vector $\hat\beta \sim N\{\beta, \sigma^2(X'X)^{-1}\}$, it follows that

$$(32)\quad R\hat\beta \sim N\{R\beta = r,\ \sigma^2 R(X'X)^{-1}R'\};$$

and, from this, it can be inferred immediately that

$$(33)\quad \sigma^{-2}(R\hat\beta - r)'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r) \sim \chi^2(j).$$

It has already been established that

$$(34)\quad \sigma^{-2}(T - k)\hat\sigma^2 = \sigma^{-2}(y - X\hat\beta)'(y - X\hat\beta) \sim \chi^2(T - k)$$

is a chi-square variate that is statistically independent of the chi-square variate

$$(35)\quad \sigma^{-2}(\hat\beta - \beta)'X'X(\hat\beta - \beta) \sim \chi^2(k)$$

derived from the estimator of the regression parameters. The variate of (33) must also be independent of the chi-square of (34); and it is straightforward to deduce that

$$(36)\quad F = \left\{\frac{(R\hat\beta - r)'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r)}{j}\right\}\bigg/\left\{\frac{(y - X\hat\beta)'(y - X\hat\beta)}{T - k}\right\} = \frac{(R\hat\beta - r)'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r)}{j\hat\sigma^2} \sim F(j, T - k),$$

which is to say that the ratio of the two independent chi-square variates, divided by their respective degrees of freedom, is an $F$ statistic. This statistic, which embodies only known and observable quantities, can be used in testing the validity of the hypothesised restrictions $R\beta = r$.

A specialisation of the statistic under (36) can also be used in testing an hypothesis concerning a subset of the elements of the vector $\beta$. Let $\beta' = [\beta_1', \beta_2']$. Then, the condition that the subvector $\beta_1$ assumes the value of $\beta_1^\star$ can be expressed via the equation

$$(37)\quad [I_{k_1}, 0]\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \beta_1^\star.$$

This can be construed as a case of the equation $R\beta = r$, where $R = [I_{k_1}, 0]$ and $r = \beta_1^\star$. In order to discover the specialised form of the requisite test statistic, let us consider the following partitioned form of an inverse matrix:

$$(38)\quad (X'X)^{-1} = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1} = \begin{bmatrix} \{X_1'(I - P_2)X_1\}^{-1} & -\{X_1'(I - P_2)X_1\}^{-1}X_1'X_2(X_2'X_2)^{-1} \\ -\{X_2'(I - P_1)X_2\}^{-1}X_2'X_1(X_1'X_1)^{-1} & \{X_2'(I - P_1)X_2\}^{-1} \end{bmatrix}.$$

Then, with $R = [I, 0]$, we find that

$$(39)\quad R(X'X)^{-1}R' = \{X_1'(I - P_2)X_1\}^{-1}.$$

It follows, in a straightforward manner, that the specialised form of the $F$ statistic of (36) is

$$(40)\quad F = \left\{\frac{(\hat\beta_1 - \beta_1^\star)'\{X_1'(I - P_2)X_1\}(\hat\beta_1 - \beta_1^\star)}{k_1}\right\}\bigg/\left\{\frac{(y - X\hat\beta)'(y - X\hat\beta)}{T - k}\right\} = \frac{(\hat\beta_1 - \beta_1^\star)'\{X_1'(I - P_2)X_1\}(\hat\beta_1 - \beta_1^\star)}{k_1\hat\sigma^2} \sim F(k_1, T - k).$$
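The reduction of the general statistic of (36) to the specialised form of (40) can be verified numerically. The sketch below (numpy assumed; all data simulated, with the hypothetical restriction $\beta_1 = 0$) computes the statistic both ways:

```python
import numpy as np

# Simulated partitioned regression y = X1 b1 + X2 b2 + e.
rng = np.random.default_rng(6)
T_, k1, k2 = 50, 2, 3
X1 = rng.standard_normal((T_, k1))
X2 = rng.standard_normal((T_, k2))
X = np.hstack([X1, X2])
y = X @ rng.standard_normal(k1 + k2) + rng.standard_normal(T_)

bhat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ bhat
s2hat = (e @ e) / (T_ - (k1 + k2))

# General form (36) with R = [I_{k1}, 0] and r = 0.
R = np.hstack([np.eye(k1), np.zeros((k1, k2))])
d = R @ bhat
F_general = d @ np.linalg.solve(R @ np.linalg.inv(X.T @ X) @ R.T, d) / (k1 * s2hat)

# Specialised form (40), built from X1'(I - P2)X1.
P2 = X2 @ np.linalg.solve(X2.T @ X2, X2.T)
b1 = bhat[:k1]
F_special = b1 @ (X1.T @ (np.eye(T_) - P2) @ X1) @ b1 / (k1 * s2hat)
print(np.isclose(F_general, F_special))
```

The agreement follows from (39): with $R = [I, 0]$, the matrix $R(X'X)^{-1}R'$ is exactly the leading diagonal block of the partitioned inverse.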

Finally, for the $j$th element of $\hat\beta$, there is

$$(41)\quad \frac{(\hat\beta_j - \beta_j)^2}{\hat\sigma^2 w_{jj}} \sim F(1, T - k) \quad\text{or, equivalently,}\quad \frac{\hat\beta_j - \beta_j}{\sqrt{\hat\sigma^2 w_{jj}}} \sim t(T - k),$$

where $w_{jj}$ is the $j$th diagonal element of $(X'X)^{-1}$ and $t(T - k)$ denotes the $t$ distribution of $T - k$ degrees of freedom.

## An Alternative Formulation of the F Statistic

An alternative way of forming the $F$ statistic uses the products of two separate regressions. Consider the formula for the restricted least-squares estimator that has been given under (2.76):

$$(42)\quad \beta^\star = \hat\beta - (X'X)^{-1}R'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r).$$

From this, the following expression for the residual vector of the restricted regression is derived:

$$(43)\quad y - X\beta^\star = (y - X\hat\beta) + X(X'X)^{-1}R'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r).$$

The two terms on the RHS are mutually orthogonal, on account of the defining condition of an ordinary least-squares regression, which is that $(y - X\hat\beta)'X = 0$. Therefore, the residual sum of squares of the restricted regression is

$$(44)\quad (y - X\beta^\star)'(y - X\beta^\star) = (y - X\hat\beta)'(y - X\hat\beta) + (R\hat\beta - r)'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r).$$

This equation can be rewritten as

$$(45)\quad RSS - USS = (R\hat\beta - r)'\{R(X'X)^{-1}R'\}^{-1}(R\hat\beta - r),$$

where $RSS$ denotes the restricted sum of squares and $USS$ denotes the unrestricted sum of squares. It follows that the test statistic of (36) can be written as

$$(46)\quad F = \left\{\frac{RSS - USS}{j}\right\}\bigg/\left\{\frac{USS}{T - k}\right\}.$$

This formulation can be used, for example, in testing the restriction that $\beta_1 = 0$ in the partitioned model $N(y; X_1\beta_1 + X_2\beta_2, \sigma^2 I)$. Then, in terms of equation (37), there is $R = [I_{k_1}, 0]$ and there is $r = \beta_1^\star = 0$, which gives

$$(47)\quad RSS - USS = \hat\beta_1'X_1'(I - P_2)X_1\hat\beta_1 = y'(I - P_2)X_1\{X_1'(I - P_2)X_1\}^{-1}X_1'(I - P_2)y.$$
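The identity of (45) can be checked by actually running the two regressions. In the sketch below (numpy assumed; simulated data), the restriction is the hypothetical $\beta_1 = 0$, so the restricted regression is simply a regression of $y$ on $X_2$ alone:

```python
import numpy as np

# Simulated partitioned regression y = X1 b1 + X2 b2 + e.
rng = np.random.default_rng(7)
T_, k1, k2 = 40, 2, 2
X1 = rng.standard_normal((T_, k1))
X2 = rng.standard_normal((T_, k2))
X = np.hstack([X1, X2])
y = X @ rng.standard_normal(k1 + k2) + rng.standard_normal(T_)

# Unrestricted regression on X; restricted regression (beta_1 = 0) on X2 alone.
bhat = np.linalg.solve(X.T @ X, X.T @ y)
USS = np.sum((y - X @ bhat) ** 2)
b2_r = np.linalg.solve(X2.T @ X2, X2.T @ y)
RSS = np.sum((y - X2 @ b2_r) ** 2)

# Quadratic form on the RHS of (45), with R = [I_{k1}, 0] and r = 0.
R = np.hstack([np.eye(k1), np.zeros((k1, k2))])
d = R @ bhat
quad = d @ np.linalg.solve(R @ np.linalg.inv(X.T @ X) @ R.T, d)
print(np.isclose(RSS - USS, quad))
```

Since the restricted model is nested in the unrestricted one, $RSS \geq USS$ always holds, and the difference is exactly the quadratic form of (45).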

*Figure 3. The test of the hypothesis entailed by the restricted model is based on a measure of the proximity of the restricted estimate $X\beta^\star$ and the unrestricted estimate $X\hat\beta$. The USS is the squared distance $\|y - X\hat\beta\|^2$. The RSS is the squared distance $\|y - X\beta^\star\|^2$.*

On the other hand, there is

$$(48)\quad RSS - USS = y'(I - P_2)y - y'(I - P)y = y'(P - P_2)y.$$

Since the two expressions must be identical for all values of $y$, the comparison of (47) and (48) is sufficient to establish the following identity:

$$(49)\quad (I - P_2)X_1\{X_1'(I - P_2)X_1\}^{-1}X_1'(I - P_2) = P - P_2.$$

The geometric interpretation of the alternative formulation of the test statistic is straightforward. It can be understood, in reference to Figure 3, that the square of the distance between the restricted estimate $X\beta^\star$ and the unrestricted estimate $X\hat\beta$, denoted by $\|X\hat\beta - X\beta^\star\|^2$, which is the basis of the original formulation of the test statistic, is equal to the restricted sum of squares $\|y - X\beta^\star\|^2$ less the unrestricted sum of squares $\|y - X\hat\beta\|^2$. The latter is the basis of the alternative formulation.

## The Partitioned Inverse and Associated Identities

The first objective is to derive the formula for the partitioned inverse of $X'X$ that has been given in equation (38). Write

$$(50)\quad \begin{bmatrix} A & B \\ B' & C \end{bmatrix} = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1}$$

and consider the equation

$$(51)\quad \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}\begin{bmatrix} A & B \\ B' & C \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}.$$

From this system, the following two equations can be extracted:

$$(52)\quad X_1'X_1A + X_1'X_2B' = I,$$
$$(53)\quad X_2'X_1A + X_2'X_2B' = 0.$$

To isolate $A$, equation (53) is premultiplied by $X_1'X_2(X_2'X_2)^{-1}$ to give

$$(54)\quad X_1'X_2(X_2'X_2)^{-1}X_2'X_1A + X_1'X_2B' = 0,$$

and this is taken from (52) to give

$$(55)\quad \{X_1'X_1 - X_1'X_2(X_2'X_2)^{-1}X_2'X_1\}A = I,$$

whence

$$(56)\quad A = \{X_1'(I - P_2)X_1\}^{-1} \quad\text{with}\quad P_2 = X_2(X_2'X_2)^{-1}X_2'.$$

An argument of symmetry will serve to show that

$$(57)\quad C = \{X_2'(I - P_1)X_2\}^{-1} \quad\text{with}\quad P_1 = X_1(X_1'X_1)^{-1}X_1'.$$

To find $B'$, and therefore $B$, the expression for $A$ from (56) is substituted into (53). This gives

$$(58)\quad X_2'X_1\{X_1'(I - P_2)X_1\}^{-1} + X_2'X_2B' = 0,$$

whence

$$(59)\quad B = -\{X_1'(I - P_2)X_1\}^{-1}X_1'X_2(X_2'X_2)^{-1}.$$

The matrix $B'$ is the transpose of this, but an argument of symmetry will serve to show that it is also given by the expression

$$(60)\quad B' = -\{X_2'(I - P_1)X_2\}^{-1}X_2'X_1(X_1'X_1)^{-1}.$$

When the expressions for $A$, $B$, $B'$ and $C$ are put in place, the result is

$$(61)\quad \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}^{-1} = \begin{bmatrix} A & B \\ B' & C \end{bmatrix} = \begin{bmatrix} \{X_1'(I - P_2)X_1\}^{-1} & -\{X_1'(I - P_2)X_1\}^{-1}X_1'X_2(X_2'X_2)^{-1} \\ -\{X_2'(I - P_1)X_2\}^{-1}X_2'X_1(X_1'X_1)^{-1} & \{X_2'(I - P_1)X_2\}^{-1} \end{bmatrix}.$$
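The partitioned-inverse formula of (61) can be confirmed numerically by comparing its blocks against a direct inversion of $X'X$. A minimal sketch, assuming numpy and simulated regressors:

```python
import numpy as np

# Simulated regressor blocks X1 (T x k1) and X2 (T x k2).
rng = np.random.default_rng(8)
T_, k1, k2 = 30, 2, 3
X1 = rng.standard_normal((T_, k1))
X2 = rng.standard_normal((T_, k2))
X = np.hstack([X1, X2])
G = np.linalg.inv(X.T @ X)          # direct inverse, to compare against

P1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
P2 = X2 @ np.linalg.solve(X2.T @ X2, X2.T)
I = np.eye(T_)

# Blocks A, B, C as given by the partitioned-inverse formula.
A = np.linalg.inv(X1.T @ (I - P2) @ X1)
C = np.linalg.inv(X2.T @ (I - P1) @ X2)
B = -A @ X1.T @ X2 @ np.linalg.inv(X2.T @ X2)

print(np.allclose(G[:k1, :k1], A))   # top-left block
print(np.allclose(G[k1:, k1:], C))   # bottom-right block
print(np.allclose(G[:k1, k1:], B))   # top-right block
```

Note the minus sign in the off-diagonal blocks: without it, the product of the two partitioned matrices fails to reproduce the identity.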

Next, consider

$$(62)\quad X(X'X)^{-1}X' = [X_1, X_2]\begin{bmatrix} A & B \\ B' & C \end{bmatrix}\begin{bmatrix} X_1' \\ X_2' \end{bmatrix} = \{X_1AX_1' + X_1BX_2'\} + \{X_2CX_2' + X_2B'X_1'\}.$$

Substituting the expressions for $A$, $B$, $C$ and $B'$ shows that

$$(63)\quad X(X'X)^{-1}X' = X_1\{X_1'(I - P_2)X_1\}^{-1}X_1'(I - P_2) + X_2\{X_2'(I - P_1)X_2\}^{-1}X_2'(I - P_1),$$

which can be written as

$$(64)\quad P = P_{1/2} + P_{2/1}$$

with

$$(65)\quad P_{1/2} = X_1\{X_1'(I - P_2)X_1\}^{-1}X_1'(I - P_2),$$
$$(66)\quad P_{2/1} = X_2\{X_2'(I - P_1)X_2\}^{-1}X_2'(I - P_1).$$

Next, observe that there are

$$(67)\quad P_{1/2}P_1 = P_1, \qquad P_{1/2}P_2 = 0,$$
$$(68)\quad P_{2/1}P_2 = P_2, \qquad P_{2/1}P_1 = 0.$$

It follows that

$$(69)\quad PP_1 = (P_{1/2} + P_{2/1})P_1 = P_1 = P_1P,$$

where the final equality follows in consequence of the symmetry of $P_1$ and $P$. It also follows, by an argument of symmetry, that

$$(70)\quad P(I - P_1) = P - P_1 = (I - P_1)P.$$

Therefore, since $(I - P_1)P = (I - P_1)P_{2/1}$, there is

$$(71)\quad P - P_1 = (I - P_1)X_2\{X_2'(I - P_1)X_2\}^{-1}X_2'(I - P_1).$$

By interchanging the two subscripts, the identity of (49) is derived.
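The decomposition $P = P_{1/2} + P_{2/1}$ of (64) and the identity of (49) can both be verified numerically. A minimal sketch, assuming numpy and simulated regressor blocks:

```python
import numpy as np

# Simulated regressor blocks X1 (T x k1) and X2 (T x k2).
rng = np.random.default_rng(9)
T_, k1, k2 = 25, 2, 3
X1 = rng.standard_normal((T_, k1))
X2 = rng.standard_normal((T_, k2))
X = np.hstack([X1, X2])
I = np.eye(T_)

P = X @ np.linalg.solve(X.T @ X, X.T)
P1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
P2 = X2 @ np.linalg.solve(X2.T @ X2, X2.T)

# The oblique projectors P_{1/2} and P_{2/1} of (65) and (66).
P12 = X1 @ np.linalg.solve(X1.T @ (I - P2) @ X1, X1.T @ (I - P2))
P21 = X2 @ np.linalg.solve(X2.T @ (I - P1) @ X2, X2.T @ (I - P1))
print(np.allclose(P, P12 + P21))          # decomposition (64)

# Identity (49): (I-P2) X1 {X1'(I-P2)X1}^{-1} X1' (I-P2) = P - P2.
lhs = (I - P2) @ X1 @ np.linalg.solve(X1.T @ (I - P2) @ X1, X1.T @ (I - P2))
print(np.allclose(lhs, P - P2))
```

Unlike $P$, $P_1$ and $P_2$, the constituents $P_{1/2}$ and $P_{2/1}$ are idempotent but not symmetric; they are oblique rather than orthogonal projectors.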


### DETERMINANTS. b 2. x 2

DETERMINANTS 1 Systems of two equations in two unknowns A system of two equations in two unknowns has the form a 11 x 1 + a 12 x 2 = b 1 a 21 x 1 + a 22 x 2 = b 2 This can be written more concisely in

### Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

### CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION

No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August

### Matrices and Polynomials

APPENDIX 9 Matrices and Polynomials he Multiplication of Polynomials Let α(z) =α 0 +α 1 z+α 2 z 2 + α p z p and y(z) =y 0 +y 1 z+y 2 z 2 + y n z n be two polynomials of degrees p and n respectively. hen,

### Statistiek (WISB361)

Statistiek (WISB361) Final exam June 29, 2015 Schrijf uw naam op elk in te leveren vel. Schrijf ook uw studentnummer op blad 1. The maximum number of points is 100. Points distribution: 23 20 20 20 17

### Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

### 4. MATRICES Matrices

4. MATRICES 170 4. Matrices 4.1. Definitions. Definition 4.1.1. A matrix is a rectangular array of numbers. A matrix with m rows and n columns is said to have dimension m n and may be represented as follows:

### T ( a i x i ) = a i T (x i ).

Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

### Adding vectors We can do arithmetic with vectors. We ll start with vector addition and related operations. Suppose you have two vectors

1 Chapter 13. VECTORS IN THREE DIMENSIONAL SPACE Let s begin with some names and notation for things: R is the set (collection) of real numbers. We write x R to mean that x is a real number. A real number

### Master s Theory Exam Spring 2006

Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem

### SYSTEMS OF REGRESSION EQUATIONS

SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations

### WHICH LINEAR-FRACTIONAL TRANSFORMATIONS INDUCE ROTATIONS OF THE SPHERE?

WHICH LINEAR-FRACTIONAL TRANSFORMATIONS INDUCE ROTATIONS OF THE SPHERE? JOEL H. SHAPIRO Abstract. These notes supplement the discussion of linear fractional mappings presented in a beginning graduate course

### Variance of OLS Estimators and Hypothesis Testing. Randomness in the model. GM assumptions. Notes. Notes. Notes. Charlie Gibbons ARE 212.

Variance of OLS Estimators and Hypothesis Testing Charlie Gibbons ARE 212 Spring 2011 Randomness in the model Considering the model what is random? Y = X β + ɛ, β is a parameter and not random, X may be

### CS229 Lecture notes. Andrew Ng

CS229 Lecture notes Andrew Ng Part X Factor analysis Whenwehavedatax (i) R n thatcomesfromamixtureofseveral Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### ASEN 3112 - Structures. MDOF Dynamic Systems. ASEN 3112 Lecture 1 Slide 1

19 MDOF Dynamic Systems ASEN 3112 Lecture 1 Slide 1 A Two-DOF Mass-Spring-Dashpot Dynamic System Consider the lumped-parameter, mass-spring-dashpot dynamic system shown in the Figure. It has two point

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

### Introduction to Matrix Algebra I

Appendix A Introduction to Matrix Algebra I Today we will begin the course with a discussion of matrix algebra. Why are we studying this? We will use matrix algebra to derive the linear regression model

### Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

### Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

### Vector Spaces II: Finite Dimensional Linear Algebra 1

John Nachbar September 2, 2014 Vector Spaces II: Finite Dimensional Linear Algebra 1 1 Definitions and Basic Theorems. For basic properties and notation for R N, see the notes Vector Spaces I. Definition

### FE670 Algorithmic Trading Strategies. Stevens Institute of Technology

FE670 Algorithmic Trading Strategies Lecture 6. Portfolio Optimization: Basic Theory and Practice Steve Yang Stevens Institute of Technology 10/03/2013 Outline 1 Mean-Variance Analysis: Overview 2 Classical

### Torgerson s Classical MDS derivation: 1: Determining Coordinates from Euclidean Distances

Torgerson s Classical MDS derivation: 1: Determining Coordinates from Euclidean Distances It is possible to construct a matrix X of Cartesian coordinates of points in Euclidean space when we know the Euclidean

### Linear Algebra: Matrices

B Linear Algebra: Matrices B 1 Appendix B: LINEAR ALGEBRA: MATRICES TABLE OF CONTENTS Page B.1. Matrices B 3 B.1.1. Concept................... B 3 B.1.2. Real and Complex Matrices............ B 3 B.1.3.

### Chapter 17. Orthogonal Matrices and Symmetries of Space

Chapter 17. Orthogonal Matrices and Symmetries of Space Take a random matrix, say 1 3 A = 4 5 6, 7 8 9 and compare the lengths of e 1 and Ae 1. The vector e 1 has length 1, while Ae 1 = (1, 4, 7) has length

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Section 1.1. Introduction to R n

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

### The Moore-Penrose Inverse and Least Squares

University of Puget Sound MATH 420: Advanced Topics in Linear Algebra The Moore-Penrose Inverse and Least Squares April 16, 2014 Creative Commons License c 2014 Permission is granted to others to copy,

### Factor analysis. Angela Montanari

Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

### Systems of Linear Equations

Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

### EC9A0: Pre-sessional Advanced Mathematics Course

University of Warwick, EC9A0: Pre-sessional Advanced Mathematics Course Peter J. Hammond & Pablo F. Beker 1 of 55 EC9A0: Pre-sessional Advanced Mathematics Course Slides 1: Matrix Algebra Peter J. Hammond

### Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

### Matrix Algebra, Class Notes (part 1) by Hrishikesh D. Vinod Copyright 1998 by Prof. H. D. Vinod, Fordham University, New York. All rights reserved.

Matrix Algebra, Class Notes (part 1) by Hrishikesh D. Vinod Copyright 1998 by Prof. H. D. Vinod, Fordham University, New York. All rights reserved. 1 Sum, Product and Transpose of Matrices. If a ij with

### Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01 Probability and Statistics for Engineers Winter 2010-2011 Contents 1 Joint Probability Distributions 1 1.1 Two Discrete

### 1 Lecture: Integration of rational functions by decomposition

Lecture: Integration of rational functions by decomposition into partial fractions Recognize and integrate basic rational functions, except when the denominator is a power of an irreducible quadratic.

### Lecture 8: Signal Detection and Noise Assumption

ECE 83 Fall Statistical Signal Processing instructor: R. Nowak, scribe: Feng Ju Lecture 8: Signal Detection and Noise Assumption Signal Detection : X = W H : X = S + W where W N(, σ I n n and S = [s, s,...,

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### Matrix Algebra and Applications

Matrix Algebra and Applications Dudley Cooke Trinity College Dublin Dudley Cooke (Trinity College Dublin) Matrix Algebra and Applications 1 / 49 EC2040 Topic 2 - Matrices and Matrix Algebra Reading 1 Chapters

### 17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function

17. Inner product spaces Definition 17.1. Let V be a real vector space. An inner product on V is a function, : V V R, which is symmetric, that is u, v = v, u. bilinear, that is linear (in both factors):

### Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

### Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

### 3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style

Solving quadratic equations 3.2 Introduction A quadratic equation is one which can be written in the form ax 2 + bx + c = 0 where a, b and c are numbers and x is the unknown whose value(s) we wish to find.

### Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

INTERNATIONAL BLACK SEA UNIVERSITY COMPUTER TECHNOLOGIES AND ENGINEERING FACULTY ELABORATION OF AN ALGORITHM OF DETECTING TESTS DIMENSIONALITY Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree