Linear Models: The less than full rank model estimation and estimability


1 Linear Models: The less than full rank model estimation and estimability

2 The less than full rank model In previous sections we used the linear model y = X β + ε under the knowledge (or assumption) that X, of dimension n × p, is of full rank, i.e. r(X) = p. This assumption allows for easier analysis, because a full rank X implies that X^T X is invertible, and therefore the normal equations X^T X b = X^T y have a unique solution.
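To make the full rank case concrete, here is a minimal Python sketch (the design and data are made up for illustration, not taken from the notes): because X has full column rank, X^T X is invertible and the normal equations can be solved directly.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative full rank design: an intercept plus one covariate (p = 2).
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 2 + 3 * X[:, 1] + rng.normal(scale=0.5, size=n)

print(np.linalg.matrix_rank(X))          # 2 = p, so X is of full rank
b = np.linalg.solve(X.T @ X, X.T @ y)    # the unique solution of X^T X b = X^T y
print(b)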

3 Unfortunately, not all linear models fall into this category. When this happens, we must develop other techniques to analyse the model. Example. A common (and commonly known) example of a less than full rank model is the one-way classification model with fixed effects. In this model, the samples come from k distinct populations, with different characteristics. We wish to determine the differences between these populations.

4 For example: A medical researcher might want to compare three different types of pain relievers for effectiveness in relieving arthritis; A biologist might study the effects of four experimental treatments used to enhance the growth of tomato plants; or An engineer might want to investigate the sulfur content in the five major coal seams in a particular geographic region. Often, the populations arise as the result of applying k different treatments to groups of similar subjects.

5 For this model, we give each response variable two indices, to denote both the population from which it is taken and its position in the samples from that population. So y ij is the jth sample taken from the ith population. The model we use is y ij = µ + τ i + ε ij, for i = 1, 2,..., k and j = 1, 2,..., n i, where k is the number of populations / treatments; n i is the number of samples from the ith population.

6 Although it might not look exactly like a linear model, it can be written in that form quite easily by taking β = (µ, τ_1, τ_2, ..., τ_k)^T. Writing it out: (next page). Because the first column of X is the sum of the remaining columns, the columns are not linearly independent, and therefore X is not of full rank.

7 Written out, y = X β + ε with
y = (y_11, y_12, ..., y_21, y_22, ..., y_k,n_k)^T,  β = (µ, τ_1, τ_2, ..., τ_k)^T,  ε = (ε_11, ε_12, ..., ε_21, ε_22, ..., ε_k,n_k)^T,
where X is the n × (k + 1) design matrix whose first column is all ones and whose remaining k columns are the indicators of the populations.
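The rank deficiency is easy to see numerically. The sketch below builds the one-way design matrix for assumed (illustrative) sizes k = 3 and n_i = 3; numpy reports rank 3 for the 9 × 4 matrix because the first column is the sum of the indicator columns.

import numpy as np

# One-way classification design matrix: a column of ones for mu followed by one
# indicator column per treatment (illustrative sizes: k = 3, n_i = 3).
k, n_i = 3, 3
X = np.column_stack([np.ones(k * n_i)] +
                    [np.repeat(np.eye(k)[:, i], n_i) for i in range(k)])

print(X.shape)                     # (9, 4)
print(np.linalg.matrix_rank(X))    # 3 < 4: X is not of full column rank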

8 Example. Three different treatment methods for removing organic carbon from tar sand wastewater are compared: air flotation (AF), foam separation (FS), and ferric-chloride coagulation (FCC). A study is conducted, with three observations per treatment, and the amounts of carbon removed are recorded under the columns AF, FS and FCC.

9 The linear model is y = X β + ε with β = (µ, τ_1, τ_2, τ_3)^T and ε = (ε_11, ε_12, ε_13, ε_21, ε_22, ε_23, ε_31, ε_32, ε_33)^T, where X is the 9 × 4 design matrix consisting of a column of ones followed by the indicator columns of the three treatments.

10 As noted before, the big difficulty with a less than full rank model is that X T X is now singular. This means that the normal equations do not have a unique solution. We will show later that in fact, the normal equations now have an infinite number of solutions. However, the problem goes deeper than that: not only can we not estimate the parameters, but the parameters themselves do not have any fixed value! We show this in an example.

11 Example. Suppose that we have a one-way classification model with k = 3 populations. The response variable from each population is centred around µ + τ_i. Now suppose that from a study, it is found that µ + τ_1 = 10, µ + τ_2 = 12, µ + τ_3 = 8. Then our parameters might be µ = 10, τ_1 = 0, τ_2 = 2, τ_3 = −2. However, we can also have µ = 3, τ_1 = 7, τ_2 = 9, τ_3 = 5! In fact we can choose µ to be any real number, and still describe the system.

12 Reparametrization One way we can tackle the less than full rank model is by the simple means of converting to a full rank model. We can then use all the machinery we have developed on the converted model. Example. Consider the one-way classification model with k = 3. The less than full rank model for this is y ij = µ + τ i + ε ij, for i = 1, 2, 3, j = 1, 2,..., n i. However, we can write the mean of each population as µ i = µ + τ i.

13 Then we can recast the model as y_ij = µ_i + ε_ij with corresponding matrices X, the n × 3 matrix whose columns are the indicators of the three populations, and β = (µ_1, µ_2, µ_3)^T.

14 It is apparent now that the columns of X are linearly independent, and so this is a full rank model that we can fiddle with. Simple matrix calculations give us
X^T X = [ n_1 0 0 ; 0 n_2 0 ; 0 0 n_3 ],  (X^T X)^{-1} = [ 1/n_1 0 0 ; 0 1/n_2 0 ; 0 0 1/n_3 ],
X^T y = ( Σ_{i=1}^{n_1} y_1i , Σ_{i=1}^{n_2} y_2i , Σ_{i=1}^{n_3} y_3i )^T,
b = (X^T X)^{-1} X^T y = ( Σ_{i=1}^{n_1} y_1i / n_1 , Σ_{i=1}^{n_2} y_2i / n_2 , Σ_{i=1}^{n_3} y_3i / n_3 )^T.

15 Therefore, the least squares estimates for each of the population means are the means of the samples drawn from that population: µ̂_i = (1/n_i) Σ_{j=1}^{n_i} y_ij. Linear functions of the parameters, of the form t^T β, are estimated using t^T b. For example, the function µ_1 − µ_2 is estimated by (1/n_1) Σ_{i=1}^{n_1} y_1i − (1/n_2) Σ_{i=1}^{n_2} y_2i.
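As a numerical check of this result, the following sketch fits the reparametrized (cell means) model to made-up data; the least squares solution is unique and coincides with the vector of sample means.

import numpy as np

rng = np.random.default_rng(1)

# Cell-means model y_ij = mu_i + eps_ij with illustrative group sizes and data.
n1, n2, n3 = 4, 5, 3
y = np.concatenate([rng.normal(10, 1, n1),
                    rng.normal(12, 1, n2),
                    rng.normal(8, 1, n3)])

# Full rank design: one indicator column per population.
X = np.zeros((n1 + n2 + n3, 3))
X[:n1, 0] = 1
X[n1:n1 + n2, 1] = 1
X[n1 + n2:, 2] = 1

b = np.linalg.solve(X.T @ X, X.T @ y)                       # unique solution
means = [y[:n1].mean(), y[n1:n1 + n2].mean(), y[n1 + n2:].mean()]
print(np.allclose(b, means))                                # True: mu_i-hat = sample mean i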

16 The standard assumption that the errors are normally distributed with mean 0 and variance σ² I is interpreted in this context to mean that all populations have a common variance σ² (but different means). The standard estimator for this variance is s² = (y^T y − y^T X (X^T X)^{-1} X^T y) / (n − p) = (y^T y − y^T X b) / (n − 3).

17 Then
s² = (1/(n − 3)) [ Σ_{i=1}^{3} Σ_{j=1}^{n_i} y_ij² − ( (Σ_{i=1}^{n_1} y_1i)² / n_1 + (Σ_{i=1}^{n_2} y_2i)² / n_2 + (Σ_{i=1}^{n_3} y_3i)² / n_3 ) ]
 = (1/(n − 3)) Σ_{i=1}^{3} [ Σ_{j=1}^{n_i} y_ij² − (1/n_i) ( Σ_{j=1}^{n_i} y_ij )² ].

18 This can be written as a pooled variance
s² = [ (n_1 − 1) s_1² + (n_2 − 1) s_2² + (n_3 − 1) s_3² ] / [ (n_1 − 1) + (n_2 − 1) + (n_3 − 1) ],
where the s_i² are the individual population variance estimators
s_i² = (1/(n_i − 1)) Σ_{j=1}^{n_i} ( y_ij − (1/n_i) Σ_{k=1}^{n_i} y_ik )².
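The identity between the residual-based estimator and the pooled variance can be verified numerically; the sketch below uses made-up data from k = 3 populations with a common variance.

import numpy as np

rng = np.random.default_rng(2)

# Illustrative samples from k = 3 populations.
groups = [rng.normal(10, 2, 6), rng.normal(12, 2, 4), rng.normal(8, 2, 5)]
y = np.concatenate(groups)
n = len(y)

# s^2 from the cell-means model: (y'y - y'Xb) / (n - 3).
X = np.zeros((n, 3))
X[:6, 0] = 1
X[6:10, 1] = 1
X[10:, 2] = 1
b = np.linalg.solve(X.T @ X, X.T @ y)
s2_model = (y @ y - y @ (X @ b)) / (n - 3)

# Pooled variance of the individual sample variances s_i^2.
s2_pooled = sum((g.size - 1) * g.var(ddof=1) for g in groups) / (n - 3)
print(np.isclose(s2_model, s2_pooled))    # True: the two estimators agree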

19 In general, it is always possible to reparametrize a less than full rank model into a full rank model. However, this is not always desirable. For the one-way classification model, we have a nice interpretation of the (re-)parameters as the population means. But this is not always possible. Example. Consider the two-way classification model (without interaction), with one sample from each combination of factors and two levels of each factor: y ij = µ + τ i + β j + ε ij, i, j = 1, 2. We will study this model with more generality later.

20 The design matrix for this model is
X = [ 1 1 0 1 0 ; 1 1 0 0 1 ; 1 0 1 1 0 ; 1 0 1 0 1 ],
with columns corresponding to µ, τ_1, τ_2, β_1, β_2. It is obvious that the first column is the sum of the next two columns, so the rank of X is at most 4. However, the sum of the 2nd and 3rd columns is equal to the sum of the 4th and 5th, so in fact r(X) = 3. This means that we would have to remove 2 parameters, making interpretability much harder! Fortunately, we do not have to reparametrize our models: we can develop theory for the less than full rank model.
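The rank claim can be checked directly; this short sketch writes out the 4 × 5 design matrix for the two-way layout above and asks numpy for its rank.

import numpy as np

# Two-way classification without interaction, two levels per factor, one
# observation per cell; columns correspond to (mu, tau_1, tau_2, beta_1, beta_2).
X = np.array([[1, 1, 0, 1, 0],    # cell (1, 1)
              [1, 1, 0, 0, 1],    # cell (1, 2)
              [1, 0, 1, 1, 0],    # cell (2, 1)
              [1, 0, 1, 0, 1]],   # cell (2, 2)
             dtype=float)

print(np.linalg.matrix_rank(X))   # 3: col 1 = col 2 + col 3, and col 2 + col 3 = col 4 + col 5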

21 Conditional inverses The starting point of our theory is (as might be guessed) more linear algebra. This time we introduce the concept of conditional inverses. Definition Let A be an n × p matrix. The p × n matrix A^c is called a conditional inverse for A if and only if A A^c A = A.

22 The first thing we note is that if A is nonsingular and square, then A 1 = A c so conditional inverses are just an extension of regular inverses for non-square and singular matrices. Example. Consider the matrices A = 1 1, A 1 =

23 Then, multiplying out the matrices, A A_1 A = A. Therefore A_1 is a conditional inverse for A.

24 But it can also be shown that a second matrix A_2 is also a conditional inverse for A! So conditional inverses are not unique. That is why we speak of a conditional inverse for A, not the conditional inverse for A. Of course, if A is nonsingular, then the conditional inverse is uniquely the regular inverse. We can use this in the above example to show that A is singular.

25 For a square matrix to have a regular inverse, it must satisfy further conditions, namely nonsingularity. However, this is not the case for a conditional inverse. Theorem Let A be an n × p matrix. Then A has a conditional inverse.

26 Proof. Let A have rank r. It is possible to perform a series of elementary row and column operations (multiplication, transposition, and addition) on A to reduce it to the form B = [ I_r 0 ; 0 0 ]. If we denote the matrices of the row and column operations by P and Q (which are nonsingular), then we get P A Q = B.

27 Now consider the p × n matrix B^T = [ I_r 0 ; 0 0 ], where the 0s are appropriately dimensioned. It is not too much work to see that B B^T B = B, so B^T is a conditional inverse of B.

28 Now since P and Q are nonsingular, A = P^{-1} B Q^{-1}. Then A (Q B^T P) A = P^{-1} B Q^{-1} Q B^T P P^{-1} B Q^{-1} = P^{-1} B B^T B Q^{-1} = P^{-1} B Q^{-1} = A. By definition, Q B^T P is a conditional inverse for A. Therefore A has a conditional inverse.

29 Finding a conditional inverse How do we find a conditional inverse? The above theorem gives one way, but there is an easier way: 1 Find a minor M of A which is nonsingular and of dimension r(A) × r(A). 2 Find M^{-1} and (M^{-1})^T. 3 Replace M in A with (M^{-1})^T and the other entries with zeros. 4 Transpose the resulting matrix.
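Here is a small sketch of this procedure on a made-up rank 2 matrix (not the example from the notes). The helper takes the row and column indices of a nonsingular r × r minor; different minors give different conditional inverses, each satisfying A A^c A = A.

import numpy as np

def conditional_inverse(A, rows, cols):
    # Conditional (generalized) inverse via the nonsingular-minor algorithm:
    # replace the chosen minor M by (M^{-1})^T, zero everything else, transpose.
    M = A[np.ix_(rows, cols)]
    B = np.zeros_like(A, dtype=float)
    B[np.ix_(rows, cols)] = np.linalg.inv(M).T
    return B.T

# Rank 2 example: the third column is the sum of the first two.
A = np.array([[2.0, 1.0, 3.0],
              [1.0, 2.0, 3.0],
              [3.0, 3.0, 6.0]])

Ac = conditional_inverse(A, rows=[0, 1], cols=[0, 1])     # leading 2 x 2 minor
print(np.allclose(A @ Ac @ A, A))                         # True: A A^c A = A

# A different nonsingular minor gives a different, equally valid conditional inverse.
Ac2 = conditional_inverse(A, rows=[1, 2], cols=[0, 1])
print(np.allclose(A @ Ac2 @ A, A), np.allclose(Ac, Ac2))  # True False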

30 Example. In the earlier example, we have A = It can be seen that r(a) = 2, so we take the principal 2 2 minor M = [ ].

31 and (M 1 ) T = 1 4 A c = [ T ] T = = [ This is the conditional inverse A 1 in the earlier example, so we can see that it works. On the other hand, if we take the lower left 2 2 minor, following the procedure gives us A 2. So this procedure can produce more than one conditional inverse.. ]

32 Conditional inverse properties Let A be an n × p matrix of rank r, where n ≥ p ≥ r. Then: A^c A and A A^c are idempotent; r(A A^c) = r(A^c A) = r; (A^c)^T = (A^T)^c; A = A (A^T A)^c (A^T A) and A^T = (A^T A)(A^T A)^c A^T; I − A^c A is idempotent.

33 More properties We say that an expression involving a conditional inverse is unique if it is the same no matter what conditional inverse we use. A(A^T A)^c A^T is unique, symmetric, and idempotent; r(A(A^T A)^c A^T) = r; I − A(A^T A)^c A^T is unique, symmetric and idempotent; r(I − A(A^T A)^c A^T) = n − r.
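These uniqueness properties can be spot-checked numerically. The sketch below compares two conditional inverses of X^T X for the one-way design with assumed sizes k = 3, n_i = 3: the Moore-Penrose pseudoinverse and the diagonal g-inverse produced by the minor algorithm.

import numpy as np

# X (X^T X)^c X^T is symmetric, idempotent, of rank r(X), and the same for
# every conditional inverse (one-way design, k = 3, n_i = 3, illustrative).
X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])

G1 = np.linalg.pinv(X.T @ X)          # Moore-Penrose pseudoinverse, one g-inverse
G2 = np.diag([0, 1/3, 1/3, 1/3])      # diagonal g-inverse from the minor algorithm

H1, H2 = X @ G1 @ X.T, X @ G2 @ X.T
print(np.allclose(H1, H1.T), np.allclose(H1 @ H1, H1))   # symmetric, idempotent
print(np.allclose(H1, H2))                                # identical for both g-inverses
print(np.linalg.matrix_rank(H1))                          # 3 = r(X)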

34 Example proof (symmetry and idempotence). [A(A^T A)^c A^T]^T = A[(A^T A)^c]^T A^T = A[(A^T A)^T]^c A^T = A(A^T A)^c A^T. A(A^T A)^c A^T A(A^T A)^c A^T = [A(A^T A)^c A^T A](A^T A)^c A^T = A(A^T A)^c A^T.

35 Solving the normal equations Now that we have developed the machinery, we can try to solve the normal equations X T X b = X T y. First, we must make sure that they have a solution! Theorem The system Ax = g is consistent if and only if the rank of [ A g ] is equal to the rank of A.

36 Proof. (⇐) Assume that r([A g]) = r(A). Because adding g does not add to the rank, this must mean that g is a linear combination of the columns of A. Therefore there exist constants x_1, x_2, ..., x_p, not all zero, so that x_1 a_1 + x_2 a_2 + ... + x_p a_p = g, where a_i is the ith column of A. But if we put this into matrix notation and set x = (x_1, x_2, ..., x_p)^T, then this is exactly the system Ax = g. Therefore the system is consistent.

37 Theorem Let y = X β + ε be a linear model. Then the normal equations X^T X b = X^T y are consistent. Proof. It is obvious that r(X^T X) ≤ r([X^T X  X^T y]), as adding a column cannot decrease the number of linearly independent columns.

38 However, using rank properties from earlier on, r([X^T X  X^T y]) = r(X^T [X  y]) ≤ r(X^T) = r(X^T X). Therefore r([X^T X  X^T y]) = r(X^T X) and the previous theorem shows that the normal equations are consistent.

39 Now that we know the normal equations have a solution, how do we find it?

40 Now that we know the normal equations have a solution, how do we find it? We use conditional inverses. Theorem Let Ax = g be a consistent system. Then A c g is a solution to the system, where A c is any conditional inverse for A.

41 Proof. Since Ax = g, AA c g = AA c Ax = Ax = g. Therefore, A c g solves the system. From this theorem, we see that b = (X T X ) c X T y solves the normal equations, for any conditional inverse. However, in the less than full rank model, different conditional inverses may result in different solutions.
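In practice, one convenient conditional inverse of X^T X is the Moore-Penrose pseudoinverse, which numpy provides. The sketch below (made-up one-way data, k = 3, n_i = 3) checks that b = (X^T X)^c X^T y satisfies the normal equations even though X^T X is singular.

import numpy as np

rng = np.random.default_rng(3)

# Less than full rank one-way design with illustrative responses.
X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])
y = rng.normal(10, 1, 9)

XtX, Xty = X.T @ X, X.T @ y
b = np.linalg.pinv(XtX) @ Xty        # pinv acts as one conditional inverse of X^T X
print(np.allclose(XtX @ b, Xty))     # True: b solves the normal equations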

42 Example. Suppose that for a particular linear model, we derive
X^T X = [ 2 1 1 ; 1 1 0 ; 1 0 1 ],  X^T y = ( 14 , 6 , 8 )^T.
This could potentially arise from a two-class classification model with one sample from each class: X = [ 1 1 0 ; 1 0 1 ].

43 The normal equations are then
[ 2 1 1 ; 1 1 0 ; 1 0 1 ] ( b_0 , b_1 , b_2 )^T = ( 14 , 6 , 8 )^T.
Since the first column of X^T X is the sum of the other two, X^T X is not of full rank. However, since the last two columns are not multiples of each other, r(X^T X) = 2.

44 To find a conditional inverse of X^T X, we apply the algorithm, using the nonsingular leading 2 × 2 minor M = [ 2 1 ; 1 1 ]. This gives us
(X^T X)^c = [ 1 −1 0 ; −1 2 0 ; 0 0 0 ]
and therefore b = (X^T X)^c X^T y = ( 8 , −2 , 0 )^T.

45 However, using the lower right 2 × 2 minor [ 1 0 ; 0 1 ] gives the conditional inverse
(X^T X)^c = [ 0 0 0 ; 0 1 0 ; 0 0 1 ],
which gives the solution b = ( 0 , 6 , 8 )^T. Both these solutions solve the normal equations, and are equally valid! This is the problem with the less than full rank model.

46 Example. Consider the earlier carbon removal example. We have
X^T X = [ 9 3 3 3 ; 3 3 0 0 ; 3 0 3 0 ; 3 0 0 3 ],
so a conditional inverse is (X^T X)^c = [ 0 0 0 0 ; 0 1/3 0 0 ; 0 0 1/3 0 ; 0 0 0 1/3 ].

47 We can also calculate X^T y, the vector whose entries are the overall total and the three treatment totals. Using the conditional inverse gives us a solution (but not the solution) to the normal equations: b = (X^T X)^c X^T y = ( 0 , ȳ_1 , ȳ_2 , ȳ_3 )^T.

48 In fact, if the model is less than full rank, the normal equations have an infinite number of solutions. Theorem Let Ax = g be a consistent system. Then x = A^c g + (I − A^c A) z solves the system, where z is an arbitrary p × 1 vector.

49 Proof. We know that A^c g solves the system, so Ax = A[A^c g + (I − A^c A) z] = A A^c g + (A − A A^c A) z = g + (A − A) z = g. For the normal equations, this means that any vector of the form b = (X^T X)^c X^T y + [I − (X^T X)^c X^T X] z also satisfies the equations.
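The whole family of solutions can be generated numerically: starting from one solution, adding [I − (X^T X)^c X^T X] z for an arbitrary z always gives another solution. A sketch with made-up one-way data, using pinv as the conditional inverse:

import numpy as np

rng = np.random.default_rng(4)

X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])
y = rng.normal(10, 1, 9)
XtX, Xty = X.T @ X, X.T @ y

G = np.linalg.pinv(XtX)               # one conditional inverse of X^T X
b0 = G @ Xty                          # one particular solution
M = np.eye(4) - G @ XtX               # I - (X^T X)^c X^T X

for _ in range(3):
    z = rng.normal(size=4)
    b_z = b0 + M @ z                  # a different solution for each z
    print(np.allclose(XtX @ b_z, Xty))   # True every time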

50 Example. In the two-class example above, one solution to the normal equations was ( 8 , −2 , 0 )^T. Using the first conditional inverse found,
(X^T X)^c X^T X = [ 1 0 1 ; 0 1 −1 ; 0 0 0 ],  so  I − (X^T X)^c X^T X = [ 0 0 −1 ; 0 0 1 ; 0 0 1 ].
Taking z = ( z_1 , z_2 , z_3 )^T arbitrarily, another solution to the normal equations is b = ( 8 − z_3 , −2 + z_3 , z_3 )^T.

51 Example. In the carbon removal example, our conditional inverse gives us
(X^T X)^c X^T X = [ 0 0 0 0 ; 1 1 0 0 ; 1 0 1 0 ; 1 0 0 1 ],
and so, for an arbitrary z = ( z_1 , z_2 , z_3 , z_4 )^T, another solution to the normal equations is b = ( z_1 , ȳ_1 − z_1 , ȳ_2 − z_1 , ȳ_3 − z_1 )^T.

52 The converse of the above theorem is also true: all solutions to the system can be expressed in this form. Theorem Let Ax = g be a consistent system and let x_0 be any solution to the system. Then x_0 = A^c g + (I − A^c A) z, where z = x_0.

53 Proof. Since x_0 solves the system, A^c g + (I − A^c A) z = A^c g + (I − A^c A) x_0 = A^c g + x_0 − A^c A x_0 = A^c g + x_0 − A^c g = x_0. For the normal equations, this means that any solution can be expressed as b = (X^T X)^c X^T y + [I − (X^T X)^c X^T X] z for any conditional inverse (X^T X)^c.

54 Example. In the two-class example, we found the solution b_1 = ( 8 , −2 , 0 )^T using our original conditional inverse. But we also noted that the conditional inverse (X^T X)^c_2 = [ 0 0 0 ; 0 1 0 ; 0 0 1 ] produces the solution b_2 = ( 0 , 6 , 8 )^T.

55 Using the theorem, the first solution can be written in terms of the second solution (taking z = b_1):
b_1 = (X^T X)^c_2 X^T y + (I − (X^T X)^c_2 X^T X) b_1 = ( 0 , 6 , 8 )^T + ( 8 , −8 , −8 )^T = ( 8 , −2 , 0 )^T.

56 Estimability Now we know how to solve the normal equations; furthermore, we know how to find all solutions for them. But which solution do we want? Which one is the best?

57 Estimability Now we know how to solve the normal equations; furthermore, we know how to find all solutions for them. But which solution do we want? Which one is the best? All of them! They are all equally valid. This means that we can never estimate the parameters.

58 However, not all hope is lost. There is at least one thing which is not arbitrary.

59 However, not all hope is lost. There is at least one thing which is not arbitrary. It is the value of the response variable, y. No matter what the parameters are estimated to be, y will never change! In fact, there exist linear combinations of the parameters that will always be estimated at the same value no matter what solution we use for the normal equations. We call these linear combinations estimable.

60 As we might guess, combinations which are estimable can be linked to the response variable in some way. Formally: Definition Let y = X β + ε be a linear model. A function t T β is said to be estimable if there exists a vector c such that E[c T y] = t T β. Another way of looking at it is that there must exist a linear unbiased estimator for t T β.

61 We look at some equivalent conditions to estimability. Theorem Let y = X β + ε be a linear model where ε has mean 0 and variance σ² I. Then t^T β is estimable if and only if there is a solution to the linear system X^T X z = t. Proof. (⇐) Let z be a solution to X^T X z = t and put c = X z. Then E[c^T y] = E[z^T X^T y] = z^T X^T E[y] = z^T X^T X β = t^T β, so t^T β is estimable.
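This condition is easy to check by computer: t^T β is estimable exactly when appending t as a column to X^T X does not increase the rank, i.e. when X^T X z = t is consistent. A sketch for the one-way design with assumed sizes k = 3, n_i = 3:

import numpy as np

def is_estimable(X, t):
    # t^T beta is estimable iff (X^T X) z = t is consistent, i.e. iff appending
    # t as a column does not raise the rank of X^T X.
    XtX = X.T @ X
    return np.linalg.matrix_rank(np.column_stack([XtX, t])) == np.linalg.matrix_rank(XtX)

# One-way design, k = 3, n_i = 3: beta = (mu, tau_1, tau_2, tau_3).
X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])

print(is_estimable(X, np.array([1.0, 1, 0, 0])))    # True : mu + tau_1
print(is_estimable(X, np.array([0.0, 1, -1, 0])))   # True : tau_1 - tau_2 (a contrast)
print(is_estimable(X, np.array([1.0, 0, 0, 0])))    # False: mu alone is not estimable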

62 Example. Consider our two-class example. As a reminder, we had X^T X = [ 2 1 1 ; 1 1 0 ; 1 0 1 ]. Consider the combination of parameters β_1 − β_2. This corresponds to t^T β where t = ( 0 , 1 , −1 )^T.

63 Now we look for a solution to the system
[ 2 1 1 ; 1 1 0 ; 1 0 1 ] ( z_1 , z_2 , z_3 )^T = ( 0 , 1 , −1 )^T.
A little thought shows that the system has the solution z_1 = 0, z_2 = 1, z_3 = −1, so β_1 − β_2 is estimable.

64 Theorem Let y = X β + ε be a linear model where ε has mean 0 and variance σ² I. Then t^T β is estimable if and only if t^T (X^T X)^c X^T X = t^T for any conditional inverse of (X^T X). Proof. (⇐) Assume that t^T (X^T X)^c X^T X = t^T, so X^T X ((X^T X)^c)^T t = X^T X (X^T X)^c t = t. This means that (X^T X)^c t is a solution to the system X^T X z = t, and the previous theorem implies that t^T β is estimable.

65 (⇒) Suppose that t^T β is estimable. By the previous theorem, there exists a solution to the system X^T X z = t. Using the conditional inverse, we know that a solution is z = (X^T X)^c t. In other words, X^T X (X^T X)^c t = t and by taking transposes as above, we see that this gives t^T (X^T X)^c X^T X = t^T.

66 Example. Consider the previous example. Let us take the conditional inverse (X^T X)^c = [ 0 0 0 ; 0 1 0 ; 0 0 1 ] and consider the same quantity, β_1 − β_2, which corresponds to t = ( 0 , 1 , −1 )^T. Then t^T (X^T X)^c = ( 0 , 1 , −1 ), and ( 0 , 1 , −1 )(X^T X) = ( 0 , 1 , −1 ) = t^T, so again we see that β_1 − β_2 is estimable.

67 On the other hand, suppose we take t = ( 1 , 0 , 0 )^T so that t^T β = β_0. Then we have t^T (X^T X)^c (X^T X) = ( 0 , 0 , 0 )(X^T X) = ( 0 , 0 , 0 ) ≠ t^T, so β_0 is not estimable.
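The same criterion can be automated. The sketch below uses the Moore-Penrose pseudoinverse as one conditional inverse for the two-class design and tests a few coefficient vectors t; by the theorem, the conclusion would be identical for any other conditional inverse.

import numpy as np

# Two-class model with one observation per class: beta = (beta_0, beta_1, beta_2).
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
H = np.linalg.pinv(X.T @ X) @ (X.T @ X)    # (X^T X)^c X^T X

for t in ([0, 1, -1], [1, 1, 0], [1, 0, 0]):
    t = np.array(t, dtype=float)
    print(t, np.allclose(t @ H, t))        # estimable iff t^T (X^T X)^c X^T X = t^T
# beta_1 - beta_2 and beta_0 + beta_1 pass the test; beta_0 alone fails.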

68 Example. We return to the carbon removal example. We are interested in seeing if the various carbon removal treatments have (significantly) different means. To test this, we look at the quantities τ_1 − τ_2 and τ_1 − τ_3. If both of these are 0, then the treatments are the same. We have
(X^T X)^c X^T X = [ 0 0 0 0 ; 1 1 0 0 ; 1 0 1 0 ; 1 0 0 1 ]
and the coefficient vectors t_1 = ( 0 , 1 , −1 , 0 )^T, t_2 = ( 0 , 1 , 0 , −1 )^T.

69 t_1^T (X^T X)^c X^T X = ( 0 , 1 , −1 , 0 ) = t_1^T, so t_1^T β = τ_1 − τ_2 is estimable. Similarly, t_2^T (X^T X)^c X^T X = ( 0 , 1 , 0 , −1 ) = t_2^T, so t_2^T β = τ_1 − τ_3 is also estimable.

70 Using our definition of estimable, we can prove formally that no matter what conditional inverse we use, we will still generate the same estimate for an estimable quantity. First we will state a supporting lemma. Lemma Let y = X β + ε where ε has mean 0 and variance σ² I. The best linear unbiased estimator for any estimable function t^T β is z^T X^T y, where z is a solution to the system X^T X z = t.

71 Theorem (A Gauss-Markov Theorem) Let y = X β + ε be a linear model where ε has mean 0 and variance σ² I. Suppose t^T β is estimable. Then any solution to the system X^T X z = t gives the same estimate for t^T β. Furthermore, this estimate is t^T b, where b is any solution to the normal equations. Lastly, this estimate is BLUE. Proof. Suppose we have two solutions to the system X^T X z = t, called z_0 and z_1. Let b be any solution to the normal equations, which means that X^T X b = X^T y.

72 From the previous lemma, the best linear unbiased estimator of t^T β is z_0^T X^T y = z_0^T X^T X b = (X^T X z_0)^T b = t^T b. Similarly, z_1^T X^T y = t^T b = z_0^T X^T y. This shows that the best linear unbiased estimator is unique, and equal to t^T b.
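A numerical illustration of the theorem on made-up one-way data (k = 3, n_i = 3): every solution of the normal equations, however generated, returns the same value of t^T b for the estimable contrast τ_1 − τ_2, and that value is ȳ_1 − ȳ_2.

import numpy as np

rng = np.random.default_rng(5)

X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])
y = rng.normal(10, 1, 9)
XtX, Xty = X.T @ X, X.T @ y
t = np.array([0.0, 1.0, -1.0, 0.0])            # tau_1 - tau_2, estimable

G = np.linalg.pinv(XtX)
b0 = G @ Xty
M = np.eye(4) - G @ XtX
estimates = [t @ (b0 + M @ rng.normal(size=4)) for _ in range(5)]

print(np.ptp(estimates) < 1e-10)                               # True: same t^T b for all b
print(np.isclose(estimates[0], y[:3].mean() - y[3:6].mean()))  # True: it equals ybar_1 - ybar_2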

73 Example. Let's look again at the two-class example. We have shown that β_1 − β_2 is estimable. We also know that solutions to the normal equations include b = ( 8 , −2 , 0 )^T and b = ( 0 , 6 , 8 )^T. If we want to estimate β_1 − β_2, we would use t^T b = ( 0 , 1 , −1 )( 8 , −2 , 0 )^T = −2.

74 However, from the above theorem, we can also use t^T b = ( 0 , 1 , −1 )( 0 , 6 , 8 )^T = −2. It is not a coincidence that this estimate is the same as the previous one! The theorem shows that any solution to the normal equations, using any conditional inverse, will produce exactly the same estimate. In other words, the estimator is unique.

75 Example. We look at the carbon removal example. We have shown that τ_1 − τ_2 and τ_1 − τ_3 are estimable. We estimate them by t_1^T b = ( 0 , 1 , −1 , 0 ) b = −4.3 and t_2^T b = ( 0 , 1 , 0 , −1 ) b = −8.2 respectively. The Gauss-Markov theorem shows that no matter what conditional inverse we use to calculate b, these estimates will always remain the same.

76 Estimability theorems Now that we have defined estimability, we would like to know which quantities are estimable and which are not (so that we can decide what we want to find out before we start the study!). The first quantities which are definitely estimable are the elements of X β = E[y]; this is how we defined estimability, after all! Theorem Let y = X β + ε be a linear model. Then the elements of X β are estimable.

77 Proof. We know that E[y] = X β. Therefore, we can multiply X β by each of ( 1 , 0 , ..., 0 ), ( 0 , 1 , 0 , ..., 0 ), ..., ( 0 , ..., 0 , 1 ) to get functions which are estimable. But these are the elements of X β, so the elements of X β are estimable.

78 Example. Consider the carbon removal example. We have β = ( µ , τ_1 , τ_2 , τ_3 )^T and the 9 × 4 design matrix X whose rows are ( 1 , 1 , 0 , 0 ) for the first treatment, ( 1 , 0 , 1 , 0 ) for the second and ( 1 , 0 , 0 , 1 ) for the third (three rows each). We showed earlier that we cannot estimate the parameter vector β.

79 However, the real quantities of interest in this model are the mean responses from the three treatments. These are µ + τ_1, µ + τ_2, and µ + τ_3. We can see that µ + τ_1 = ( 1 , 1 , 0 , 0 ) β, µ + τ_2 = ( 1 , 0 , 1 , 0 ) β, µ + τ_3 = ( 1 , 0 , 0 , 1 ) β, and each of these is an element of X β. Therefore, they are estimable. We would estimate them by replacing β with b, where b is any solution to the normal equations (it does not matter which). In fact, in a classification model with any k, µ + τ_i is always estimable.

80 We know that elements of X β are estimable; what else?

81 We know that elements of X β are estimable; what else? If we combine estimable quantities (in a linear manner), the result should be estimable. Theorem Let t_1^T β, t_2^T β, ..., t_k^T β all be estimable functions, and let z = a_1 t_1^T β + a_2 t_2^T β + ... + a_k t_k^T β. Then z is estimable, and the best linear unbiased estimator for z is a_1 t_1^T b + a_2 t_2^T b + ... + a_k t_k^T b.

82 Proof. By definition, z = (a_1 t_1 + a_2 t_2 + ... + a_k t_k)^T β. Since all the functions are estimable, (a_1 t_1 + a_2 t_2 + ... + a_k t_k)^T (X^T X)^c X^T X = a_1 t_1^T (X^T X)^c X^T X + a_2 t_2^T (X^T X)^c X^T X + ... + a_k t_k^T (X^T X)^c X^T X = a_1 t_1^T + a_2 t_2^T + ... + a_k t_k^T = (a_1 t_1 + a_2 t_2 + ... + a_k t_k)^T.

83 Therefore z is estimable, with estimator (a_1 t_1 + a_2 t_2 + ... + a_k t_k)^T b. Of particular interest in many studies is the way different populations compare against each other. To attach a numerical value to these comparisons, we form linear combinations a_1 τ_1 + a_2 τ_2 + ... + a_k τ_k, where Σ_{i=1}^{k} a_i = 0. These treatment contrasts wipe out the effect of the overall mean response, so as to get a better picture of the differences between populations.

84 In a one-way classification model, any treatment contrast is estimable. We show this by noting that if z = a_1 τ_1 + a_2 τ_2 + ... + a_k τ_k is a treatment contrast, then z = ( Σ_{i=1}^{k} a_i ) µ + a_1 τ_1 + a_2 τ_2 + ... + a_k τ_k = a_1 (µ + τ_1) + a_2 (µ + τ_2) + ... + a_k (µ + τ_k) is a linear combination of the estimable functions µ + τ_i, and is therefore itself estimable.

85 Of particular interest among treatment contrasts are contrasts of the form τ_i − τ_j, for some i ≠ j. This is because τ_i − τ_j = (µ + τ_i) − (µ + τ_j) is the difference between the mean response in population i and the mean response in population j. If we write ȳ_i for the sample mean from population i, then we would expect to estimate this contrast by the corresponding difference in sample means, ȳ_i − ȳ_j. We can show using the theory we have developed that this is in fact the case.

86 Example. We do this for k = 3 and the contrast τ_1 − τ_2. Our matrices are y = ( y_11 , y_12 , ..., y_1n_1 , y_21 , y_22 , ..., y_2n_2 , y_31 , y_32 , ..., y_3n_3 )^T, β = ( µ , τ_1 , τ_2 , τ_3 )^T, and X the corresponding n × 4 design matrix of a column of ones followed by the three treatment indicator columns. Direct multiplication gives

87 X^T y = ( Σ_{i=1}^{3} Σ_{j=1}^{n_i} y_ij , Σ_{j=1}^{n_1} y_1j , Σ_{j=1}^{n_2} y_2j , Σ_{j=1}^{n_3} y_3j )^T,  X^T X = [ n n_1 n_2 n_3 ; n_1 n_1 0 0 ; n_2 0 n_2 0 ; n_3 0 0 n_3 ].
We can use the conditional inverse algorithm on the lower right corner of X^T X to get (X^T X)^c = [ 0 0 0 0 ; 0 1/n_1 0 0 ; 0 0 1/n_2 0 ; 0 0 0 1/n_3 ].

88 Therefore a solution to the normal equations is b = (X^T X)^c X^T y = ( 0 , ȳ_1 , ȳ_2 , ȳ_3 )^T. We can write τ_1 − τ_2 as ( 0 , 1 , −1 , 0 ) β, so the best linear unbiased estimator for τ_1 − τ_2 is ( 0 , 1 , −1 , 0 )( 0 , ȳ_1 , ȳ_2 , ȳ_3 )^T = ȳ_1 − ȳ_2. If we took any other conditional inverse, we would get the same result.
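The sketch below reproduces this calculation for made-up data with assumed group sizes n_1 = 4, n_2 = 5, n_3 = 3: the diagonal conditional inverse yields b = (0, ȳ_1, ȳ_2, ȳ_3)^T, and the contrast estimate is the difference of sample means.

import numpy as np

rng = np.random.default_rng(6)

n1, n2, n3 = 4, 5, 3                                  # illustrative group sizes
X = np.column_stack([np.ones(n1 + n2 + n3),
                     np.repeat([1.0, 0.0, 0.0], [n1, n2, n3]),
                     np.repeat([0.0, 1.0, 0.0], [n1, n2, n3]),
                     np.repeat([0.0, 0.0, 1.0], [n1, n2, n3])])
y = rng.normal(10, 1, n1 + n2 + n3)

G = np.diag([0, 1 / n1, 1 / n2, 1 / n3])              # conditional inverse of X^T X
b = G @ X.T @ y
print(b)                                              # (0, ybar_1, ybar_2, ybar_3)
print(np.array([0, 1, -1, 0]) @ b,                    # estimate of tau_1 - tau_2 ...
      y[:n1].mean() - y[n1:n1 + n2].mean())           # ... equals ybar_1 - ybar_2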

89 Example. In the carbon removal example, we showed that τ 1 τ 2 and τ 1 τ 3 are estimable. Both of these are contrasts, so we can say straight off that they are estimable (without doing the calculations).

90 Estimating σ² in the less than full rank model In the full rank model, we estimated σ² by s² = SS_Res / (n − p), where n is the sample size, p is the number of parameters, and SS_Res is the sum of squares of the residuals: SS_Res = (y − X b)^T (y − X b) = y^T [I − X (X^T X)^{-1} X^T] y. We would like to find a corresponding expression for the less than full rank model, but obviously it will not be the same (since (X^T X)^{-1} does not exist).

91 We still define the residual sum of squares as SS_Res = (y − X b)^T (y − X b), where b is any solution to the normal equations. The important thing is that although b can vary, X b will not, because the elements of X β are estimable. Therefore SS_Res is invariant to the choice of b. Next, we find the equivalent expression for SS_Res. Theorem SS_Res = y^T [I − X (X^T X)^c X^T] y.

92 Proof. Let b = (X^T X)^c X^T y and recall that X (X^T X)^c X^T X = X. Then SS_Res = (y^T − b^T X^T)(y − X b) = y^T y − 2 y^T X b + b^T X^T X b = y^T y − 2 y^T X (X^T X)^c X^T y + y^T X (X^T X)^c X^T X (X^T X)^c X^T y = y^T y − 2 y^T X (X^T X)^c X^T y + y^T X (X^T X)^c X^T y = y^T [I − X (X^T X)^c X^T] y.

93 How do we now find an estimator for σ²? Using the quadratic forms theory that we developed earlier, we know that E[SS_Res] = E[y^T (I − X (X^T X)^c X^T) y] = tr(I − X (X^T X)^c X^T) σ² + (X β)^T (I − X (X^T X)^c X^T) X β = tr(I − X (X^T X)^c X^T) σ² + β^T X^T X β − β^T X^T X (X^T X)^c X^T X β = tr(I − X (X^T X)^c X^T) σ² + β^T X^T X β − β^T X^T X β = tr(I − X (X^T X)^c X^T) σ².

94 It can be shown that I − X (X^T X)^c X^T is symmetric and idempotent, so E[SS_Res] = r(I − X (X^T X)^c X^T) σ² = (n − r) σ², where r = r(X), the rank of X. This gives us the following theorem. Theorem Let y = X β + ε be a linear model, where X has rank r and ε has mean 0 and variance σ² I. Then an unbiased estimator for σ² is s² = SS_Res / (n − r).
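A numerical sketch with made-up one-way data (k = 3, n_i = 3): SS_Res computed from a solution of the normal equations agrees with the quadratic form y^T [I − X (X^T X)^c X^T] y, and dividing by n − r gives the variance estimate.

import numpy as np

rng = np.random.default_rng(7)

X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])
y = rng.normal(10, 1, 9)
n, r = len(y), np.linalg.matrix_rank(X)       # here r = 3

G = np.linalg.pinv(X.T @ X)                   # one conditional inverse
H = X @ G @ X.T
ss_res_quadratic = y @ (np.eye(n) - H) @ y    # y^T [I - X (X^T X)^c X^T] y

b = G @ X.T @ y                               # a solution to the normal equations
ss_res_direct = (y - X @ b) @ (y - X @ b)
print(np.isclose(ss_res_quadratic, ss_res_direct))   # True

s2 = ss_res_quadratic / (n - r)               # unbiased estimator of sigma^2
print(s2)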

95 Example. We return to the carbon removal example. The fitted values are X b = ( ȳ_1 , ȳ_1 , ȳ_1 , ȳ_2 , ȳ_2 , ȳ_2 , ȳ_3 , ȳ_3 , ȳ_3 )^T: each observation is fitted by its own treatment's sample mean.

96 So the residuals are y − X b, i.e. each observation minus its treatment sample mean, y_ij − ȳ_i.

97 This means SS_Res = (y − X b)^T (y − X b) = 1.3. The rank of X is easily seen to be 3, so s² = 1.3 / (9 − 3) = 0.217.

98 Interval estimation in the less than full rank model As for the full rank model, we have estimated what we could estimate. The next step is to try and find confidence intervals for our estimates. So far, we have not assumed that the error vector ε is normally distributed. However, to find confidence intervals, we need some idea of the distribution of the variables, so we make that assumption now.

99 Recall that in the full rank model, we generated confidence intervals by finding a t-distributed quantity, which was created by dividing a normal variable by a χ² variable. The χ² variable was SS_Res / σ², which had n − p degrees of freedom. The σ² term was not known, but cancelled out another σ² term in the numerator to leave us with something that we could calculate.

100 We can do pretty much the same thing for the less than full rank model. Theorem Let y = X β + ε be a linear model, where ε is a normal random vector with mean 0 and variance σ² I. Then (n − r) s² / σ² = SS_Res / σ² has a χ² distribution with n − r degrees of freedom. The proof of this theorem is very similar to that for the full rank case, so we will not repeat it.

101 The steps to derive a confidence interval are very similar to that for the full rank case, but with two small differences. Firstly, we can only find confidence intervals for quantities that are estimable! Secondly, we replace the inverse (X^T X)^{-1} by the conditional inverse (X^T X)^c. All other steps are the same.

102 This gives us the confidence interval for the (estimable) quantity t^T β, using a t distribution with n − r degrees of freedom: t^T b ± t_{α/2} s √( t^T (X^T X)^c t ). This formula can also be used to find confidence intervals for the individual parameters, providing that they are estimable.
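A sketch of the interval computation on made-up one-way data (k = 3, n_i = 3), with scipy supplying the t quantile and pinv playing the role of the conditional inverse:

import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

X = np.column_stack([np.ones(9)] +
                    [np.repeat(np.eye(3)[:, i], 3) for i in range(3)])
y = rng.normal(10, 1, 9)
t_vec = np.array([0.0, 1.0, -1.0, 0.0])       # tau_1 - tau_2, an estimable contrast

G = np.linalg.pinv(X.T @ X)                   # one conditional inverse
b = G @ X.T @ y
n, r = len(y), np.linalg.matrix_rank(X)
s2 = (y - X @ b) @ (y - X @ b) / (n - r)

half = stats.t.ppf(0.975, n - r) * np.sqrt(s2 * (t_vec @ G @ t_vec))
print(t_vec @ b - half, t_vec @ b + half)     # 95% confidence interval for tau_1 - tau_2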

103 Example. We return again to the carbon removal example. Suppose we want to find a 95% confidence interval for τ_1 − τ_2. We have t = ( 0 , 1 , −1 , 0 )^T, s² = 0.217, and t_{0.025} = 2.45 using n − r = 9 − 3 = 6 degrees of freedom. We also use the conditional inverse (X^T X)^c = [ 0 0 0 0 ; 0 1/3 0 0 ; 0 0 1/3 0 ; 0 0 0 1/3 ].

104 This gives the confidence interval t^T b ± t_{α/2} s √( t^T (X^T X)^c t ) = −4.3 ± 2.45 × √0.217 × √(2/3) = −4.3 ± 0.93 = ( −5.23 , −3.37 ). In particular, we can say with 95% confidence that the first carbon removal treatment is not as effective as the second.

105 Example. We showed earlier that in a general one-way classification model with k = 3, the contrast τ_1 − τ_2 can be estimated by the difference in the respective sample means, ȳ_1 − ȳ_2. We also had t = ( 0 , 1 , −1 , 0 )^T and (X^T X)^c = [ 0 0 0 0 ; 0 1/n_1 0 0 ; 0 0 1/n_2 0 ; 0 0 0 1/n_3 ].

106 Therefore we have t^T (X^T X)^c t = 1/n_1 + 1/n_2, and the confidence interval is ȳ_1 − ȳ_2 ± t_{α/2} s √( 1/n_1 + 1/n_2 ). You have probably seen this formula before! The linear models framework allows us to derive it from first principles.
