Linear Models: The less than full rank model estimation and estimability
- Carmella Atkins
- 7 years ago
2 The less than full rank model In previous sections we use the linear model $y = X\beta + \varepsilon$ in the knowledge (or assumption) that $X$, of dimension $n \times p$, is of full rank, i.e. $r(X) = p$. This assumption allows for easy(er) analysis, because a full rank $X$ implies that $X^T X$ is invertible, and therefore the normal equations $X^T X b = X^T y$ have a unique solution.
3 Unfortunately, not all linear models fall into this category. When $X$ is not of full rank, we must develop other techniques to analyse the model. Example. A common (and commonly known) example of a less than full rank model is the one-way classification model with fixed effects. In this model, the samples come from $k$ distinct populations, with different characteristics. We wish to determine the differences in these populations.
4 For example: A medical researcher might want to compare three different types of pain relievers for effectiveness in relieving arthritis; A biologist might study the effects of four experimental treatments used to enhance the growth of tomato plants; or An engineer might want to investigate the sulfur content in the five major coal seams in a particular geographic region. Often, the populations arise as the result of applying k different treatments to groups of similar subjects.
5 For this model, we give each response variable two indices, to denote both the population from which it is taken and its position in the sample from that population. So $y_{ij}$ is the $j$th sample taken from the $i$th population. The model we use is $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, for $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$, where $k$ is the number of populations / treatments and $n_i$ is the number of samples from the $i$th population.
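6 Although it might not look exactly like a linear model, it can be written in that form quite easily by taking $\beta = [\mu\ \ \tau_1\ \ \tau_2\ \ \cdots\ \ \tau_k]^T$. Because the first column of $X$ is the sum of the remaining columns, the columns are not linearly independent, and therefore $X$ is not of full rank. Writing it out:
7 $$\begin{bmatrix} y_{11}\\ y_{12}\\ \vdots\\ y_{21}\\ y_{22}\\ \vdots\\ y_{k,n_k}\end{bmatrix} = \begin{bmatrix} 1&1&0&\cdots&0\\ 1&1&0&\cdots&0\\ \vdots&\vdots&\vdots&&\vdots\\ 1&0&1&\cdots&0\\ 1&0&1&\cdots&0\\ \vdots&\vdots&\vdots&&\vdots\\ 1&0&0&\cdots&1\end{bmatrix}\begin{bmatrix}\mu\\ \tau_1\\ \tau_2\\ \vdots\\ \tau_k\end{bmatrix} + \begin{bmatrix}\varepsilon_{11}\\ \varepsilon_{12}\\ \vdots\\ \varepsilon_{21}\\ \varepsilon_{22}\\ \vdots\\ \varepsilon_{k,n_k}\end{bmatrix}, \qquad y = X\beta + \varepsilon.$$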
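The rank deficiency can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `one_way_design` and the group sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def one_way_design(sizes):
    """Design matrix [1 | group indicators] for group sizes n_1, ..., n_k."""
    n, k = sum(sizes), len(sizes)
    X = np.zeros((n, k + 1))
    X[:, 0] = 1.0                       # column for mu
    row = 0
    for i, n_i in enumerate(sizes):
        X[row:row + n_i, i + 1] = 1.0   # indicator column for tau_i
        row += n_i
    return X

X = one_way_design([3, 2, 4])           # hypothetical group sizes
rank = np.linalg.matrix_rank(X)         # k, not k + 1
first_is_sum = np.allclose(X[:, 0], X[:, 1:].sum(axis=1))
```

Here `X` has $k + 1 = 4$ columns but rank $3$, because the first column equals the sum of the indicator columns.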
8 Example. Three different treatment methods for removing organic carbon from tar sand wastewater are compared: air flotation (AF), foam separation (FS), and ferric-chloride coagulation (FCC). A study is conducted and the amounts of carbon removed are recorded for each treatment. [Data table with columns AF, FS, FCC; values not preserved in this transcription.]
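9 The linear model is
$$\begin{bmatrix} y_{11}\\ y_{12}\\ y_{13}\\ y_{21}\\ y_{22}\\ y_{23}\\ y_{31}\\ y_{32}\\ y_{33}\end{bmatrix} = \begin{bmatrix}1&1&0&0\\1&1&0&0\\1&1&0&0\\1&0&1&0\\1&0&1&0\\1&0&1&0\\1&0&0&1\\1&0&0&1\\1&0&0&1\end{bmatrix}\begin{bmatrix}\mu\\ \tau_1\\ \tau_2\\ \tau_3\end{bmatrix} + \begin{bmatrix}\varepsilon_{11}\\ \varepsilon_{12}\\ \varepsilon_{13}\\ \varepsilon_{21}\\ \varepsilon_{22}\\ \varepsilon_{23}\\ \varepsilon_{31}\\ \varepsilon_{32}\\ \varepsilon_{33}\end{bmatrix}, \qquad y = X\beta + \varepsilon,$$
where $y_{ij}$ is the $j$th observed amount of carbon removed under treatment $i$.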
10 As noted before, the big difficulty with a less than full rank model is that $X^T X$ is now singular. This means that the normal equations do not have a unique solution. We will show later that in fact, the normal equations now have an infinite number of solutions. However, the problem goes deeper than that: not only can we not estimate the parameters, but the parameters themselves do not have any fixed value! We show this in an example.
11 Example. Suppose that we have a one-way classification model with $k = 3$ populations. The response variable from each population is centred around $\mu + \tau_i$. Now suppose that from a study, it is found that $\mu + \tau_1 = 10$, $\mu + \tau_2 = 12$, $\mu + \tau_3 = 8$. Then our parameters might be $\mu = 10$, $\tau_1 = 0$, $\tau_2 = 2$, $\tau_3 = -2$. However, we can also have $\mu = 3$, $\tau_1 = 7$, $\tau_2 = 9$, $\tau_3 = 5$! In fact we can choose $\mu$ to be any real number, and still describe the system.
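A short numerical sketch of this non-uniqueness, assuming NumPy: starting from the parameter set $\mu = 3$, $\tau = (7, 9, 5)$ quoted in the example, adding any constant `c` to $\mu$ and subtracting it from every $\tau_i$ leaves the population means unchanged. The shift `c = 7.0` is an arbitrary illustrative choice.

```python
import numpy as np

# One row per population: mean_i = mu + tau_i
X = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])

beta_a = np.array([3.0, 7.0, 9.0, 5.0])       # mu, tau_1, tau_2, tau_3
c = 7.0                                        # arbitrary shift
beta_b = beta_a + c * np.array([1.0, -1.0, -1.0, -1.0])

means_a = X @ beta_a
means_b = X @ beta_b                           # identical to means_a
```

Both parameter vectors describe the same three means, so the data cannot distinguish between them.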
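12 Reparametrization One way we can tackle the less than full rank model is by the simple means of converting to a full rank model. We can then use all the machinery we have developed on the converted model. Example. Consider the one-way classification model with $k = 3$. The less than full rank model for this is $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, for $i = 1, 2, 3$, $j = 1, 2, \ldots, n_i$. However, we can write the mean of each population as $\mu_i = \mu + \tau_i$.
13 Then we can recast the model as $y_{ij} = \mu_i + \varepsilon_{ij}$ with corresponding matrices
$$X = \begin{bmatrix} 1_{n_1} & 0 & 0\\ 0 & 1_{n_2} & 0\\ 0 & 0 & 1_{n_3}\end{bmatrix}, \qquad \beta = \begin{bmatrix}\mu_1\\ \mu_2\\ \mu_3\end{bmatrix},$$
where $1_{n_i}$ denotes a column of $n_i$ ones.
14 It is apparent now that the columns of $X$ are linearly independent, and so this is a full rank model that we can fiddle with. Simple matrix calculations give us
$$X^T X = \begin{bmatrix} n_1 & 0 & 0\\ 0 & n_2 & 0\\ 0 & 0 & n_3\end{bmatrix}, \quad (X^T X)^{-1} = \begin{bmatrix} \frac{1}{n_1} & 0 & 0\\ 0 & \frac{1}{n_2} & 0\\ 0 & 0 & \frac{1}{n_3}\end{bmatrix}, \quad X^T y = \begin{bmatrix}\sum_{i=1}^{n_1} y_{1i}\\ \sum_{i=1}^{n_2} y_{2i}\\ \sum_{i=1}^{n_3} y_{3i}\end{bmatrix}, \quad b = (X^T X)^{-1} X^T y = \begin{bmatrix}\sum_{i=1}^{n_1} y_{1i}/n_1\\ \sum_{i=1}^{n_2} y_{2i}/n_2\\ \sum_{i=1}^{n_3} y_{3i}/n_3\end{bmatrix}.$$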
15 Therefore, the least squares estimates for each of the population means are the means of the samples drawn from that population: $\hat\mu_i = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}$. Linear functions of the parameters, of the form $t^T\beta$, are estimated using $t^T b$. For example, the function $\mu_1 - \mu_2$ is estimated by $\frac{1}{n_1}\sum_{i=1}^{n_1} y_{1i} - \frac{1}{n_2}\sum_{i=1}^{n_2} y_{2i}$.
16 The standard assumption that the errors are normally distributed with mean $0$ and variance $\sigma^2 I$ is interpreted in this context to mean that all populations have a common variance $\sigma^2$ (but different means). The standard estimator for this variance is
$$s^2 = \frac{y^T y - y^T X (X^T X)^{-1} X^T y}{n - p} = \frac{y^T y - y^T X b}{n - 3}.$$
17 Then
$$s^2 = \frac{1}{n-3}\left[\sum_{i=1}^{3}\sum_{j=1}^{n_i} y_{ij}^2 - \sum_{i=1}^{3}\frac{1}{n_i}\Big(\sum_{j=1}^{n_i} y_{ij}\Big)^2\right] = \frac{1}{n-3}\sum_{i=1}^{3}\sum_{j=1}^{n_i}\Big(y_{ij} - \frac{1}{n_i}\sum_{k=1}^{n_i} y_{ik}\Big)^2.$$
18 This can be written as a pooled variance
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2}{(n_1 - 1) + (n_2 - 1) + (n_3 - 1)},$$
where the $s_i^2$ are the individual population variance estimators
$$s_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i}\Big(y_{ij} - \frac{1}{n_i}\sum_{k=1}^{n_i} y_{ik}\Big)^2.$$
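The two claims for the reparametrized model (least squares estimates equal the sample means, and $s^2$ equals the pooled variance) can be checked with a small sketch. NumPy is assumed, and the three samples are made-up numbers used purely for illustration.

```python
import numpy as np

# Hypothetical samples from three populations (illustrative values only)
groups = [np.array([4.1, 3.9, 4.4]),
          np.array([5.0, 5.3]),
          np.array([2.8, 3.1, 3.0, 2.7])]
y = np.concatenate(groups)
n = len(y)

# Full rank cell-means design: one indicator column per population
X = np.zeros((n, 3))
start = 0
for i, g in enumerate(groups):
    X[start:start + len(g), i] = 1.0
    start += len(g)

b = np.linalg.solve(X.T @ X, X.T @ y)          # least squares estimates
s2 = (y @ y - y @ X @ b) / (n - 3)             # variance estimator

# Pooled variance from the individual sample variances
pooled = (sum((len(g) - 1) * g.var(ddof=1) for g in groups)
          / sum(len(g) - 1 for g in groups))
```

With these inputs `b` reproduces the three sample means and `s2` agrees with `pooled` up to rounding error.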
19 In general, it is always possible to reparametrize a less than full rank model into a full rank model. However, this is not always desirable. For the one-way classification model, we have a nice interpretation of the (re-)parameters as the population means. But this is not always possible. Example. Consider the two-way classification model (without interaction), with one sample from each combination of factors and two levels of each factor: $y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$, $i, j = 1, 2$. We will study this model with more generality later.
20 The design matrix for this model, with columns corresponding to $\mu, \tau_1, \tau_2, \beta_1, \beta_2$ and rows to $(i, j) = (1,1), (1,2), (2,1), (2,2)$, is
$$X = \begin{bmatrix}1&1&0&1&0\\1&1&0&0&1\\1&0&1&1&0\\1&0&1&0&1\end{bmatrix}.$$
It is obvious that the first column is the sum of the next two columns, so the rank of $X$ is at most 4. However, the sum of the 2nd and 3rd columns is equal to the sum of the 4th and 5th, so in fact $r(X) = 3$. This means that we have to remove 2 parameters, making interpretability much harder! Fortunately, we do not have to reparametrize our models: we can develop theory for the less than full rank model.
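As a quick check of the rank argument, here is a sketch assuming NumPy and the column order $(\mu, \tau_1, \tau_2, \beta_1, \beta_2)$ with rows ordered $(1,1), (1,2), (2,1), (2,2)$; that ordering is an assumption made for the illustration.

```python
import numpy as np

# Two-way classification without interaction, one observation per cell
X = np.array([[1, 1, 0, 1, 0],
              [1, 1, 0, 0, 1],
              [1, 0, 1, 1, 0],
              [1, 0, 1, 0, 1]], dtype=float)

rank = np.linalg.matrix_rank(X)
# Two independent linear dependencies among the five columns
dep1 = np.allclose(X[:, 0], X[:, 1] + X[:, 2])
dep2 = np.allclose(X[:, 1] + X[:, 2], X[:, 3] + X[:, 4])
```

Five columns with two independent dependencies leave rank $5 - 2 = 3$.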
21 Conditional inverses The starting point of our theory is (as might be guessed) more linear algebra. This time we introduce the concept of conditional inverses. Definition Let $A$ be an $n \times p$ matrix. The $p \times n$ matrix $A^c$ is called a conditional inverse for $A$ if and only if $A A^c A = A$.
22 The first thing we note is that if $A$ is nonsingular and square, then $A^{-1} = A^c$, so conditional inverses are just an extension of regular inverses for non-square and singular matrices. Example. Consider the matrices $A$ and $A_1$. [Entries of $A$ and $A_1$ not preserved in this transcription.]
23 Then $A A_1 A = A$. Therefore $A_1$ is a conditional inverse for $A$.
24 But it can also be shown that a second matrix $A_2$ [entries not preserved] is also a conditional inverse for $A$! So conditional inverses are not unique. That is why we speak of a conditional inverse for $A$, not the conditional inverse for $A$. Of course, if $A$ is nonsingular, then the conditional inverse is uniquely the regular inverse. We can use this in the above example to show that $A$ is singular.
25 For a square matrix to have a regular inverse, it must satisfy other conditions, namely nonsingularity. However, this is not the case for a conditional inverse. Theorem Let $A$ be an $n \times p$ matrix. Then $A$ has a conditional inverse.
26 Proof. Let $A$ have rank $r$. It is possible to perform a series of elementary row and column operations (multiplication, transposition, and addition) on $A$ to reduce it to the form
$$B = \begin{bmatrix} I_r & 0\\ 0 & 0\end{bmatrix}.$$
If we denote the matrices of the row and column operations by $P$ and $Q$ (which are nonsingular), then we get $PAQ = B$.
27 Now consider the $p \times n$ matrix
$$B^T = \begin{bmatrix} I_r & 0\\ 0 & 0\end{bmatrix},$$
where the $0$s are appropriately dimensioned. It is not too much work to see that $B B^T B = B$, so $B^T$ is a conditional inverse of $B$.
28 Now since $P$ and $Q$ are nonsingular, $A = P^{-1} B Q^{-1}$. Then $A(Q B^T P)A = P^{-1} B Q^{-1} Q B^T P P^{-1} B Q^{-1} = P^{-1} B B^T B Q^{-1} = P^{-1} B Q^{-1} = A$. By definition, $Q B^T P$ is a conditional inverse for $A$. Therefore $A$ has a conditional inverse.
29 Finding a conditional inverse How do we find a conditional inverse? The above theorem gives one way, but there is an easier way:
1. Find a minor $M$ of $A$ which is nonsingular and of dimension $r(A) \times r(A)$.
2. Find $M^{-1}$ and $(M^{-1})^T$.
3. Replace $M$ in $A$ with $(M^{-1})^T$ and the other entries with zeros.
4. Transpose the resulting matrix.
30 Example. In the earlier example, it can be seen that $r(A) = 2$, so we take the principal $2 \times 2$ minor $M$. [Entries of $A$ and $M$ not preserved in this transcription.]
31 We compute $(M^{-1})^T$, place it in the position of $M$ with zeros elsewhere, and transpose to obtain $A^c$. [Numerical working not preserved.] This is the conditional inverse $A_1$ in the earlier example, so we can see that it works. On the other hand, if we take the lower left $2 \times 2$ minor, following the procedure gives us $A_2$. So this procedure can produce more than one conditional inverse.
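The four-step procedure can be sketched in code as follows. This is an illustrative implementation, assuming NumPy; the function name `conditional_inverse` and the test matrix `A` are hypothetical choices (the matrix is not the one from the slide, whose entries are unavailable), and the caller supplies the row and column indices of the chosen minor.

```python
import numpy as np

def conditional_inverse(A, rows, cols):
    """Conditional inverse of A built from the minor A[rows, cols]."""
    A = np.asarray(A, dtype=float)
    r = np.linalg.matrix_rank(A)
    if len(rows) != r or len(cols) != r:
        raise ValueError("minor must be r(A) x r(A)")
    M = A[np.ix_(rows, cols)]                  # step 1: pick the minor
    if np.linalg.matrix_rank(M) < r:
        raise ValueError("chosen minor is singular")
    work = np.zeros_like(A)                    # step 3: zeros elsewhere
    work[np.ix_(rows, cols)] = np.linalg.inv(M).T   # steps 2 and 3
    return work.T                              # step 4: transpose

# Illustrative singular matrix of rank 2 (an assumption for this sketch)
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
Ac1 = conditional_inverse(A, [0, 1], [0, 1])   # upper left minor
Ac2 = conditional_inverse(A, [1, 2], [1, 2])   # lower right minor
```

Both `Ac1` and `Ac2` satisfy $A A^c A = A$ even though they are different matrices, which illustrates the non-uniqueness.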
32 Conditional inverse properties Let $A$ be an $n \times p$ matrix of rank $r$, where $n \ge p \ge r$. Then
- $A^c A$ and $A A^c$ are idempotent;
- $r(A A^c) = r(A^c A) = r$;
- $(A^c)^T = (A^T)^c$;
- $A = A(A^T A)^c (A^T A)$ and $A^T = (A^T A)(A^T A)^c A^T$;
- $I - A^c A$ is idempotent.
33 More properties We say that an expression involving a conditional inverse is unique if it is the same no matter what conditional inverse we use.
- $A(A^T A)^c A^T$ is unique, symmetric, and idempotent;
- $r(A(A^T A)^c A^T) = r$;
- $I - A(A^T A)^c A^T$ is unique, symmetric and idempotent;
- $r(I - A(A^T A)^c A^T) = n - r$.
34 Example proof. Symmetry: $[A(A^T A)^c A^T]^T = A[(A^T A)^c]^T A^T = A[(A^T A)^T]^c A^T = A(A^T A)^c A^T$. Idempotency: $A(A^T A)^c A^T A(A^T A)^c A^T = [A(A^T A)^c A^T A](A^T A)^c A^T = A(A^T A)^c A^T$.
35 Solving the normal equations Now that we have developed the machinery, we can try to solve the normal equations $X^T X b = X^T y$. First, we must make sure that they have a solution! Theorem The system $Ax = g$ is consistent if and only if the rank of $[A \mid g]$ is equal to the rank of $A$.
36 Proof. ($\Leftarrow$) Assume that $r([A \mid g]) = r(A)$. Because adding $g$ does not add to the rank, this must mean that $g$ is a linear combination of the columns of $A$. Therefore there exist constants $x_1, x_2, \ldots, x_p$ so that $x_1 a_1 + x_2 a_2 + \cdots + x_p a_p = g$, where $a_i$ is the $i$th column of $A$. But if we put this into matrix notation and set $x = [x_1\ \ x_2\ \ \cdots\ \ x_p]^T$, then this is exactly the system $Ax = g$. Therefore the system is consistent.
37 Theorem Let $y = X\beta + \varepsilon$ be a linear model. Then the normal equations $X^T X b = X^T y$ are consistent. Proof. It is obvious that $r(X^T X) \le r([X^T X \mid X^T y])$, as adding a column cannot decrease the number of linearly independent columns.
38 However, using rank properties from earlier on, $r([X^T X \mid X^T y]) = r(X^T [X \mid y]) \le r(X^T) = r(X^T X)$. Therefore $r([X^T X \mid X^T y]) = r(X^T X)$ and the previous theorem shows that the normal equations are consistent.
40 Now that we know the normal equations have a solution, how do we find it? We use conditional inverses. Theorem Let $Ax = g$ be a consistent system. Then $A^c g$ is a solution to the system, where $A^c$ is any conditional inverse for $A$.
41 Proof. Since the system is consistent, there is some $x$ with $Ax = g$, and so $A A^c g = A A^c A x = Ax = g$. Therefore, $A^c g$ solves the system. From this theorem, we see that $b = (X^T X)^c X^T y$ solves the normal equations, for any conditional inverse. However, in the less than full rank model, different conditional inverses may result in different solutions.
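42 Example. Suppose that for a particular linear model, we derive
$$X^T X = \begin{bmatrix}2&1&1\\1&1&0\\1&0&1\end{bmatrix}, \qquad X^T y = \begin{bmatrix}14\\6\\8\end{bmatrix}.$$
This could potentially arise from a two-class classification model with one sample from each class:
$$X = \begin{bmatrix}1&1&0\\1&0&1\end{bmatrix}.$$
43 The normal equations are then
$$\begin{bmatrix}2&1&1\\1&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}b_0\\b_1\\b_2\end{bmatrix} = \begin{bmatrix}14\\6\\8\end{bmatrix}.$$
Since the first column of $X^T X$ is the sum of the last two, $X^T X$ is not of full rank. However, since the first two columns are not multiples of each other, $r(X^T X) = 2$.
44 To find a conditional inverse of $X^T X$, we apply the algorithm, using the nonsingular minor $\begin{bmatrix}2&1\\1&1\end{bmatrix}$. This gives us
$$(X^T X)^c = \begin{bmatrix}1&-1&0\\-1&2&0\\0&0&0\end{bmatrix}$$
and therefore
$$b = (X^T X)^c X^T y = \begin{bmatrix}8\\-2\\0\end{bmatrix}.$$
45 However, using the minor $\begin{bmatrix}1&0\\0&1\end{bmatrix}$ gives the conditional inverse
$$(X^T X)^c = \begin{bmatrix}0&0&0\\0&1&0\\0&0&1\end{bmatrix},$$
which gives the solution
$$b = \begin{bmatrix}0\\6\\8\end{bmatrix}.$$
Both these solutions solve the normal equations, and are equally valid! This is the problem with the less than full rank model.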
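The point that two different conditional inverses both solve the normal equations can be verified directly. This sketch assumes NumPy and a two-class model with one observation per class; the responses `y = (6, 8)` are taken as hypothetical values for the illustration.

```python
import numpy as np

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
y = np.array([6.0, 8.0])                 # hypothetical responses
XtX, Xty = X.T @ X, X.T @ y

# Conditional inverse from the upper left 2 x 2 minor of XtX
C1 = np.array([[1.0, -1.0, 0.0],
               [-1.0, 2.0, 0.0],
               [0.0, 0.0, 0.0]])
# Conditional inverse from the lower right 2 x 2 minor of XtX
C2 = np.diag([0.0, 1.0, 1.0])

b1 = C1 @ Xty                            # one solution
b2 = C2 @ Xty                            # a different solution
```

Both `b1` and `b2` satisfy `XtX @ b == Xty`, although they disagree in every coordinate.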
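46 Example. Consider the earlier carbon removal example. With three observations per treatment we have
$$X^T X = \begin{bmatrix}9&3&3&3\\3&3&0&0\\3&0&3&0\\3&0&0&3\end{bmatrix},$$
so a conditional inverse (using the lower right $3 \times 3$ minor) is
$$(X^T X)^c = \begin{bmatrix}0&0&0&0\\0&\frac13&0&0\\0&0&\frac13&0\\0&0&0&\frac13\end{bmatrix}.$$
47 We can also calculate $X^T y = \big[\sum_{i,j} y_{ij}\ \ \sum_j y_{1j}\ \ \sum_j y_{2j}\ \ \sum_j y_{3j}\big]^T$. Using the conditional inverse gives us a solution (but not the solution) to the normal equations: $b = (X^T X)^c X^T y = [0\ \ \bar y_1\ \ \bar y_2\ \ \bar y_3]^T$. [Numerical values not preserved in this transcription.]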
48 In fact, if the model is less than full rank, the normal equations have an infinite number of solutions. Theorem Let $Ax = g$ be a consistent system. Then $x = A^c g + (I - A^c A)z$ solves the system, where $z$ is an arbitrary $p \times 1$ vector.
49 Proof. We know that $A^c g$ solves the system, so $Ax = A[A^c g + (I - A^c A)z] = A A^c g + (A - A A^c A)z = g + (A - A)z = g$. For the normal equations, this means that any vector of the form $b = (X^T X)^c X^T y + [I - (X^T X)^c X^T X]z$ also satisfies the equations.
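A numerical sketch of this family of solutions, assuming NumPy: for a two-class design with hypothetical responses, every randomly drawn `z` yields another solution of the normal equations. The seed, the data and the choice of conditional inverse are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed for reproducibility

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
y = np.array([6.0, 8.0])                 # hypothetical responses
XtX, Xty = X.T @ X, X.T @ y

C = np.diag([0.0, 1.0, 1.0])             # one conditional inverse of XtX
particular = C @ Xty                     # the solution C X^T y
null_part = np.eye(3) - C @ XtX          # maps any z into the null space of XtX

solutions = [particular + null_part @ rng.normal(size=3) for _ in range(5)]
residuals = [np.abs(XtX @ b - Xty).max() for b in solutions]
```

Each entry of `residuals` is zero up to floating point error, so all five vectors solve the same normal equations.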
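50 Example. In the two-class example above, one solution to the normal equations was $[8\ \ {-2}\ \ 0]^T$. Using the first conditional inverse found,
$$(X^T X)^c X^T X = \begin{bmatrix}1&-1&0\\-1&2&0\\0&0&0\end{bmatrix}\begin{bmatrix}2&1&1\\1&1&0\\1&0&1\end{bmatrix} = \begin{bmatrix}1&0&1\\0&1&-1\\0&0&0\end{bmatrix}.$$
Let $z = [z_1\ \ z_2\ \ z_3]^T$ be chosen arbitrarily. Then another solution to the normal equations is
$$b = \begin{bmatrix}8\\-2\\0\end{bmatrix} + \begin{bmatrix}0&0&-1\\0&0&1\\0&0&1\end{bmatrix}\begin{bmatrix}z_1\\z_2\\z_3\end{bmatrix} = \begin{bmatrix}8 - z_3\\-2 + z_3\\z_3\end{bmatrix}.$$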
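51 Example. In the carbon removal example, the conditional inverse $(X^T X)^c = \mathrm{diag}(0, \tfrac13, \tfrac13, \tfrac13)$ gives us
$$(X^T X)^c X^T X = \begin{bmatrix}0&0&0&0\\1&1&0&0\\1&0&1&0\\1&0&0&1\end{bmatrix},$$
and so another solution to the normal equations is $b = [z_1\ \ \bar y_1 - z_1\ \ \bar y_2 - z_1\ \ \bar y_3 - z_1]^T$ for any choice of $z_1$. [Numerical values not preserved in this transcription.]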
52 The converse of the above theorem is also true: all solutions to the system can be expressed in this form. Theorem Let $Ax = g$ be a consistent system and let $x$ be any solution to the system. Then $x = A^c g + (I - A^c A)z$, where $z = x$.
53 Proof. Since $x$ solves the system, $A^c g + (I - A^c A)z = A^c g + (I - A^c A)x = A^c g + x - A^c A x = A^c g + x - A^c g = x$. For the normal equations, this means that any solution can be expressed as $b = (X^T X)^c X^T y + [I - (X^T X)^c X^T X]z$ for any conditional inverse $(X^T X)^c$.
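54 Example. In the two-class example, we found the solution $b_1 = [8\ \ {-2}\ \ 0]^T$ using our original conditional inverse. But we also noted that the conditional inverse
$$(X^T X)^c_2 = \begin{bmatrix}0&0&0\\0&1&0\\0&0&1\end{bmatrix}$$
produces the solution $b_2 = [0\ \ 6\ \ 8]^T$.
55 Using the theorem, the first solution can be written in terms of the second solution:
$$b_1 = (X^T X)^c_2 X^T y + (I - (X^T X)^c_2 X^T X) b_1 = \begin{bmatrix}0\\6\\8\end{bmatrix} + \begin{bmatrix}1&0&0\\-1&0&0\\-1&0&0\end{bmatrix}\begin{bmatrix}8\\-2\\0\end{bmatrix} = \begin{bmatrix}0\\6\\8\end{bmatrix} + \begin{bmatrix}8\\-8\\-8\end{bmatrix} = \begin{bmatrix}8\\-2\\0\end{bmatrix}.$$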
57 Estimability Now we know how to solve the normal equations; furthermore, we know how to find all solutions for them. But which solution do we want? Which one is the best? All of them! They are all equally valid. This means that we can never estimate the parameters.
59 However, not all hope is lost. There is at least one thing which is not arbitrary. It is the value of the response variable, $y$. No matter what the parameters are estimated to be, $y$ will never change! In fact, there exist linear combinations of the parameters that will always be estimated at the same value no matter what solution we use for the normal equations. We call these linear combinations estimable.
60 As we might guess, combinations which are estimable can be linked to the response variable in some way. Formally: Definition Let $y = X\beta + \varepsilon$ be a linear model. A function $t^T\beta$ is said to be estimable if there exists a vector $c$ such that $E[c^T y] = t^T\beta$. Another way of looking at it is that there must exist a linear unbiased estimator for $t^T\beta$.
61 We look at some equivalent conditions to estimability. Theorem Let $y = X\beta + \varepsilon$ be a linear model where $\varepsilon$ has mean $0$ and variance $\sigma^2 I$. Then $t^T\beta$ is estimable if and only if there is a solution to the linear system $X^T X z = t$. Proof. ($\Leftarrow$) Let $z$ be a solution to $X^T X z = t$ and put $c = Xz$. Then $E[c^T y] = E[z^T X^T y] = z^T X^T E[y] = z^T X^T X\beta = (X^T X z)^T\beta = t^T\beta$, so $t^T\beta$ is estimable.
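62 Example. Consider our two-class example. As a reminder, we had
$$X^T X = \begin{bmatrix}2&1&1\\1&1&0\\1&0&1\end{bmatrix}.$$
Consider the combination of parameters $\beta_1 - \beta_2$. This corresponds to $t^T\beta$ where $t = [0\ \ 1\ \ {-1}]^T$.
63 Now we look for a solution to the system
$$\begin{bmatrix}2&1&1\\1&1&0\\1&0&1\end{bmatrix}\begin{bmatrix}z_1\\z_2\\z_3\end{bmatrix} = \begin{bmatrix}0\\1\\-1\end{bmatrix}.$$
A little thought shows that this system has the solution $z_1 = 0$, $z_2 = 1$, $z_3 = -1$, so $\beta_1 - \beta_2$ is estimable.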
64 Theorem Let $y = X\beta + \varepsilon$ be a linear model where $\varepsilon$ has mean $0$ and variance $\sigma^2 I$. Then $t^T\beta$ is estimable if and only if $t^T (X^T X)^c X^T X = t^T$ for any conditional inverse of $X^T X$. Proof. ($\Leftarrow$) Assume that $t^T (X^T X)^c X^T X = t^T$. Taking transposes, $X^T X ((X^T X)^c)^T t = X^T X (X^T X)^c t = t$. This means that $(X^T X)^c t$ is a solution to the system $X^T X z = t$, and the previous theorem implies that $t^T\beta$ is estimable.
65 ($\Rightarrow$) Suppose that $t^T\beta$ is estimable. By the previous theorem, there exists a solution to the system $X^T X z = t$. Using the conditional inverse, we know that a solution is $z = (X^T X)^c t$. In other words, $X^T X (X^T X)^c t = t$, and by taking transposes as above, we see that this gives $t^T (X^T X)^c X^T X = t^T$.
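66 Example. Consider the previous example. Let us take the conditional inverse
$$(X^T X)^c = \begin{bmatrix}1&-1&0\\-1&2&0\\0&0&0\end{bmatrix}$$
and consider the same quantity, $\beta_1 - \beta_2$, which corresponds to $t = [0\ \ 1\ \ {-1}]^T$. Then
$$t^T (X^T X)^c (X^T X) = [0\ \ 1\ \ {-1}]\begin{bmatrix}1&0&1\\0&1&-1\\0&0&0\end{bmatrix} = [0\ \ 1\ \ {-1}] = t^T,$$
so again we see that $\beta_1 - \beta_2$ is estimable.
67 On the other hand, suppose we take $t = [1\ \ 0\ \ 0]^T$ so that $t^T\beta = \beta_0$. Then we have
$$t^T (X^T X)^c (X^T X) = [1\ \ 0\ \ 0]\begin{bmatrix}1&0&1\\0&1&-1\\0&0&0\end{bmatrix} = [1\ \ 0\ \ 1] \ne t^T,$$
so $\beta_0$ is not estimable.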
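The estimability criterion $t^T (X^T X)^c X^T X = t^T$ is straightforward to test numerically. The sketch below assumes NumPy and uses the Moore-Penrose pseudoinverse as the conditional inverse (it satisfies $A A^c A = A$, so it is one valid choice, not the one used on the slides); the function name `is_estimable` is hypothetical.

```python
import numpy as np

def is_estimable(t, X, atol=1e-10):
    """Check t^T (X^T X)^c X^T X == t^T using the pseudoinverse as (X^T X)^c."""
    t = np.asarray(t, dtype=float)
    XtX = X.T @ X
    C = np.linalg.pinv(XtX)              # one particular conditional inverse
    return bool(np.allclose(t @ C @ XtX, t, atol=atol))

# Two-class model with one observation per class
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])

contrast_ok = is_estimable([0, 1, -1], X)    # beta_1 - beta_2
intercept_ok = is_estimable([1, 0, 0], X)    # beta_0 alone
mean_ok = is_estimable([1, 1, 0], X)         # beta_0 + beta_1
```

The contrast and the class mean pass the check, while the intercept on its own does not.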
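68 Example. We return to the carbon removal example. We are interested in seeing if the various carbon removal treatments have (significantly) different means. To test this, we look at the quantities $\tau_1 - \tau_2$ and $\tau_1 - \tau_3$. If both of these are $0$, then the treatments are the same. With the conditional inverse $(X^T X)^c = \mathrm{diag}(0, \tfrac13, \tfrac13, \tfrac13)$ we have
$$(X^T X)^c X^T X = \begin{bmatrix}0&0&0&0\\1&1&0&0\\1&0&1&0\\1&0&0&1\end{bmatrix}$$
and the coefficient vectors $t_1 = [0\ \ 1\ \ {-1}\ \ 0]^T$, $t_2 = [0\ \ 1\ \ 0\ \ {-1}]^T$.
69 Then $t_1^T (X^T X)^c X^T X = [0\ \ 1\ \ {-1}\ \ 0] = t_1^T$, so $t_1^T\beta = \tau_1 - \tau_2$ is estimable. Similarly, $t_2^T (X^T X)^c X^T X = [0\ \ 1\ \ 0\ \ {-1}] = t_2^T$, so $t_2^T\beta = \tau_1 - \tau_3$ is also estimable.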
70 Using our definition of estimable, we can prove formally that no matter what conditional inverse we use, we will still generate the same estimate for an estimable quantity. First we will state a supporting lemma. Lemma Let $y = X\beta + \varepsilon$ where $\varepsilon$ has mean $0$ and variance $\sigma^2 I$. The best linear unbiased estimator for any estimable function $t^T\beta$ is $z^T X^T y$, where $z$ is a solution to the system $X^T X z = t$.
71 Theorem (A Gauss-Markov Theorem) Let $y = X\beta + \varepsilon$ be a linear model where $\varepsilon$ has mean $0$ and variance $\sigma^2 I$. Suppose $t^T\beta$ is estimable. Then any solution to the system $X^T X z = t$ gives the same estimate for $t^T\beta$. Furthermore, this estimate is $t^T b$, where $b$ is any solution to the normal equations. Lastly, this estimate is BLUE. Proof. Suppose we have two solutions to the system $X^T X z = t$, called $z_0$ and $z_1$. Let $b$ be any solution to the normal equations, which means that $X^T X b = X^T y$.
72 From the previous lemma, the best linear unbiased estimator of $t^T\beta$ is $z_0^T X^T y = z_0^T X^T X b = (X^T X z_0)^T b = t^T b$. Similarly, $z_1^T X^T y = t^T b = z_0^T X^T y$. This shows that the best linear unbiased estimator is unique, and equal to $t^T b$.
73 Example. Let's look again at the two-class example. We have shown that $\beta_1 - \beta_2$ is estimable. We also know that solutions to the normal equations include $b = [8\ \ {-2}\ \ 0]^T$ and $b = [0\ \ 6\ \ 8]^T$. If we want to estimate $\beta_1 - \beta_2$, we would use $t^T b = [0\ \ 1\ \ {-1}]\,[8\ \ {-2}\ \ 0]^T = -2$.
74 However, from the above theorem, we can also use $t^T b = [0\ \ 1\ \ {-1}]\,[0\ \ 6\ \ 8]^T = -2$. It is not a coincidence that this estimate is the same as the previous one! The theorem shows that any solution to the normal equations, using any conditional inverse, will produce exactly the same estimate. In other words, the estimator is unique.
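The invariance can be contrasted with what happens for a non-estimable function. In this sketch (NumPy assumed, two-class model with hypothetical responses `y = (6, 8)`), two solutions of the normal equations agree on the estimable contrast but disagree on the intercept.

```python
import numpy as np

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
y = np.array([6.0, 8.0])                 # hypothetical responses
XtX, Xty = X.T @ X, X.T @ y

# Two solutions from two different conditional inverses of XtX
b1 = np.array([[1.0, -1.0, 0.0],
               [-1.0, 2.0, 0.0],
               [0.0, 0.0, 0.0]]) @ Xty
b2 = np.diag([0.0, 1.0, 1.0]) @ Xty

t_contrast = np.array([0.0, 1.0, -1.0])  # estimable: beta_1 - beta_2
t_intercept = np.array([1.0, 0.0, 0.0])  # not estimable: beta_0

contrast_estimates = (t_contrast @ b1, t_contrast @ b2)
intercept_values = (t_intercept @ b1, t_intercept @ b2)
```

The two contrast estimates coincide, whereas the two intercept values differ, which is exactly why only estimable functions are meaningful to report.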
75 Example. We look at the carbon removal example. We have shown that $\tau_1 - \tau_2$ and $\tau_1 - \tau_3$ are estimable. We estimate them by $t_1^T b = [0\ \ 1\ \ {-1}\ \ 0]\,b = -4.3$ and $t_2^T b = [0\ \ 1\ \ 0\ \ {-1}]\,b = 8.2$ respectively. The Gauss-Markov theorem shows that no matter what conditional inverse we use to calculate $b$, these estimates will always remain the same.
76 Estimability theorems Now that we have defined estimability, we would like to know which quantities are estimable and which are not (so that we can decide what we want to find out before we start the study!). The first quantities which are definitely estimable are the elements of $E[y] = X\beta$; this is how we defined estimability, after all! Theorem Let $y = X\beta + \varepsilon$ be a linear model. Then the elements of $X\beta$ are estimable.
77 Proof. We know that $E[y] = X\beta$. Therefore, taking $c$ to be each of the unit vectors $[1\ \ 0\ \ \cdots\ \ 0]^T$, $[0\ \ 1\ \ \cdots\ \ 0]^T$, ..., $[0\ \ 0\ \ \cdots\ \ 1]^T$ in turn, $E[c^T y] = c^T X\beta$ is an estimable function. But these are the elements of $X\beta$, so the elements of $X\beta$ are estimable.
78 Example. Consider the carbon removal example. We have the matrices
$$X = \begin{bmatrix}1&1&0&0\\1&1&0&0\\1&1&0&0\\1&0&1&0\\1&0&1&0\\1&0&1&0\\1&0&0&1\\1&0&0&1\\1&0&0&1\end{bmatrix}, \qquad \beta = \begin{bmatrix}\mu\\ \tau_1\\ \tau_2\\ \tau_3\end{bmatrix}.$$
We showed earlier that we cannot estimate the parameter vector $\beta$.
79 However, the real quantities of interest in this model are the mean responses from the three treatments. These are $\mu + \tau_1$, $\mu + \tau_2$, and $\mu + \tau_3$. We can see that $\mu + \tau_1 = [1\ \ 1\ \ 0\ \ 0]\beta$, $\mu + \tau_2 = [1\ \ 0\ \ 1\ \ 0]\beta$, $\mu + \tau_3 = [1\ \ 0\ \ 0\ \ 1]\beta$, and each of these is an element of $X\beta$. Therefore, they are estimable. We would estimate them by replacing $\beta$ with $b$, where $b$ is any solution to the normal equations (it does not matter which). In fact, in a classification model with any $k$, $\mu + \tau_i$ is always estimable.
81 We know that elements of $X\beta$ are estimable; what else? If we combine estimable quantities (in a linear manner), the result should be estimable. Theorem Let $t_1^T\beta, t_2^T\beta, \ldots, t_k^T\beta$ all be estimable functions, and let $z = a_1 t_1^T\beta + a_2 t_2^T\beta + \cdots + a_k t_k^T\beta$. Then $z$ is estimable, and the best linear unbiased estimator for $z$ is $a_1 t_1^T b + a_2 t_2^T b + \cdots + a_k t_k^T b$.
82 Proof. By definition, $z = (a_1 t_1 + a_2 t_2 + \cdots + a_k t_k)^T\beta$. Since all the functions are estimable, $(a_1 t_1 + a_2 t_2 + \cdots + a_k t_k)^T (X^T X)^c X^T X = a_1 t_1^T (X^T X)^c X^T X + a_2 t_2^T (X^T X)^c X^T X + \cdots + a_k t_k^T (X^T X)^c X^T X = (a_1 t_1 + a_2 t_2 + \cdots + a_k t_k)^T$.
83 Therefore $z$ is estimable, with estimator $(a_1 t_1 + a_2 t_2 + \cdots + a_k t_k)^T b$. Of particular interest in many studies is the way different populations compare against each other. To attach a numerical value to these comparisons, we form linear combinations $a_1\tau_1 + a_2\tau_2 + \cdots + a_k\tau_k$, where $\sum_{i=1}^k a_i = 0$. These treatment contrasts wipe out the effect of the overall mean response, so as to get a better picture of the differences between populations.
84 In a one-way classification model, any treatment contrast is estimable. We show this by noting that if $z = a_1\tau_1 + a_2\tau_2 + \cdots + a_k\tau_k$ is a treatment contrast, then
$$z = \Big(\sum_{i=1}^k a_i\Big)\mu + a_1\tau_1 + a_2\tau_2 + \cdots + a_k\tau_k = a_1(\mu + \tau_1) + a_2(\mu + \tau_2) + \cdots + a_k(\mu + \tau_k)$$
is a linear combination of the estimable functions $\mu + \tau_i$, and is therefore itself estimable.
85 Of particular interest among treatment contrasts is the contrast of the form $\tau_i - \tau_j$, for some $i \ne j$. This is because $\tau_i - \tau_j = (\mu + \tau_i) - (\mu + \tau_j)$ is the difference between the mean response in population $i$ and the mean response in population $j$. If we write $\bar y_i$ for the sample mean from population $i$, then we would expect to estimate this contrast by the corresponding difference in sample means, $\bar y_i - \bar y_j$. We can show using the theory we have developed that this is in fact the case.
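86 Example. We do this for $k = 3$ and the contrast $\tau_1 - \tau_2$. Our matrices are
$$X = \begin{bmatrix}1_{n_1} & 1_{n_1} & 0 & 0\\ 1_{n_2} & 0 & 1_{n_2} & 0\\ 1_{n_3} & 0 & 0 & 1_{n_3}\end{bmatrix}, \qquad y = \begin{bmatrix}y_{11}\\ \vdots\\ y_{1n_1}\\ y_{21}\\ \vdots\\ y_{2n_2}\\ y_{31}\\ \vdots\\ y_{3n_3}\end{bmatrix}, \qquad \beta = \begin{bmatrix}\mu\\ \tau_1\\ \tau_2\\ \tau_3\end{bmatrix},$$
where $1_{n_i}$ denotes a column of $n_i$ ones.
87 Direct multiplication gives
$$X^T y = \begin{bmatrix}\sum_{i=1}^3\sum_{j=1}^{n_i} y_{ij}\\ \sum_{j=1}^{n_1} y_{1j}\\ \sum_{j=1}^{n_2} y_{2j}\\ \sum_{j=1}^{n_3} y_{3j}\end{bmatrix}, \qquad X^T X = \begin{bmatrix}n & n_1 & n_2 & n_3\\ n_1 & n_1 & 0 & 0\\ n_2 & 0 & n_2 & 0\\ n_3 & 0 & 0 & n_3\end{bmatrix}.$$
We can use the conditional inverse algorithm on the lower right corner of $X^T X$ to get
$$(X^T X)^c = \begin{bmatrix}0&0&0&0\\ 0&\frac{1}{n_1}&0&0\\ 0&0&\frac{1}{n_2}&0\\ 0&0&0&\frac{1}{n_3}\end{bmatrix}.$$
88 Therefore a solution to the normal equations is $b = (X^T X)^c X^T y = [0\ \ \bar y_1\ \ \bar y_2\ \ \bar y_3]^T$. We can write $\tau_1 - \tau_2$ as $[0\ \ 1\ \ {-1}\ \ 0]\beta$, so the best linear unbiased estimator for $\tau_1 - \tau_2$ is $[0\ \ 1\ \ {-1}\ \ 0]\,[0\ \ \bar y_1\ \ \bar y_2\ \ \bar y_3]^T = \bar y_1 - \bar y_2$. If we took any other conditional inverse, we would get the same result.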
89 Example. In the carbon removal example, we showed that $\tau_1 - \tau_2$ and $\tau_1 - \tau_3$ are estimable. Both of these are contrasts, so we can say straight off that they are estimable (without doing the calculations).
90 Estimating $\sigma^2$ in the less than full rank model In the full rank model, we estimated $\sigma^2$ by $s^2 = \frac{SS_{Res}}{n - p}$, where $n$ is the sample size, $p$ is the number of parameters, and $SS_{Res}$ is the sum of squares of the residuals: $SS_{Res} = (y - Xb)^T (y - Xb) = y^T [I - X(X^T X)^{-1} X^T] y$. We would like to find a corresponding expression for the less than full rank model, but obviously it will not be the same (since $(X^T X)^{-1}$ does not exist).
91 We still define the residual sum of squares as $SS_{Res} = (y - Xb)^T (y - Xb)$, where $b$ is any solution to the normal equations. The important thing is that although $b$ can vary, $Xb$ will not, because the elements of $X\beta$ are estimable. Therefore $SS_{Res}$ is invariant to the choice of $b$. Next, we find the equivalent expression for $SS_{Res}$. Theorem $SS_{Res} = y^T [I - X(X^T X)^c X^T] y$.
92 Proof. Let $b = (X^T X)^c X^T y$ and recall that $X(X^T X)^c X^T X = X$. Then
$$\begin{aligned} SS_{Res} &= (y^T - b^T X^T)(y - Xb) = y^T y - 2 y^T X b + b^T X^T X b\\ &= y^T y - 2 y^T X (X^T X)^c X^T y + y^T X (X^T X)^c X^T X (X^T X)^c X^T y\\ &= y^T y - 2 y^T X (X^T X)^c X^T y + y^T X (X^T X)^c X^T y\\ &= y^T [I - X(X^T X)^c X^T] y.\end{aligned}$$
93 How do we now find an estimator for $\sigma^2$? Using the quadratic forms theory that we developed earlier, we know that
$$\begin{aligned} E[SS_{Res}] &= E[y^T (I - X(X^T X)^c X^T) y]\\ &= \mathrm{tr}(I - X(X^T X)^c X^T)\sigma^2 + (X\beta)^T (I - X(X^T X)^c X^T) X\beta\\ &= \mathrm{tr}(I - X(X^T X)^c X^T)\sigma^2 + \beta^T X^T X\beta - \beta^T X^T X (X^T X)^c X^T X\beta\\ &= \mathrm{tr}(I - X(X^T X)^c X^T)\sigma^2 + \beta^T X^T X\beta - \beta^T X^T X\beta\\ &= \mathrm{tr}(I - X(X^T X)^c X^T)\sigma^2.\end{aligned}$$
94 It can be shown that $I - X(X^T X)^c X^T$ is symmetric and idempotent, so its trace equals its rank and $E[SS_{Res}] = r(I - X(X^T X)^c X^T)\sigma^2 = (n - r)\sigma^2$, where $r = r(X)$, the rank of $X$. This gives us the following theorem. Theorem Let $y = X\beta + \varepsilon$ be a linear model, where $X$ has rank $r$ and $\varepsilon$ has mean $0$ and variance $\sigma^2 I$. Then an unbiased estimator for $\sigma^2$ is $\frac{SS_{Res}}{n - r}$.
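The estimator $SS_{Res}/(n - r)$ and its invariance to the choice of conditional inverse can be sketched as follows. NumPy is assumed, the data are made-up values for a one-way layout, and the pseudoinverse is used as a second conditional inverse for comparison.

```python
import numpy as np

# Hypothetical one-way data (illustrative values only)
groups = [np.array([4.1, 3.9, 4.4]),
          np.array([5.0, 5.3]),
          np.array([2.8, 3.1, 3.0, 2.7])]
y = np.concatenate(groups)
n = len(y)

# Less than full rank design: intercept plus one indicator per group
X = np.zeros((n, 4))
X[:, 0] = 1.0
start = 0
for i, g in enumerate(groups):
    X[start:start + len(g), i + 1] = 1.0
    start += len(g)

XtX = X.T @ X
r = np.linalg.matrix_rank(X)

def ss_res(C):
    """y^T [I - X C X^T] y for a conditional inverse C of X^T X."""
    return y @ (np.eye(n) - X @ C @ X.T) @ y

C_diag = np.diag([0.0] + [1.0 / len(g) for g in groups])
C_pinv = np.linalg.pinv(XtX)

s2_diag = ss_res(C_diag) / (n - r)
s2_pinv = ss_res(C_pinv) / (n - r)
```

Both conditional inverses give the same `s2`, and with `r = 3` it matches the pooled within-group variance.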
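95 Example. We return to the carbon removal example. The fitted values are $Xb$, whose entries are the treatment sample means $\bar y_1, \bar y_2, \bar y_3$, each repeated three times. [Numerical values not preserved in this transcription.]
96 So the residuals are $y - Xb$, with entries $y_{ij} - \bar y_i$. [Numerical values not preserved in this transcription.]
97 This means $SS_{Res} = (y - Xb)^T (y - Xb) = 1.3$. The rank of $X$ is easily seen to be 3, so $s^2 = \frac{SS_{Res}}{n - r} = \frac{1.3}{9 - 3} = 0.217$.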
98 Interval estimation in the less than full rank model As for the full rank model, we have estimated what we could estimate. The next step is to try and find confidence intervals for our estimates. So far, we have not assumed that the error vector ε is normally distributed. However, to find confidence intervals, we need some idea of the distribution of the variables, so we make that assumption now.
99 Recall that in the full rank model, we generated confidence intervals by finding a $t$-distributed quantity, which was created by dividing a normal variable by the square root of an independent $\chi^2$ variable divided by its degrees of freedom. The $\chi^2$ variable was $\frac{SS_{Res}}{\sigma^2}$, which had $n - p$ degrees of freedom. The $\sigma^2$ term was not known, but cancelled out another $\sigma^2$ term in the numerator to leave us with something that we could calculate.
100 We can do pretty much the same thing for the less than full rank model. Theorem Let $y = X\beta + \varepsilon$ be a linear model, where $\varepsilon$ is a normal random vector with mean $0$ and variance $\sigma^2 I$. Then $\frac{(n - r)s^2}{\sigma^2} = \frac{SS_{Res}}{\sigma^2}$ has a $\chi^2$ distribution with $n - r$ degrees of freedom. The proof of this theorem is very similar to that for the full rank case, so we will not repeat it.
101 The steps to derive a confidence interval are very similar to those for the full rank case, but with two small differences. Firstly, we can only find confidence intervals for quantities that are estimable! Secondly, we replace the inverse $(X^T X)^{-1}$ by the conditional inverse $(X^T X)^c$. All other steps are the same.
102 This gives us the confidence interval for the (estimable) quantity $t^T\beta$, using a $t$ distribution with $n - r$ degrees of freedom:
$$t^T b \pm t_{\alpha/2}\, s\, \sqrt{t^T (X^T X)^c t}.$$
This formula can also be used to find confidence intervals for the individual parameters, providing that they are estimable.
103 Example. We return again to the carbon removal example. Suppose we want to find a 95% confidence interval for $\tau_1 - \tau_2$. We have $t = [0\ \ 1\ \ {-1}\ \ 0]^T$, $s^2 = 0.217$, $t_{0.025} = 2.45$ using $n - r = 9 - 3 = 6$ degrees of freedom. We also use the conditional inverse $(X^T X)^c = \mathrm{diag}(0, \tfrac13, \tfrac13, \tfrac13)$.
104 This gives the confidence interval
$$t^T b \pm t_{\alpha/2}\, s\, \sqrt{t^T (X^T X)^c t} = -4.3 \pm 2.45\sqrt{0.217}\sqrt{\tfrac13 + \tfrac13} = -4.3 \pm 0.93 = (-5.23, -3.37).$$
In particular, we can say with 95% confidence that the first carbon removal treatment is not as effective as the second.
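The interval arithmetic can be reproduced in a few lines. This sketch assumes NumPy, takes the point estimate as $-4.3$ and $s^2 = 0.217$ from the example, assumes the conditional inverse $\mathrm{diag}(0, 1/3, 1/3, 1/3)$ (three observations per treatment), and uses the tabulated critical value $2.447$ for a $t$ distribution with 6 degrees of freedom rather than computing it.

```python
import math
import numpy as np

t = np.array([0.0, 1.0, -1.0, 0.0])        # coefficients of tau_1 - tau_2
C = np.diag([0.0, 1 / 3, 1 / 3, 1 / 3])    # assumed conditional inverse of X^T X

estimate = -4.3                            # t^T b, taken from the example
s2 = 0.217                                 # SS_Res / (n - r) = 1.3 / 6
t_crit = 2.447                             # tabulated t_{0.025} with 6 d.f.

half_width = t_crit * math.sqrt(s2) * math.sqrt(t @ C @ t)
lower, upper = estimate - half_width, estimate + half_width
```

Rounded to two decimals this gives a half width of 0.93 and the interval $(-5.23, -3.37)$.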
105 Example. We showed earlier that in a general one-way classification model with $k = 3$, the contrast $\tau_1 - \tau_2$ can be estimated by the difference in the respective sample means, $\bar y_1 - \bar y_2$. We also had
$$t = \begin{bmatrix}0\\1\\-1\\0\end{bmatrix}, \qquad (X^T X)^c = \begin{bmatrix}0&0&0&0\\ 0&\frac{1}{n_1}&0&0\\ 0&0&\frac{1}{n_2}&0\\ 0&0&0&\frac{1}{n_3}\end{bmatrix}.$$
106 Therefore we have $t^T (X^T X)^c t = \frac{1}{n_1} + \frac{1}{n_2}$ and the confidence interval is
$$\bar y_1 - \bar y_2 \pm t_{\alpha/2}\, s\, \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$
You have probably seen this formula before! The linear models framework allows us to derive it from first principles.
Chapter 7 Matrices Definition An m n matrix is an array of numbers set out in m rows and n columns Examples (i ( 1 1 5 2 0 6 has 2 rows and 3 columns and so it is a 2 3 matrix (ii 1 0 7 1 2 3 3 1 is a
More informationLS.6 Solution Matrices
LS.6 Solution Matrices In the literature, solutions to linear systems often are expressed using square matrices rather than vectors. You need to get used to the terminology. As before, we state the definitions
More informationNotes on Determinant
ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without
More informationMath 312 Homework 1 Solutions
Math 31 Homework 1 Solutions Last modified: July 15, 01 This homework is due on Thursday, July 1th, 01 at 1:10pm Please turn it in during class, or in my mailbox in the main math office (next to 4W1) Please
More informationSystems of Linear Equations
Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and
More informationMatrix Algebra. Some Basic Matrix Laws. Before reading the text or the following notes glance at the following list of basic matrix algebra laws.
Matrix Algebra A. Doerr Before reading the text or the following notes glance at the following list of basic matrix algebra laws. Some Basic Matrix Laws Assume the orders of the matrices are such that
More information8 Square matrices continued: Determinants
8 Square matrices continued: Determinants 8. Introduction Determinants give us important information about square matrices, and, as we ll soon see, are essential for the computation of eigenvalues. You
More informationT ( a i x i ) = a i T (x i ).
Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)
More informationSimilar matrices and Jordan form
Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive
More informationMatrix Differentiation
1 Introduction Matrix Differentiation ( and some other stuff ) Randal J. Barnes Department of Civil Engineering, University of Minnesota Minneapolis, Minnesota, USA Throughout this presentation I have
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that