NOTES ON LINEAR TRANSFORMATIONS

Definition 1. Let $V$ and $W$ be vector spaces. A function $T : V \to W$ is a linear transformation from $V$ to $W$ if the following two properties hold.

(i) $T(v + v') = T(v) + T(v')$ for all $v, v' \in V$.
(ii) $T(av) = aT(v)$ for all $v \in V$ and all scalars $a \in F$.

It is important to note that the addition of vectors $v + v'$ in (i) takes place in $V$, whereas the addition $T(v) + T(v')$ takes place in $W$. Similarly, $av$ in (ii) is scalar multiplication in $V$, whereas $aT(v)$ is scalar multiplication in $W$. We casually refer to such a function $T$ as a structure-preserving map, because it respects the two essential operations that give $V$ its vector space structure, namely its addition and scalar multiplication.

Example 2. Let $V$ be any vector space and define $\mathrm{id}_V : V \to V$ to be the function given by $\mathrm{id}_V(v) = v$ for all $v \in V$. This is called the identity function on $V$. To see that $\mathrm{id}_V$ is a linear transformation, observe for all $v, v' \in V$ that we have
\[ \mathrm{id}_V(v + v') = v + v' = \mathrm{id}_V(v) + \mathrm{id}_V(v'), \]
hence the first condition in Definition 1 holds for $\mathrm{id}_V$. Moreover, if $v \in V$ and $a \in F$ is any scalar, then we have
\[ \mathrm{id}_V(av) = av = a\,\mathrm{id}_V(v). \]
This shows that the second condition in Definition 1 holds for the function $\mathrm{id}_V$, thus $\mathrm{id}_V$ is a linear transformation.

Example 3. Consider the function $\mathrm{pr}_1 : F^2 \to F^1$ defined by
\[ \mathrm{pr}_1\begin{pmatrix} x \\ y \end{pmatrix} = x \quad \text{for all } x, y \in F. \]
This is called the projection of $F^2$ onto its first factor. To show that $\mathrm{pr}_1$ is a linear transformation, note for any $x, y, x', y' \in F$ that we have
\[ \mathrm{pr}_1\left( \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} x' \\ y' \end{pmatrix} \right) = \mathrm{pr}_1\begin{pmatrix} x + x' \\ y + y' \end{pmatrix} = x + x' = \mathrm{pr}_1\begin{pmatrix} x \\ y \end{pmatrix} + \mathrm{pr}_1\begin{pmatrix} x' \\ y' \end{pmatrix}. \]
For any $x, y, a \in F$, we also have
\[ \mathrm{pr}_1\left( a \begin{pmatrix} x \\ y \end{pmatrix} \right) = \mathrm{pr}_1\begin{pmatrix} ax \\ ay \end{pmatrix} = ax = a\,\mathrm{pr}_1\begin{pmatrix} x \\ y \end{pmatrix}. \]
These relations show that $\mathrm{pr}_1$ is a linear transformation.
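Linearity of a concrete map such as $\mathrm{pr}_1$ can also be sanity-checked numerically. The following is a minimal sketch of our own (not part of the original notes), using NumPy over $F = \mathbb{R}$; the function name `pr1` is ours.

```python
import numpy as np

def pr1(v):
    """Projection of R^2 onto its first factor: pr1((x, y)) = x."""
    return v[0]

rng = np.random.default_rng(0)
v, w = rng.standard_normal(2), rng.standard_normal(2)
a = rng.standard_normal()

# Condition (i) of Definition 1: T(v + w) = T(v) + T(w).
assert np.isclose(pr1(v + w), pr1(v) + pr1(w))
# Condition (ii) of Definition 1: T(a v) = a T(v).
assert np.isclose(pr1(a * v), a * pr1(v))
```

Of course, passing such a check for finitely many random vectors does not constitute a proof; the algebraic verification in Example 3 is what establishes linearity for all inputs.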
Proposition 4. Let $V$ and $W$ be vector spaces and let $T : V \to W$ be a linear transformation from $V$ to $W$. Then $T(0_V) = 0_W$, where $0_V$ and $0_W$ are the zero vectors in $V$ and $W$, respectively.

Proof. Since $T$ is a linear transformation, we have
\[ T(0_V) = T(0_V + 0_V) = T(0_V) + T(0_V). \]
Adding the additive inverse of $T(0_V)$ in $W$ to both sides immediately yields $0_W = T(0_V)$.

Proposition 5. Let $V$ and $W$ be vector spaces with $V$ finite dimensional and let $T : V \to W$ be a linear transformation. If $v_1, \dots, v_n$ is a fixed basis of $V$, then $T$ is uniquely determined by the vectors $T(v_1), \dots, T(v_n) \in W$.

Proof. A function is completely determined by where it maps each element of its domain. In order to prove that $T(v_1), \dots, T(v_n)$ determine $T$, it therefore suffices to show that they allow us to calculate $T(v)$ for each $v \in V$. Observe that since $v_1, \dots, v_n$ is a basis of $V$, we may write any element $v \in V$ as a linear combination
\[ v = a_1 v_1 + \cdots + a_n v_n. \]
This yields
\[ T(v) = T(a_1 v_1 + \cdots + a_n v_n) = T(a_1 v_1) + \cdots + T(a_n v_n) = a_1 T(v_1) + \cdots + a_n T(v_n). \]
The second and third equalities follow from the fact that $T$ is a linear transformation. It is apparent from this that the only information required to calculate $T(v)$ is the values of $T(v_1), \dots, T(v_n)$.

The "uniquely" part of the proposition has to do with the following: Given any collection of vectors $w_1, \dots, w_n \in W$, there are, in general, many functions $T : V \to W$ that satisfy $T(v_j) = w_j$ for all $j$. But there is only one linear transformation $T : V \to W$ with $T(v_j) = w_j$ for all $j$. To see this, suppose that $T : V \to W$ and $S : V \to W$ are any two linear transformations such that $T(v_j) = S(v_j) = w_j$ for all $j$. Letting $v = a_1 v_1 + \cdots + a_n v_n$ be any vector in $V$, we then have
\[ T(v) = a_1 T(v_1) + \cdots + a_n T(v_n) = a_1 w_1 + \cdots + a_n w_n = a_1 S(v_1) + \cdots + a_n S(v_n) = S(v). \]
This shows that $T = S$.

We now consider the case where both $V$ and $W$ are finite dimensional vector spaces. In this case, we can fix bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$ of $V$ and $W$, respectively. Because the vectors $T(v_j)$ lie in $W$, we can write them as linear combinations of the basis vectors $w_i$ for $W$ in a unique way. In other words, for each $1 \le j \le n$, we have a unique expression
\[ T(v_j) = \sum_{i=1}^{m} A_{ij} w_i. \]
This means for each $1 \le j \le n$ and each $1 \le i \le m$ that $A_{ij}$ is the coefficient on the basis vector $w_i$ in the expression of $T(v_j)$ as a linear combination of $w_1, \dots, w_m$. The coefficients $A_{ij}$ may be put into a matrix
\[ A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}. \]
Evidently $A$ is an $m \times n$ matrix. We say that $A$ represents the linear transformation $T$ with respect to the bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$. It is very important to note that a different choice of bases for $V$ and $W$ would, in general, yield a different matrix representing $T$. In other words, when we talk about a certain matrix representing a linear transformation, we must always assume that bases have already been chosen for $V$ and $W$.

The converse to the above discussion would be the following: Suppose that we are given an $n$-dimensional vector space $V$ and an $m$-dimensional vector space $W$ with chosen bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$, respectively. Given any $m \times n$ matrix $A = (A_{ij})$, we may ask whether or not there exists a linear transformation $T : V \to W$ that is represented by $A$ with respect to these chosen bases. The answer is yes. To construct $T$, we would first define
\[
\begin{aligned}
T(v_1) &= A_{11} w_1 + A_{21} w_2 + \cdots + A_{m1} w_m, \\
T(v_2) &= A_{12} w_1 + A_{22} w_2 + \cdots + A_{m2} w_m, \\
&\;\;\vdots \\
T(v_n) &= A_{1n} w_1 + A_{2n} w_2 + \cdots + A_{mn} w_m.
\end{aligned}
\tag{1}
\]
Now if $v = a_1 v_1 + \cdots + a_n v_n$ were any vector in $V$, we could then define
\[ T(v) = a_1 T(v_1) + \cdots + a_n T(v_n). \]
It is a routine exercise to verify that the function $T$ defined in this way is indeed a linear transformation. It is then clear from the equations (1) that $A$ represents $T$ with respect to the bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$. All of these observations prove the following result.

Theorem 6. Let $V$ and $W$ be finite dimensional vector spaces and fix bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$ of $V$ and $W$, respectively. Then there is a one-to-one correspondence
\[ \{\text{linear transformations } V \to W\} \longleftrightarrow \{m \times n \text{ matrices with entries in } F\}. \]
Here, a linear transformation $T : V \to W$ is mapped to the $m \times n$ matrix $A$ that represents $T$ with respect to the bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$.
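To make the defining relation $T(v_j) = \sum_i A_{ij} w_i$ concrete, the following sketch computes the representing matrix for a map between coordinate spaces equipped with non-standard bases. This is our own illustration (NumPy, over $F = \mathbb{R}$); the map and the bases are arbitrary example data, not taken from the notes.

```python
import numpy as np

# A hypothetical linear map T : R^2 -> R^3, given here in standard coordinates.
M = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, 0.0]])

# Chosen bases: columns of Vb are v_1, v_2; columns of Wb are w_1, w_2, w_3.
Vb = np.array([[1.0, 1.0],
               [0.0, 1.0]])
Wb = np.array([[1.0, 0.0, 0.0],
               [1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0]])

# Column j of A holds the coordinates of T(v_j) in the basis w_1, ..., w_m,
# i.e. A[:, j] solves Wb @ A[:, j] = T(v_j).
A = np.linalg.solve(Wb, M @ Vb)

# Check the defining relation: T(v_j) = sum_i A[i, j] * w_i for each j.
assert np.allclose(Wb @ A, M @ Vb)
```

A different choice of `Vb` or `Wb` produces a different matrix `A` for the same map, exactly as remarked above.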
Now consider the special case of a linear transformation $T : F^n \to F^m$. In what follows, we shall work exclusively with the standard bases $e_1, \dots, e_n$ and $f_1, \dots, f_m$ of $F^n$ and $F^m$, respectively. Recall that the standard basis of $F^r$ is defined to be the collection of vectors in $F^r$ consisting of the columns of the $r \times r$ identity matrix. With respect to these bases, there is a more explicit relationship between $T$ and the matrix $A = (A_{ij})$ representing $T$. To see this, we first write $A$ in terms of its columns:
\[ A = \begin{pmatrix} A_1 & A_2 & \cdots & A_n \end{pmatrix}. \]
For any $1 \le j \le n$ we then have
\[ T(e_j) = A_{1j} f_1 + A_{2j} f_2 + \cdots + A_{mj} f_m = \begin{pmatrix} A_{1j} \\ A_{2j} \\ \vdots \\ A_{mj} \end{pmatrix} = A_j. \]
It follows for any $v = a_1 e_1 + \cdots + a_n e_n \in F^n$ that
\[ T(v) = T(a_1 e_1 + a_2 e_2 + \cdots + a_n e_n) = a_1 T(e_1) + a_2 T(e_2) + \cdots + a_n T(e_n) = a_1 A_1 + a_2 A_2 + \cdots + a_n A_n = A \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = Av. \]
In summary, we have the following proposition.

Proposition 7. Let $T : F^n \to F^m$ be a linear transformation and let $A = \begin{pmatrix} A_1 & A_2 & \cdots & A_n \end{pmatrix}$ be the $m \times n$ matrix representing $T$ with respect to the standard bases of $F^n$ and $F^m$. Then for all $1 \le j \le n$ we have $T(e_j) = A_j$, and in general, for each $v \in F^n$ we have $T(v) = Av$.

Example 8. Consider the identity function $\mathrm{id}_{F^n} : F^n \to F^n$ defined in Example 2. Let $A$ be the $n \times n$ matrix representing $\mathrm{id}_{F^n}$ with respect to the standard basis of $F^n$. Observe that $\mathrm{id}_{F^n}(e_j) = e_j$ for all $1 \le j \le n$, simply by the definition of $\mathrm{id}_{F^n}$. Proposition 7 now shows that
\[ A = \begin{pmatrix} e_1 & e_2 & \cdots & e_n \end{pmatrix} = I_n, \]
that is, with respect to the standard basis, the identity function on $F^n$ is represented by the $n \times n$ identity matrix.
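Proposition 7 is easy to check numerically. The following minimal sketch (ours, using NumPy over $F = \mathbb{R}$; the matrix is arbitrary example data) confirms that $T(e_j)$ recovers the $j$th column of $A$ and that $T(v) = Av$ in general.

```python
import numpy as np

# Any m x n matrix defines a linear map T(v) = A @ v; here a 3 x 2 example.
A = np.array([[2.0, -1.0],
              [0.0,  4.0],
              [1.0,  1.0]])

# T(e_j) recovers the j-th column of A (Proposition 7).
n = A.shape[1]
for j in range(n):
    e_j = np.eye(n)[:, j]
    assert np.allclose(A @ e_j, A[:, j])

# And T(v) = A @ v expands v in the standard basis, column by column.
v = np.array([3.0, -2.0])
assert np.allclose(A @ v, 3.0 * A[:, 0] + (-2.0) * A[:, 1])
```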
Example 9. Similarly, one may verify that the linear transformation $\mathrm{pr}_1 : F^2 \to F^1$ defined in Example 3 is represented with respect to the standard bases by the $1 \times 2$ matrix
\[ \begin{pmatrix} \mathrm{pr}_1(e_1) & \mathrm{pr}_1(e_2) \end{pmatrix} = \begin{pmatrix} 1 & 0 \end{pmatrix}. \]

Linear transformations give rise to two important examples of subspaces.

Definition 10. Let $V$ and $W$ be any vector spaces and let $T : V \to W$ be a linear transformation from $V$ to $W$. The kernel of $T$ is defined to be the subset
\[ \ker T = \{ v \in V \mid T(v) = 0_W \} \]
of $V$. The image of $T$ is defined to be the subset
\[ \operatorname{im} T = \{ w \in W \mid \text{there exists } v \in V \text{ such that } T(v) = w \} \]
of $W$.

Proposition 11. If $V$ and $W$ are vector spaces and $T : V \to W$ is a linear transformation, then $\ker T$ and $\operatorname{im} T$ are subspaces of $V$ and $W$, respectively.

Proof. By Proposition 4, we have $T(0_V) = 0_W$, showing that $0_V \in \ker T$. Next, let $v, v' \in \ker T$. Then $T(v) = 0_W$ and $T(v') = 0_W$, so that
\[ T(v + v') = T(v) + T(v') = 0_W + 0_W = 0_W. \]
This shows that $v + v' \in \ker T$, so $\ker T$ is closed under addition. Now let $v \in \ker T$ and $a \in F$. Then
\[ T(av) = aT(v) = a \cdot 0_W = 0_W. \]
This shows that $av \in \ker T$, so $\ker T$ is also closed under scalar multiplication. These results imply that $\ker T$ is a subspace of $V$. The proof that $\operatorname{im} T$ is a subspace of $W$ is left to the interested reader.

Definition 12. If $V$ and $W$ are finite dimensional vector spaces and $T : V \to W$ is a linear transformation, then we know that $\ker T$ and $\operatorname{im} T$ are also finite dimensional. We define the nullity of $T$ to be the dimension of $\ker T$. Similarly, we define the rank of $T$ to be the dimension of $\operatorname{im} T$. For convenience, we denote the nullity of $T$ by $\operatorname{nullity} T$ and the rank of $T$ by $\operatorname{rank} T$.

The rank and nullity of a linear transformation satisfy a very interesting relationship.

Theorem 13 (Rank-nullity theorem). Let $V$ and $W$ be finite dimensional vector spaces and let $T : V \to W$ be a linear transformation. Then
\[ \operatorname{rank} T + \operatorname{nullity} T = \dim V. \]
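Before turning to the proof, the theorem can be sanity-checked numerically in a concrete case. The following sketch is ours (not part of the notes) and assumes SciPy is available; it computes the rank and, independently, the dimension of the kernel for a specific matrix over $F = \mathbb{R}$.

```python
import numpy as np
from scipy.linalg import null_space

# A 3 x 4 matrix, viewed as a linear map T : R^4 -> R^3.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])   # third row = first + second

rank = np.linalg.matrix_rank(A)        # dim(im T)
nullity = null_space(A).shape[1]       # dim(ker T), computed independently

n = A.shape[1]                         # n = dim V = 4
assert rank + nullity == n             # here: 2 + 2 == 4
```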
Proof. We choose a basis $u_1, \dots, u_m$ of $\ker T$ and use the replacement theorem to extend this to a basis $u_1, \dots, u_m, v_1, \dots, v_r$ of $V$. We have $\operatorname{nullity} T = m$ and $r + m = \dim V$. To prove the theorem, it therefore suffices to show that $\operatorname{rank} T = r$. To this end, we wish to prove that the vectors $T(v_1), \dots, T(v_r)$ are a basis of $\operatorname{im} T$.

To show that these vectors span $\operatorname{im} T$, note that any vector in $V$ may be written as a linear combination
\[ a_1 u_1 + \cdots + a_m u_m + b_1 v_1 + \cdots + b_r v_r. \]
It follows that any element $w$ of $\operatorname{im} T$ is of the form
\[ w = T(a_1 u_1 + \cdots + a_m u_m + b_1 v_1 + \cdots + b_r v_r). \]
But the right hand side is equal to
\[ T(a_1 u_1 + \cdots + a_m u_m) + b_1 T(v_1) + \cdots + b_r T(v_r) \]
since $T$ is a linear transformation. Observe that $a_1 u_1 + \cdots + a_m u_m$ lies in $\ker T$, because $u_1, \dots, u_m$ is a basis of $\ker T$. It follows that $T(a_1 u_1 + \cdots + a_m u_m) = 0$, giving us
\[ w = b_1 T(v_1) + \cdots + b_r T(v_r). \]
This shows that every vector in $\operatorname{im} T$ is a linear combination of $T(v_1), \dots, T(v_r)$, so these vectors span $\operatorname{im} T$.

To show that $T(v_1), \dots, T(v_r)$ are linearly independent, suppose that
\[ b_1 T(v_1) + \cdots + b_r T(v_r) = 0. \]
We need to show that $b_1 = \cdots = b_r = 0$. Using the fact that $T$ is a linear transformation, the above equality becomes
\[ T(b_1 v_1 + \cdots + b_r v_r) = 0. \]
This implies that $b_1 v_1 + \cdots + b_r v_r$ lies in $\ker T$. Because $u_1, \dots, u_m$ is a basis of $\ker T$, we may write $b_1 v_1 + \cdots + b_r v_r$ as a linear combination
\[ b_1 v_1 + \cdots + b_r v_r = a_1 u_1 + \cdots + a_m u_m \]
for some coefficients $a_1, \dots, a_m$. But then
\[ -a_1 u_1 - \cdots - a_m u_m + b_1 v_1 + \cdots + b_r v_r = 0. \]
Notice that the collection $u_1, \dots, u_m, v_1, \dots, v_r$, being a basis of $V$, is linearly independent. This means that all of the coefficients in the above expression must be zero. In particular, we have $b_1 = \cdots = b_r = 0$ as required.

Definition 14. Let $A$ be an $m \times n$ matrix. We define the kernel of $A$ to be the set
\[ \ker A = \ker T, \]
where $T : F^n \to F^m$ is the linear transformation that $A$ represents with respect to the standard bases of $F^n$ and $F^m$, respectively. Similarly, we define the image of $A$ to be the set
\[ \operatorname{im} A = \operatorname{im} T. \]
It is immediate from Proposition 11 that $\ker A$ and $\operatorname{im} A$ are subspaces of $F^n$ and $F^m$, respectively. We define the nullity of $A$ to be the nullity of $T$, and we define the rank of $A$ to be the rank of $T$. We denote these numbers by $\operatorname{nullity} A$ and $\operatorname{rank} A$, respectively.

There is a concrete description of the subspaces defined above, which is given by Proposition 7. Specifically, we have
\[ \ker A = \ker T = \{ v \in F^n \mid T(v) = 0 \} = \{ v \in F^n \mid Av = 0 \}. \]
In other words, $\ker A$ consists of the solutions to the matrix equation $Ax = 0$. In the literature, this set of vectors is commonly called the null space of $A$.

To get an explicit description of $\operatorname{im} A = \operatorname{im} T$, observe that every element of $\operatorname{im} T$ is of the form $T(v)$ for some $v \in F^n$. Writing $v$ as a linear combination $v = a_1 e_1 + \cdots + a_n e_n$, we have
\[ T(v) = T(a_1 e_1 + \cdots + a_n e_n) = a_1 T(e_1) + \cdots + a_n T(e_n). \]
This shows that $T(v)$ is a linear combination of the vectors $T(e_1), \dots, T(e_n) \in F^m$. In other words, $\operatorname{im} T$ is spanned by $T(e_1), \dots, T(e_n)$, or in notation,
\[ \operatorname{im} T = \operatorname{span}\{ T(e_1), \dots, T(e_n) \}. \]
But by Proposition 7, we also have $T(e_j) = A_j$ for all $j$, where $A_j$ denotes the $j$th column of $A$. Putting this all together, we obtain
\[ \operatorname{im} A = \operatorname{im} T = \operatorname{span}\{ T(e_1), \dots, T(e_n) \} = \operatorname{span}\{ A_1, \dots, A_n \}. \]
Because of this, the image of $A$ is often called the column space of $A$.

Remark 15. The keen observer or weary student will likely observe that we could very well have begun by defining the kernel and image of a matrix to be its null space and column space, without ever having mentioned linear transformations. The reason for the approach here is to emphasise the profound idea that representing linear transformations is not just an application of matrix theory; rather, the primary purpose of studying matrices is to understand linear transformations. To further promote our philosophy, we now show how even the formula for matrix multiplication arises as a consequence of the theory of linear transformations.
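Before doing so, here is a concrete illustration of the two descriptions above. This is our own sketch (assuming SciPy is available); it computes a basis for the null space and for the column space of a small example matrix.

```python
import numpy as np
from scipy.linalg import null_space, orth

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # second row is twice the first

# ker A: the solutions of Ax = 0. The columns of N form a basis.
N = null_space(A)
assert np.allclose(A @ N, 0.0)

# im A: the span of the columns of A. orth returns an orthonormal basis
# for the column space; its dimension is rank A.
C = orth(A)
assert C.shape[1] == np.linalg.matrix_rank(A)   # rank A = 1 here
```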
Lemma 16. Let $U$, $V$ and $W$ be any vector spaces, not necessarily finite dimensional. If $T : U \to V$ and $S : V \to W$ are linear transformations, then the composition $S \circ T : U \to W$ is also a linear transformation.

Proof. This is left as a standard exercise.

Proposition 17. Let $U$, $V$ and $W$ be finite dimensional vector spaces. Fix bases $u_1, \dots, u_p$, $v_1, \dots, v_n$ and $w_1, \dots, w_m$ of $U$, $V$ and $W$, respectively. Now let $T : U \to V$ and $S : V \to W$ be linear transformations. If $B$ is the $n \times p$ matrix representing $T$ with respect to the bases $u_1, \dots, u_p$ and $v_1, \dots, v_n$, and if $A$ is the $m \times n$ matrix representing $S$ with respect to the bases $v_1, \dots, v_n$ and $w_1, \dots, w_m$, then $AB$ is the matrix representing $S \circ T : U \to W$ with respect to the bases $u_1, \dots, u_p$ and $w_1, \dots, w_m$.

Proof. The matrix $C = (C_{ik})$ representing the composition $S \circ T$ with respect to the given bases is defined by the relations
\[ (S \circ T)(u_k) = \sum_{i=1}^{m} C_{ik} w_i \quad \text{for all } 1 \le k \le p. \]
In order to find the scalars $C_{ik}$, we therefore need to calculate $(S \circ T)(u_k)$ for each $k$. To this end, let $B = (B_{jk})$ and $A = (A_{ij})$ be the matrices given in the proposition. Since $B$ represents $T$ with respect to the given bases, we have the equality $T(u_k) = \sum_{j=1}^{n} B_{jk} v_j$ for each $1 \le k \le p$. Similarly, because $A$ represents $S$ with respect to the given bases, we have the equality $S(v_j) = \sum_{i=1}^{m} A_{ij} w_i$ for each $1 \le j \le n$. It follows for each $1 \le k \le p$ that
\[
\begin{aligned}
(S \circ T)(u_k) &= S(T(u_k)) \\
&= S\Big( \sum_{j=1}^{n} B_{jk} v_j \Big) \\
&= \sum_{j=1}^{n} B_{jk} S(v_j) && \text{since } S \text{ is a linear transformation} \\
&= \sum_{j=1}^{n} B_{jk} \sum_{i=1}^{m} A_{ij} w_i \\
&= \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} A_{ij} B_{jk} \Big) w_i.
\end{aligned}
\]
This means that $C_{ik} = \sum_{j=1}^{n} A_{ij} B_{jk}$ for all pairs $(i, k)$. But the right hand side is precisely the formula for $(AB)_{ik}$, where $AB$ is the matrix product of $A$ and $B$. We therefore have $C = AB$.
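The conclusion of Proposition 17 can be checked directly in coordinates. The sketch below is ours (NumPy, standard bases, arbitrary example matrices for two hypothetical maps); it verifies both that applying $T$ then $S$ agrees with multiplying by $AB$, and that the entrywise formula from the proof matches the matrix product.

```python
import numpy as np

# Matrices representing hypothetical maps T : R^2 -> R^3 and S : R^3 -> R^2
# with respect to the standard bases.
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])        # 3 x 2, represents T
A = np.array([[1.0, -1.0, 0.0],
              [0.0,  2.0, 1.0]])  # 2 x 3, represents S

# The composition first applies T, then S; its matrix is the product A @ B.
v = np.array([5.0, -4.0])
assert np.allclose(A @ (B @ v), (A @ B) @ v)

# Entrywise, (AB)_{ik} = sum_j A_{ij} B_{jk}, exactly as in the proof.
C = np.array([[sum(A[i, j] * B[j, k] for j in range(3)) for k in range(2)]
              for i in range(2)])
assert np.allclose(C, A @ B)
```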