1 Review

1.1 Vector Spaces

Let $F$ be a field. An $F$-vector space (or vector space over $F$) is a non-empty set $V$, with elements called vectors, together with two operations, addition of vectors and multiplication of vectors by elements of $F$ (called scalar multiplication), such that:

1. $V$ is an abelian group under the operation of addition $V \times V \to V$, $(v, w) \mapsto v + w$:
   - commutative: for all $v, w \in V$, $v + w = w + v$
   - associative: for all $u, v, w \in V$, $u + (v + w) = (u + v) + w$
   - (additive) identity: there exists an element $e \in V$ such that $e + v = v$ for all $v \in V$
   - (additive) inverse: for each $v \in V$ there exists $w \in V$ such that $v + w = e$

2. The operation of scalar multiplication $F \times V \to V$, $(a, v) \mapsto av$, satisfies the following properties for all $a, b \in F$ and all $v, w \in V$:
   - $a(v + w) = av + aw$
   - $(a + b)v = av + bv$
   - $(ab)v = a(bv)$
   - $1v = v$

Remark 1.1. In an $F$-vector space $V$, the additive identity is unique and is denoted by $0 \in V$. For each $v \in V$, the additive inverse is unique and is denoted by $-v \in V$.

Example 1.2. Denote by $F^n$ the set of all $n$-tuples $(a_1, a_2, \dots, a_n)$ with $a_i \in F$ for $1 \le i \le n$. This is an $F$-vector space with both addition and scalar multiplication defined componentwise, i.e.
\[ (a_1, a_2, \dots, a_n) + (b_1, b_2, \dots, b_n) = (a_1 + b_1, a_2 + b_2, \dots, a_n + b_n) \]
\[ s(a_1, a_2, \dots, a_n) = (sa_1, sa_2, \dots, sa_n) \]

Example 1.3. The set $M_{mn}(F)$ of $m \times n$ matrices with entries in $F$ is an $F$-vector space under matrix addition and scalar multiplication.

Example 1.4. The polynomial ring $P(F)$ (sometimes denoted $F[x]$) is an $F$-vector space under polynomial addition and scalar multiplication. Note that the set of all polynomials in $P(F)$ of degree at most $n$, which we will denote by $P_n(F)$, is also an $F$-vector space, but the set of all polynomials in $P(F)$ of degree exactly $n$ is NOT an $F$-vector space.
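
The componentwise operations of Example 1.2 are easy to experiment with. Below is a minimal sketch (not from the notes) in Python, with `fractions.Fraction` standing in for the field $F = \mathbb{Q}$; it spot-checks the scalar-multiplication axioms on arbitrary vectors:

```python
from fractions import Fraction

# Componentwise operations on F^n, with F = Q modeled by Fraction.
def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def smul(s, v):
    return tuple(s * a for a in v)

v = (Fraction(1, 2), Fraction(3), Fraction(-2, 5))
w = (Fraction(2), Fraction(-1, 3), Fraction(4))
a, b = Fraction(2, 7), Fraction(5)

# The four scalar-multiplication axioms from the definition:
assert smul(a, add(v, w)) == add(smul(a, v), smul(a, w))  # a(v+w) = av+aw
assert smul(a + b, v) == add(smul(a, v), smul(b, v))      # (a+b)v = av+bv
assert smul(a * b, v) == smul(a, smul(b, v))              # (ab)v = a(bv)
assert smul(Fraction(1), v) == v                          # 1v = v
```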

The following two properties hold for any field $F$ and any $F$-vector space $V$:

- $0v = 0$ for all $v \in V$ (on the LHS we consider $0 \in F$, while on the RHS we have $0 \in V$)
- $-v = (-1)v$ for all $v \in V$

A nonempty subset $W$ of an $F$-vector space $V$ is called a subspace of $V$ if $W$, equipped with the vector addition and scalar multiplication of $V$, is itself an $F$-vector space. In particular, $W$ is a subspace of $V$ if and only if $W$ is closed under vector addition and scalar multiplication.

Proposition 1.5. A nonempty subset $W$ of an $F$-vector space $V$ is a subspace if and only if the following conditions hold:

- $0 \in W$
- $w, w' \in W \implies w + w' \in W$
- $a \in F$ and $w \in W \implies aw \in W$

Example 1.6. If $V = F^n$ and $A$ is an $n \times n$ matrix over $F$, the subset
\[ n(A) := \{ v \in V \mid Av = 0 \} \]
is a subspace of $V$ called the nullspace of $A$.

Example 1.7. The set $P_n(F)$ is a subspace of $P(F)$ for any $n \ge 0$.

Example 1.8. The set of all matrices whose $i$-th row is $0$ is a subspace of $M_{mn}(F)$.

Given vectors $v_1, \dots, v_n \in V$ and scalars $a_1, \dots, a_n \in F$, the finite sum $a_1 v_1 + \dots + a_n v_n$ is called a linear combination of $v_1, \dots, v_n$. The set $\mathrm{Span}\{v_1, \dots, v_n\}$ of all possible linear combinations of $v_1, \dots, v_n$ is a subspace of $V$ called the subspace generated (or spanned) by $v_1, \dots, v_n$. If $S \subseteq V$ is a subspace of $V$ and $v_1, \dots, v_n \in S$, then $\mathrm{Span}\{v_1, \dots, v_n\} \subseteq S$, since every linear combination of $v_1, \dots, v_n$ must also be in $S$ (by definition). Therefore, $\mathrm{Span}\{v_1, \dots, v_n\}$ is the smallest subspace of $V$ containing $v_1, \dots, v_n$.

A subspace $S \subseteq V$ is said to be finitely generated if there exists some finite set of vectors $v_1, \dots, v_n \in V$ such that $S = \mathrm{Span}\{v_1, \dots, v_n\}$. More generally, we can consider a (possibly infinite) nonempty set of vectors $E \subseteq V$ and define $\mathrm{Span}(E)$ to be the set of all possible linear combinations of vectors from $E$. Note that each linear combination involves only finitely many vectors, but those vectors can be chosen from an infinite set. As in the finite case, $\mathrm{Span}(E)$ is a subspace of $V$ and is the smallest subspace of $V$ containing $E$. At the other extreme, if $E$ consists of a single vector $v \in V$, we typically write $\mathrm{Span}\{v\} = Fv$, since every element of this subspace will be of the form $av$ for some $a \in F$.
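
As a concrete illustration of Example 1.6 and Proposition 1.5 (a sketch, not part of the notes), SymPy can compute a spanning set for $n(A)$ over $\mathbb{Q}$, and we can check membership and closure on it:

```python
from sympy import Matrix, Rational, zeros

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 0, 1]])

basis = A.nullspace()             # basis vectors of n(A) = {v : Av = 0}
for v in basis:
    assert A * v == zeros(3, 1)   # each basis vector lies in n(A)

# Closure under addition and scalar multiplication (Proposition 1.5),
# checked on an arbitrary linear combination of the basis vectors:
v = sum((Rational(k + 2, 3) * b for k, b in enumerate(basis)), zeros(3, 1))
assert A * v == zeros(3, 1)
print([list(b) for b in basis])   # here n(A) is spanned by (-1, -1, 1)
```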

Example 1.9. The finite set $\{1, x, x^2, \dots, x^n\}$ generates $P_n(F)$, while an infinite set such as $\{1, x, x^2, \dots\}$ is needed to generate the entire polynomial ring $P(F)$.

Example 1.10. If $A$ is a matrix in $M_{mn}(F)$, then the row space of $A$ is the subspace of $F^n$ generated by the rows of $A$, and the column space of $A$ is the subspace of $F^m$ generated by the columns of $A$. We denote the column space (also called the range of $A$) by $R(A)$, and note that
\[ R(A) = \{ Av \mid v \in F^n \} \subseteq F^m \]

The vectors $v_1, \dots, v_n$ are said to be linearly independent if the following condition holds:
\[ \sum_{i=1}^n a_i v_i = a_1 v_1 + a_2 v_2 + \dots + a_n v_n = 0 \implies a_1 = a_2 = \dots = a_n = 0 \]
Vectors which are NOT linearly independent are called linearly dependent.

Proposition 1.11. The vectors $v_1, \dots, v_n \in V$ are linearly independent if and only if every vector $v \in \mathrm{Span}\{v_1, \dots, v_n\}$ can be written uniquely as a linear combination of $v_1, \dots, v_n$.

An arbitrary set $E \subseteq V$ is called a linearly independent set if any finite number of vectors from $E$ are linearly independent. If this condition is not satisfied, then $E$ is a linearly dependent set. An equation $\sum_{i=1}^n a_i v_i = 0$ with $v_1, \dots, v_n \in E$ and $a_1, \dots, a_n \in F$ not all zero is called a linear dependence relation for $E$.

Remark 1.12. We consider the empty set to be a linearly independent set.

Let $W$ be a subspace of $V$. A set $B$ of vectors in $W$ is called a basis of $W$ if $B$ is a linearly independent set and $\mathrm{Span}(B) = W$. The zero vector is never a member of a basis, except in the unique situation that $W = \{0\}$.

Theorem 1.13. A maximal linearly independent subset of $V$ is a basis for $V$. A minimal spanning subset of $V$ is a basis for $V$. Any spanning subset of $V$ can be reduced to a basis for $V$.

Theorem 1.14. Let $E$ be a finite set of vectors in $V$ and let $E_0 \subseteq E$ be a linearly independent subset. Then there exists a basis $B$ of the subspace $\mathrm{Span}(E)$ such that $E_0 \subseteq B \subseteq E$. In other words, any linearly independent subset $E_0 \subseteq E$ can be extended to a basis of $\mathrm{Span}(E)$ by adding suitable vectors from $E$ to $E_0$.

Theorem 1.15 (Steinitz Replacement Theorem). Let $V$ be a finitely generated $F$-vector space with a basis $B = \{v_1, \dots, v_n\}$, and let $E = \{w_1, \dots, w_m\}$ be an arbitrary linearly independent subset of $V$. Then $m \le n$, and $E$ can be extended to a basis of $V$ by adding $n - m$ suitable vectors from $B$.
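
Theorems 1.13 and 1.14 suggest a mechanical procedure: to reduce a finite spanning set to a basis, place the vectors as columns of a matrix and keep the pivot columns. A minimal SymPy sketch with an illustrative (made-up) spanning set:

```python
from sympy import Matrix

# A spanning set for a subspace of Q^3, with redundancies:
vectors = [Matrix([1, 0, 1]), Matrix([2, 0, 2]),
           Matrix([0, 1, 1]), Matrix([1, 1, 2])]

M = Matrix.hstack(*vectors)
_, pivots = M.rref()               # pivot column indices of RREF(M)
basis = [vectors[j] for j in pivots]

print(pivots)                      # (0, 2): v1 and v3 form a basis of the span
assert M.rank() == len(basis)      # dim Span(E) = number of pivots
```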

Corollary 1.16. Any two bases of a finitely generated vector space contain the same number of elements.

The number of elements in any basis of a finitely generated vector space $V$ is called the dimension of $V$ and is denoted by $\dim(V)$. By definition, the dimension of the $0$-vector space is equal to $0$. A vector space is finite dimensional if and only if it is finitely generated.

Example 1.17.

- $\dim F^n = n$. Standard basis of $F^n$: $\beta_0 = \{e_1, \dots, e_n\}$, where $e_i \in F^n$ satisfies $(e_i)_j = \delta_{ij}$; that is, $e_i = (0, 0, \dots, 1, 0, \dots, 0)$ with the $1$ in the $i$-th entry.
- $\dim P_n(F) = n + 1$. Standard basis of $P_n(F)$: $\beta_0 = \{1, x, x^2, \dots, x^n\}$.
- $\dim M_{mn}(F) = mn$. Standard basis of $M_{mn}(F)$: $\beta_0 = \{E_{ij} : 1 \le i \le m,\ 1 \le j \le n\}$, where $E_{ij} \in M_{mn}(F)$ satisfies $(E_{ij})_{kl} = \delta_{ik}\delta_{jl}$; that is, $E_{ij}$ has all zero entries except for a $1$ in the $i$-th row and $j$-th column.

Remark 1.18. While standard bases are useful for many purposes, it is important to note that bases are not unique. For example, the standard basis for $\mathbb{R}^2$ is given by $\beta_0 = \{(1, 0), (0, 1)\}$, but another equally valid basis is given by $\beta = \{(5, 1), (3, 2)\}$.

Theorem 1.19. Let $W$ be a subspace of a finite-dimensional vector space $V$. Then $W$ is finite-dimensional with $\dim(W) \le \dim(V)$, and $\dim(W) = \dim(V) \iff W = V$.

Proposition 1.20. Suppose $W, W'$ are subspaces of $V$. Then $W + W'$ and $W \cap W'$ are both subspaces of $V$. In particular, $W + W'$ is the smallest subspace containing both $W$ and $W'$, while $W \cap W'$ is the largest subspace contained in both $W$ and $W'$.

Theorem 1.21 (Dimension Formula). Let $U$ and $W$ be finite-dimensional subspaces of an $F$-vector space $V$. Then $U + W$ and $U \cap W$ are finite-dimensional subspaces of $V$ and
\[ \dim(U + W) = \dim(U) + \dim(W) - \dim(U \cap W) \]

In the case that $U \cap W = \{0\}$, the sum $U + W$ is called a direct sum of $U$ and $W$ and is denoted by $U \oplus W$. This concept generalizes to more than two summands in the following way. If $U_1, \dots, U_k$ are subspaces of a vector space $V$, then $U_1 + \dots + U_k$ is the subspace of $V$ consisting of all vectors of the form $v = v_1 + \dots + v_k$ with $v_i \in U_i$ for each $i = 1, \dots, k$.
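
The dimension formula of Theorem 1.21 can be verified numerically. In the sketch below (illustrative data, taking $V = \mathbb{Q}^4$), $U + W$ is spanned by the combined columns, and a spanning set for $U \cap W$ is extracted from the nullspace of $[M_U \mid -M_W]$:

```python
from sympy import Matrix

# Columns of MU span U, columns of MW span W (subspaces of Q^4).
MU = Matrix([[1, 0], [0, 1], [1, 1], [0, 0]])
MW = Matrix([[1, 0], [1, 0], [2, 1], [0, 1]])

dim_U, dim_W = MU.rank(), MW.rank()
dim_sum = Matrix.hstack(MU, MW).rank()          # dim(U + W)

# If [MU | -MW] (a, b)^T = 0, then MU*a = MW*b lies in U ∩ W.
N = Matrix.hstack(MU, -MW).nullspace()
inter = [MU * n[:MU.cols, :] for n in N]        # spanning set of U ∩ W
dim_int = Matrix.hstack(*inter).rank() if inter else 0

assert dim_sum == dim_U + dim_W - dim_int       # Theorem 1.21
print(dim_U, dim_W, dim_sum, dim_int)           # 2 2 3 1
```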

Furthermore, $U_1 + \dots + U_k = U_1 \oplus \dots \oplus U_k$ if and only if for each $i = 1, \dots, k$ we have
\[ U_i \cap \Big( \sum_{j \ne i} U_j \Big) = \{0\} \]
For $i = 1, \dots, k$, if $\beta_i$ is a basis for $U_i$, then $\beta_1 \cup \dots \cup \beta_k$ is a basis for $U_1 \oplus \dots \oplus U_k$.

The notion of the direct sum provides a handy way to characterize a basis of a finite-dimensional vector space: assume that $\{v_1, \dots, v_n\}$ is a set of generators of $V$. Then
\[ V = F v_1 \oplus \dots \oplus F v_n \]
if and only if $\{v_1, \dots, v_n\}$ is a basis of $V$.

Given a subspace $U \subseteq V$, we can always find a subspace $W \subseteq V$ such that $V = U \oplus W$. We call $W$ a complementary subspace to $U$, and note that $W$ is in general not uniquely determined.

Let $W$ be a subspace of an $F$-vector space $V$. We define an equivalence relation $\sim_W$ on $V$ by declaring $v \sim_W v'$ if and only if $v - v' \in W$. The equivalence classes are given by $V/W = \{ v + W \mid v \in V \}$, where $v + W = \{ v + w \mid w \in W \}$. The set $V/W$ is called the quotient vector space of $V$ by $W$. It is itself a vector space with well-defined operations:

- Addition: $V/W \times V/W \to V/W$, defined by $(v + W) + (v' + W) = (v + v') + W$
- Scalar multiplication: $F \times V/W \to V/W$, defined by $a(v + W) = av + W$

The quotient map $\pi_W : V \to V/W$ given by $\pi_W(v) = v + W$ is $F$-linear and surjective.

Theorem 1.22. Let $V$ be a finite-dimensional $F$-vector space and let $\beta_W$ be a basis for a subspace $W$ of $V$. Let $\beta = \beta_W \cup S$ for some subset $S \subseteq V$. Then $\beta$ is a basis of $V$ if and only if $\pi_W(S)$ is a basis for $V/W$. In particular, $\dim(V/W) = \dim(V) - \dim(W)$.

Proposition 1.23 (Lagrange Interpolation). Given distinct elements $x_0, \dots, x_n$ of $F$ and arbitrary elements $y_0, \dots, y_n$ of $F$, there exists a unique polynomial $f(x) \in P_n(F)$ such that $f(x_i) = y_i$ for all $i = 0, \dots, n$.

The proof of Lagrange interpolation is constructive. Given $x_0, \dots, x_n$ distinct, we first construct polynomials $f_i \in P_n(F)$ such that $f_i(x_j) = \delta_{ij}$. We do this by setting
\[ f_i(t) := \prod_{j \ne i} \frac{t - x_j}{x_i - x_j}, \quad \text{for } i = 0, \dots, n \]
Now, setting $f = \sum_{i=0}^n y_i f_i$, we clearly have $f(x_i) = y_i$. All that remains is to show that $f$ is unique. It suffices to show that $\{f_0, \dots, f_n\}$ forms a basis of $P_n(F)$. Suppose $\sum_{i=0}^n \alpha_i f_i = 0$. Evaluating the left-hand side at $x_0$, we obtain $\alpha_0 = 0$. Continuing this process for $i = 1, \dots, n$ shows that $\alpha_0 = \alpha_1 = \dots = \alpha_n = 0$, and so $f_0, \dots, f_n$ are linearly independent. Since $\dim P_n(F) = n + 1$, we are done.
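
The constructive proof of Proposition 1.23 translates directly into code. A short SymPy sketch with arbitrarily chosen nodes and values (an illustration, not part of the notes):

```python
from functools import reduce
from operator import mul
from sympy import Rational, expand, symbols

t = symbols('t')

# Distinct nodes x_i and arbitrary target values y_i in F = Q:
xs = [Rational(0), Rational(1), Rational(2), Rational(4)]
ys = [Rational(3), Rational(-1), Rational(0), Rational(5)]
n = len(xs) - 1

# f_i(t) = prod_{j != i} (t - x_j)/(x_i - x_j), so that f_i(x_j) = delta_ij:
f = [reduce(mul, ((t - xs[j]) / (xs[i] - xs[j])
                  for j in range(n + 1) if j != i))
     for i in range(n + 1)]

# f = sum_i y_i f_i is then the unique interpolant in P_n(F):
interp = expand(sum(y * fi for y, fi in zip(ys, f)))
assert all(interp.subs(t, xi) == yi for xi, yi in zip(xs, ys))
print(interp)
```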

1.2 Linear Transformations

Let $V$ and $W$ be finite-dimensional vector spaces over a field $F$. A map $T : V \to W$ is called a linear transformation (or $F$-linear map) if
\[ T(v_1 + v_2) = T(v_1) + T(v_2) \quad \text{and} \quad T(av) = aT(v) \]
for all $v, v_1, v_2 \in V$ and all $a \in F$. In other words, $T$ is a linear transformation if it respects vector addition and scalar multiplication. If $T$ is a linear transformation from $V$ to itself, we sometimes refer to $T$ as a linear operator on $V$.

Remark 1.24. To check that $T : V \to W$ is a linear transformation, it is equivalent to check the single criterion
\[ T(v + aw) = T(v) + aT(w) \quad \text{for all } v, w \in V \text{ and all } a \in F \]

Lemma 1.25. If $T : V \to W$ is a linear transformation, then $T(0) = 0$ and
\[ T\Big( \sum_{i=1}^k a_i v_i \Big) = \sum_{i=1}^k a_i T(v_i) \]
for all $a_1, \dots, a_k \in F$ and all $v_1, \dots, v_k \in V$.

In particular, if $V$ is finite-dimensional with basis $\{v_1, \dots, v_n\}$, then a linear transformation $T : V \to W$ is uniquely determined by the images $T(v_1), \dots, T(v_n)$ of the basis vectors, and these can be chosen arbitrarily.

For any linear transformation $T : V \to W$ we have two important subspaces:

- The range or image of $T$, denoted by $T(V)$ or $\mathrm{im}(T)$, is the subspace of $W$ defined by
\[ T(V) := \{ T(v) \mid v \in V \} \]
- The nullspace or kernel of $T$, denoted by $n(T)$ or $\ker(T)$, is the subspace of $V$ defined by
\[ n(T) := \{ v \in V \mid T(v) = 0 \} \]

If $V$ is finite-dimensional, we have the following relation between the dimensions of these subspaces:
\[ \dim n(T) + \dim T(V) = \dim(V) \]
We often refer to $\dim T(V)$ as the rank of $T$ and denote it by $\mathrm{rank}(T)$. Similarly, we define the nullity of $T$ by $\mathrm{null}(T) = \dim(n(T))$.

Example 1.26. Let $A \in M_{mn}(F)$ be an $m \times n$ matrix. The map $L_A : F^n \to F^m$ defined by $L_A(v) = Av$ is a linear transformation with $\mathrm{im}(L_A) = R(A)$ and $n(L_A) = n(A)$.
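
The rank-nullity relation $\dim n(T) + \dim T(V) = \dim(V)$ is easy to confirm for the maps $L_A$ of Example 1.26. A quick SymPy sketch with an illustrative matrix:

```python
from sympy import Matrix

A = Matrix([[1, 2, 0, 1],
            [0, 1, 1, 0],
            [1, 3, 1, 1]])        # L_A : Q^4 -> Q^3

rank = A.rank()                   # dim im(L_A) = dim R(A)
nullity = len(A.nullspace())      # dim n(L_A) = dim n(A)

assert rank + nullity == A.cols   # rank-nullity: here 2 + 2 == 4
print(rank, nullity)
```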

For arbitrary subspaces $V_0 \subseteq V$ and $W_0 \subseteq W$, we can define the image of $V_0$ under $T$:
\[ T(V_0) = \{ T(v) \mid v \in V_0 \} \]
as well as the pre-image of $W_0$ under $T$:
\[ T^{-1}(W_0) = \{ v \in V \mid T(v) \in W_0 \} \]
Note that $T(V_0)$ is a subspace of $W$, while $T^{-1}(W_0)$ is a subspace of $V$.

We say that a linear transformation $T : V \to W$ is one-to-one or injective if the following condition holds:
\[ T(v_1) = T(v_2) \implies v_1 = v_2 \]
We say that $T$ is onto or surjective if $T(V) = W$, which is equivalent to saying that for every $w \in W$ there exists some $v \in V$ such that $w = T(v)$. A linear transformation $T$ which is both injective and surjective is called bijective, or an isomorphism.

Lemma 1.27. A linear transformation $T : V \to W$ is injective if and only if $n(T) = \{0\}$. $T$ is surjective if and only if $\mathrm{rank}(T) = \dim(W)$.

Lemma 1.28. There exists an isomorphism $T : V \to W$ if and only if there exists an isomorphism $S : W \to V$.

If an isomorphism between vector spaces $V$ and $W$ exists, we say that they are isomorphic vector spaces. Two finite-dimensional vector spaces over $F$ are isomorphic if and only if they have the same dimension.

Let $X, Y$ be sets and consider a function $f : X \to Y$. We say that $f$ has a left inverse $g : Y \to X$ if $g \circ f = \mathrm{id}_X$. Similarly, $f$ has a right inverse $h : Y \to X$ if $f \circ h = \mathrm{id}_Y$. The function $f$ is invertible if it has a two-sided inverse $k : Y \to X$ such that $f \circ k = \mathrm{id}_Y$ and $k \circ f = \mathrm{id}_X$.

Lemma 1.29. If $f : X \to Y$ is invertible, any left inverse is equal to any right inverse. In particular, two-sided inverses are unique and are written as $f^{-1} : Y \to X$.

Proposition 1.30. Let $f : X \to Y$ be a function.

1. $f$ has a left inverse if and only if $f$ is injective
2. $f$ has a right inverse if and only if $f$ is surjective
3. $f$ is invertible if and only if $f$ is bijective

Theorem 1.31. Let $T : V \to W$ be a linear transformation between finite-dimensional vector spaces with $\dim(V) = \dim(W) = n$. The following statements are equivalent:

1. $T$ is bijective
2. $T$ is injective
3. $T$ is surjective
4. $\mathrm{null}(T) = 0$
5. $\mathrm{rank}(T) = n$

Proposition 1.32. Given a basis $\beta = \{v_1, \dots, v_n\}$ of a vector space $V$ and arbitrary vectors $w_1, \dots, w_n \in W$, there exists a unique linear transformation $T : V \to W$ with $T(v_i) = w_i$, $i = 1, \dots, n$. Explicitly, this is given by $T(\sum_{i=1}^n a_i v_i) = \sum_{i=1}^n a_i w_i$.

Theorem 1.33. Let $T : V \to W$ be a linear transformation and let $\beta = \{v_1, \dots, v_n\}$ be a basis of $V$. Let $T(\beta) = \{T(v_1), \dots, T(v_n)\}$. Then:

1. $T$ is injective if and only if $T(\beta)$ is linearly independent
2. $T$ is surjective if and only if $T(\beta)$ spans $W$
3. $T$ is bijective if and only if $T(\beta)$ is a basis of $W$

Let $T : V \to W$ be a linear transformation and let $V_0$ be a subspace of $V$. Then $T|_{V_0} : V_0 \to W$, defined by $T|_{V_0}(v) = T(v)$, is called the restriction of $T$ to $V_0$. $T|_{V_0}$ is a linear transformation, with nullspace $n(T|_{V_0}) = n(T) \cap V_0$ and range $T(V_0)$.

Example 1.34. Let $A \in M_{mn}(F)$ and let $P \in M_m(F)$ be invertible. Restricting $L_P : F^m \to F^m$ to the subspace $R(A) \subseteq F^m$ provides an isomorphism $L_P|_{R(A)} : R(A) \to R(PA)$, and so $\mathrm{rank}(PA) = \mathrm{rank}(A)$.

Theorem 1.35. Consider a matrix $A \in M_{mn}(F)$ such that its reduced row echelon form $\mathrm{RREF}(A)$ has pivot columns $j_1, \dots, j_r$. Then $R(A)$ has a basis given by the corresponding columns $\{a_{j_1}, \dots, a_{j_r}\}$ of $A$, and $\mathrm{rank}(A)$ is the number of non-zero rows in $\mathrm{RREF}(A)$.

Theorem 1.36. If $A \in M_{mn}(F)$ and $\mathrm{RREF}(A)$ has $r$ non-zero rows, then $\mathrm{Row}(A) = R(A^T)$ has dimension $r$, with a basis given by the non-zero rows of $\mathrm{RREF}(A)$. It follows from this theorem that $\mathrm{rank}(A) = \mathrm{rank}(A^T)$.

Theorem 1.37. If $A \in M_{mn}(F)$ and $\mathrm{RREF}(A)$ has pivot columns $j_1, \dots, j_r$, then $n(A) = n(\mathrm{RREF}(A))$ has dimension $n - r$. Letting $S$ be the set of indices of the non-pivot columns of $\mathrm{RREF}(A)$, and writing $b_{ik}$ for the entries of $\mathrm{RREF}(A)$, a basis of $n(A)$ is given by
\[ \Big\{ e_k - \sum_{j_i < k} b_{ik} e_{j_i} \ \Big|\ k \in S \Big\} \]
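
Theorems 1.35-1.37 amount to a recipe for reading off bases of $R(A)$, $\mathrm{Row}(A)$ and $n(A)$ from $\mathrm{RREF}(A)$. A SymPy sketch of that recipe (illustrative matrix, not from the notes):

```python
from sympy import Matrix, zeros

A = Matrix([[1, 2, 1, 0],
            [2, 4, 0, 2],
            [3, 6, 1, 2]])

R, pivots = A.rref()                     # RREF(A) and its pivot columns

col_basis = [A[:, j] for j in pivots]    # Theorem 1.35: pivot columns of A
row_basis = [R[i, :] for i in range(len(pivots))]   # Theorem 1.36
null_basis = A.nullspace()               # Theorem 1.37: dimension n - r

assert len(col_basis) == len(row_basis) == A.rank()
assert len(null_basis) == A.cols - A.rank()
for v in null_basis:
    assert A * v == zeros(3, 1)
print(pivots)                            # (0, 2)
```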

Theorem 1.38. Let $U : X \to V$, $T : V \to W$, $S : W \to Y$ be linear transformations.

1. $\mathrm{im}(TU) \subseteq \mathrm{im}(T)$, and if $U$ is surjective, $\mathrm{im}(TU) = \mathrm{im}(T)$
2. $n(T) \subseteq n(ST)$, and if $S$ is injective, $n(T) = n(ST)$
3. $\mathrm{rank}(STU) = \mathrm{rank}(T)$ if $S$ is injective and $U$ is surjective

1.3 The Matrix of a Linear Transformation

For $F$-vector spaces $V, W$, we define $\mathcal{L}(V, W)$ to be the set of all linear transformations from $V$ to $W$. To simplify notation, we write $\mathcal{L}(V, V) =: \mathcal{L}(V)$. This set becomes an $F$-vector space itself if we define the sum $S + T$ by $(S + T)(v) = S(v) + T(v)$ and the scalar multiple $aT$ by $(aT)(v) = aT(v)$.

Choosing a basis $\beta = \{v_1, \dots, v_n\}$ for $V$, we define the coordinate map $\psi_\beta : V \to F^n$ to be the $F$-linear isomorphism sending $\sum_{i=1}^n a_i v_i$ to the column vector $[a_1, \dots, a_n]^T$. For an arbitrary vector $v \in V$, we write $\psi_\beta(v) = [v]_\beta$.

Our goal is to show that if $\dim(V) = n$ and $\dim(W) = m$, then $\mathcal{L}(V, W)$ is isomorphic to $M_{mn}(F)$, and that this isomorphism depends on the choice of bases for $V$ and $W$. So, we begin by choosing a basis $\beta = \{v_1, \dots, v_n\}$ of $V$ and a basis $\gamma = \{w_1, \dots, w_m\}$ of $W$. Let $\psi_\beta : V \to F^n$ and $\psi_\gamma : W \to F^m$ be the respective coordinate maps for $\beta$ and $\gamma$. Now, for a linear transformation $T \in \mathcal{L}(V, W)$, we consider the following diagram:

    V  ---T-->  W
    |           |
    psi_beta    psi_gamma
    |           |
    v           v
    F^n --A--> F^m

We want to define a matrix $A \in M_{mn}(F)$ such that this diagram commutes, i.e. such that $\psi_\gamma \circ T = L_A \circ \psi_\beta$. For ease of notation, we refer to the linear transformation $L_A : F^n \to F^m$ simply by $A : F^n \to F^m$.

We start with a basis vector $v_j$ from $\beta$. Moving along the top of the diagram, $v_j$ is mapped first to $T(v_j)$, which we can write uniquely in terms of the basis $\gamma$, that is,
\[ T(v_j) = a_{1j} w_1 + a_{2j} w_2 + \dots + a_{mj} w_m, \quad \text{for some } a_{1j}, \dots, a_{mj} \in F \]
Next, the coordinate map $\psi_\gamma$ maps $T(v_j)$ to the column vector $[a_{1j}, \dots, a_{mj}]^T$. So, the composition of these maps gives $\psi_\gamma(T(v_j)) = [a_{1j}, \dots, a_{mj}]^T$. Moving the other way around the diagram, the coordinate map $\psi_\beta$ maps $v_j$ to the $j$-th standard basis vector $e_j \in F^n$, viewed as a column vector. If we multiply this column vector by $A$ (on the left), we obtain the $j$-th column of $A$. So, if we want the diagram to commute, the entries in the $j$-th column of $A$ must be equal to $a_{1j}, \dots, a_{mj}$. Repeating this process for all $j = 1, \dots, n$, we see that $A$ must be equal to the matrix $(a_{ij})$.
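
The column-by-column construction above can be carried out mechanically. As an illustrative sketch (the choice of $T$ is ours, not the notes'), take $T = d/dx : P_3(\mathbb{Q}) \to P_2(\mathbb{Q})$ with standard bases $\beta = \{1, x, x^2, x^3\}$ and $\gamma = \{1, x, x^2\}$; the $j$-th column of $[T]^\gamma_\beta$ holds the $\gamma$-coordinates of $T(v_j)$:

```python
from sympy import Matrix, Poly, diff, symbols

x = symbols('x')
beta = [1, x, x**2, x**3]          # basis of P_3(Q)
gamma_deg = 2                      # gamma = {1, x, x^2}, basis of P_2(Q)

def coords(p, deg):
    """Coordinates of a polynomial in the standard basis {1, x, ..., x^deg}."""
    c = Poly(p, x).all_coeffs()[::-1]              # low-degree first
    return Matrix(list(c) + [0] * (deg + 1 - len(c)))

# j-th column of [T]^gamma_beta = gamma-coordinates of T(v_j) = v_j':
cols = [coords(diff(v, x), gamma_deg) for v in beta]
M = Matrix.hstack(*cols)
print(M)   # Matrix([[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]])

# Sanity check on p(x) = 5 + x - 2x^3: coordinates commute with T.
p = 5 + x - 2*x**3
assert M * coords(p, 3) == coords(diff(p, x), gamma_deg)
```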

To emphasize the dependence of the matrix $(a_{ij})$ not only on the transformation $T$ but also on the bases $\beta$ and $\gamma$, we denote it by
\[ [T]^\gamma_\beta := (a_{ij}) \in M_{mn}(F) \]
(many possible notations are commonly used, but we will try to stick with this one).

Proposition 1.39. The map $T \mapsto [T]^\gamma_\beta$ is a linear transformation and provides an isomorphism between $\mathcal{L}(V, W)$ and $M_{mn}(F)$. In particular, we have
\[ \dim \mathcal{L}(V, W) = \dim(V) \dim(W) \]

If $T : V \to V$ and we consider only the basis $\beta$ of $V$, we write $[T]_\beta := [T]^\beta_\beta$. Given two bases $\beta, \gamma$ of the same vector space $V$, the matrix $[\mathrm{id}_V]^\gamma_\beta$ is called the change of basis matrix from $\beta$ to $\gamma$.

Proposition 1.40. If $\beta$ is a basis for the $n$-dimensional vector space $V$, $\gamma$ is a basis for the $m$-dimensional vector space $W$, and $T \in \mathcal{L}(V, W)$, then:

1. The coordinate map $\psi_\beta$ maps the subspace $n(T) \subseteq V$ isomorphically onto the subspace $n([T]^\gamma_\beta) \subseteq F^n$.
2. The coordinate map $\psi_\gamma$ maps the subspace $T(V) \subseteq W$ isomorphically onto the subspace $R([T]^\gamma_\beta) \subseteq F^m$.

Suppose we have another linear transformation $S : W \to U$ and a basis $\lambda = \{u_1, \dots, u_p\}$ of $U$. What is the matrix associated with the composition $S \circ T$? We consider the following diagram, this time with two squares (for this diagram, we need both squares to commute as well as the outer rectangle):

    V  ---T-->  W  ---S-->  U
    |           |           |
    psi_beta    psi_gamma   psi_lambda
    |           |           |
    v           v           v
    F^n --A--> F^m --B--> F^p

Each square commutes by the previous argument, so we need only consider the outer rectangle in the diagram. In order for this rectangle to commute, we must have
\[ \psi_\lambda \circ S \circ T = L_B \circ L_A \circ \psi_\beta \]
We know that $L_B \circ L_A = L_{BA}$, and furthermore, from the individual squares we must have $A = [T]^\gamma_\beta$ and $B = [S]^\lambda_\gamma$. Thus, we obtain
\[ [ST]^\lambda_\beta = [S]^\lambda_\gamma [T]^\gamma_\beta \]
The following proposition lists the basic algebraic properties of the matrix of a linear transformation.
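
Continuing the differentiation example from the previous page, the composition rule $[ST]^\lambda_\beta = [S]^\lambda_\gamma [T]^\gamma_\beta$ can be confirmed on a second hypothetical map $S : P_2 \to P_3$, multiplication by $x$ (again our choice of example, in standard bases):

```python
from sympy import Matrix, Poly, diff, symbols

x = symbols('x')

def coords(p, deg):
    c = Poly(p, x).all_coeffs()[::-1]
    return Matrix(list(c) + [0] * (deg + 1 - len(c)))

def matrix_of(T, basis, out_deg):
    """Matrix of T in standard bases, built column by column."""
    return Matrix.hstack(*[coords(T(v), out_deg) for v in basis])

P3 = [1, x, x**2, x**3]
P2 = [1, x, x**2]

A = matrix_of(lambda p: diff(p, x), P3, 2)        # [T]^gamma_beta, T = d/dx
B = matrix_of(lambda p: x * p, P2, 3)             # [S]^lambda_gamma, S = mult. by x
BA = matrix_of(lambda p: x * diff(p, x), P3, 3)   # [ST]^lambda_beta, directly

assert BA == B * A                                # [ST] = [S][T]
print(B * A)
```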

Proposition 1.41.

1. If $\dim(V) = n$ and $\beta$ is a basis of $V$, then $[\mathrm{id}_V]^\beta_\beta = I_n$.
2. For $X \in M_{mn}(F)$, $X = [L_X]^{\gamma_0}_{\beta_0}$, where $\beta_0$ and $\gamma_0$ are the standard bases of $F^n$ and $F^m$, respectively.
3. Let $V$ and $W$ be finite-dimensional $F$-vector spaces with bases $\beta$ and $\gamma$ respectively. If $T, U : V \to W$ are linear transformations, then $[T + U]^\gamma_\beta = [T]^\gamma_\beta + [U]^\gamma_\beta$ and $[aT]^\gamma_\beta = a[T]^\gamma_\beta$ for all $a \in F$.
4. $T$ is an isomorphism if and only if $[T]^\gamma_\beta$ is invertible. In this case, $[T^{-1}]^\beta_\gamma = ([T]^\gamma_\beta)^{-1}$.
5. If $\beta$ is a fixed basis of $V$ and $P \in M_n(F)$ is an invertible matrix with $n = \dim(V)$, then there exists another basis $\beta'$ of $V$ such that $P = [\mathrm{id}_V]^\beta_{\beta'}$. In fact, $\beta' = \psi_\beta^{-1}(\{p_1, \dots, p_n\})$, where $\{p_1, \dots, p_n\}$ is the set of columns of $P$.

Given a matrix $A \in M_{mn}(F)$ and bases $\beta = \{v_1, \dots, v_n\}$ and $\gamma = \{w_1, \dots, w_m\}$ for $V$ and $W$ respectively, how do we find the transformation $T \in \mathcal{L}(V, W)$ such that $A = [T]^\gamma_\beta$? For $v \in V$, we can find unique coefficients $a_1, \dots, a_n$ such that $v = \sum_{i=1}^n a_i v_i$. Then $\psi_\beta(v) = [a_1, \dots, a_n]^T$ and
\[ [T]^\gamma_\beta\, [a_1, \dots, a_n]^T = \Big[ \sum_{k=1}^n ([T]^\gamma_\beta)_{1k} a_k,\ \dots,\ \sum_{k=1}^n ([T]^\gamma_\beta)_{mk} a_k \Big]^T \]
Finally, applying $\psi_\gamma^{-1}$ to $[T]^\gamma_\beta(\psi_\beta(v))$ will produce $T(v)$, by commutativity of the square diagram:
\[ T(v) = \psi_\gamma^{-1}\big( [T]^\gamma_\beta\, \psi_\beta(v) \big) = \sum_{j=1}^m \Big( \sum_{k=1}^n ([T]^\gamma_\beta)_{jk} a_k \Big) w_j \]

We define two matrices $X, Y \in M_{mn}(F)$ to be row and column equivalent if $PXQ = Y$ for some invertible matrices $P \in M_m(F)$ and $Q \in M_n(F)$.

Theorem 1.42.

1. Row and column equivalence defines an equivalence relation on $M_{mn}(F)$.
2. $[T]^\gamma_\beta$ and $[T]^{\gamma'}_{\beta'}$ are row and column equivalent matrices.
3. Suppose $X$ and $Y$ are row and column equivalent $m \times n$ matrices with $Y = PXQ$ for invertible matrices $P \in M_m(F)$ and $Q \in M_n(F)$. Fixing bases $\beta$ of $V$ and $\gamma$ of $W$, there exists a unique linear transformation $T : V \to W$ such that $X = [T]^\gamma_\beta$, and there exist bases $\beta'$ of $V$ and $\gamma'$ of $W$ such that $Y = [T]^{\gamma'}_{\beta'}$.
4. In particular, if we take $\beta'$ to be (the $\psi_\beta$-preimages of) the columns of $Q$ and $\gamma'$ to be (the $\psi_\gamma$-preimages of) the columns of $P^{-1}$, we have $[T]^{\gamma'}_{\beta'} = Y$.

Proposition 1.43. For a linear transformation $T : V \to W$, there exist bases $\beta$ of $V$ and $\gamma$ of $W$ such that
\[ [T]^\gamma_\beta = \mathrm{diag}(I_r, 0) \]
where $r = \mathrm{rank}(T)$. This implies that the equivalence classes of matrix equivalence on $M_{mn}(F)$ can be represented by matrices of the form $\mathrm{diag}(I_r, 0) \in M_{mn}(F)$, where $0 \le r \le \min\{m, n\}$.

We define matrices $A, B \in M_n(F)$ to be similar if there exists an invertible matrix $P$ such that $P^{-1}AP = B$.

Theorem 1.44.

1. Similarity defines an equivalence relation on $M_n(F)$.
2. For any two bases $\beta, \beta'$ of $V$ and any $T \in \mathcal{L}(V, V)$, $[T]_\beta$ and $[T]_{\beta'}$ are similar.
3. Suppose $A$ and $B$ are similar $n \times n$ matrices with $B = P^{-1}AP$ for some invertible $P \in M_n(F)$. Fixing a basis $\beta$ of $V$, there exists a unique $T \in \mathcal{L}(V, V)$ such that $A = [T]_\beta$, and there exists another basis $\beta'$ of $V$ such that $B = [T]_{\beta'}$.
4. In particular, if we take $\beta'$ to be (the $\psi_\beta$-preimages of) the columns of $P$, we have $[T]_{\beta'} = B$.

1.4 Determinants

For this section, we assume that $F = \mathbb{R}$ or $\mathbb{C}$. For a function $D : M_n(F) \to F$, we may identify $M_n(F)$ with $(F^n)^n$ via $A = [a_1, \dots, a_n] \mapsto (a_1, \dots, a_n)$. Under this identification, we call $D$ a determinantal mapping if it satisfies the following set of properties:

1. (multilinear in the columns) For each $1 \le i \le n$, the map $r_{i,A} : F^n \to F$ is $F$-linear, where
\[ r_{i,A}(x) = D(a_1, \dots, a_{i-1}, x, a_{i+1}, \dots, a_n) \]
By this we mean that if all columns $j \ne i$ are fixed, $D$ is linear in the $i$-th column, for $i = 1, \dots, n$.

2. (alternating) For all $1 \le i < j \le n$, $r_{i,j,A}(a_i, a_j) = -r_{i,j,A}(a_j, a_i)$, where
\[ r_{i,j,A}(x, y) = D(a_1, \dots, a_{i-1}, x, a_{i+1}, \dots, a_{j-1}, y, a_{j+1}, \dots, a_n) \]
By this, we mean that if we exchange two columns, $D$ changes sign.

3. (identity) $D(I_n) = D(e_1, \dots, e_n) = 1$

Lemma 1.45. Condition 2 of the above definition is equivalent to the following statement: for all $1 \le i < j \le n$, $r_{i,j,A}(x, x) = 0$. In other words, $D$ is zero on any matrix having two equal columns.

A permutation $\sigma$ of the set $\{1, 2, \dots, n\}$ is a bijection
\[ \sigma : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\} \]
The set of all permutations of $\{1, 2, \dots, n\}$ is denoted by $S_n$. We can represent any $\sigma \in S_n$ as an $n$-tuple $(\sigma(1), \sigma(2), \dots, \sigma(n))$; this is commonly called one-line notation. Since $\sigma(i) = \sigma(j) \implies i = j$, we have $|S_n| = n!$.

Lemma 1.46. $S_n$ is a group under composition.

A transposition $\tau \in S_n$ is an element of $S_n$ which interchanges two elements $i \ne j$ and fixes all other elements of $\{1, 2, \dots, n\}$.

Proposition 1.47. Every $\sigma \in S_n$ can be written as a finite product of transpositions.

Let $\sigma \in S_n$. The inversion set of $\sigma$ is defined by
\[ I(\sigma) := \{ (i, j) : 1 \le i < j \le n,\ \sigma(i) > \sigma(j) \} \]
In other words, the inversion set of $\sigma$ is the set of ordered pairs $(i, j)$ whose order is flipped by $\sigma$. With this, we define the sign of $\sigma$ by
\[ \epsilon_\sigma = (-1)^{|I(\sigma)|} \]
Note that $I(\tau) = \{(i, i+1)\}$ if $\tau$ is the adjacent transposition exchanging $i$ and $i + 1$; more generally, $|I(\tau)|$ is odd for any transposition $\tau$, so $\epsilon_\tau = -1$.

Proposition 1.48. For $\sigma, \tau \in S_n$, $\epsilon_{\sigma\tau} = \epsilon_\sigma \epsilon_\tau$ and $\epsilon_{\sigma^{-1}} = \epsilon_\sigma$.

Theorem 1.49. A determinantal mapping on $M_n(F)$ is unique. It is given by
\[ D(A) = \sum_{\sigma \in S_n} \epsilon_\sigma\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} \]
Since the map is unique, we call it the determinant, and denote it by $\det : M_n(F) \to F$.

Theorem 1.50. Let $A \in M_n(F)$. For all $i = 1, \dots, n$,
\[ \det(A) = \sum_{j=1}^n (-1)^{i+j} a_{ij} \det(A_{ij}) \]
where $A_{ij}$ is the $(n-1) \times (n-1)$ matrix obtained from $A$ by removing the $i$-th row and $j$-th column. This is the cofactor expansion across row $i$ of $A$.

Proposition 1.51. For $A \in M_n(F)$, $\det(A^T) = \det(A)$.
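
Theorem 1.49's formula is directly computable for small $n$. The sketch below (illustrative; plain Python plus a SymPy cross-check) computes $\epsilon_\sigma$ by counting the inversion set $I(\sigma)$ exactly as defined above:

```python
from itertools import permutations
from sympy import Matrix

def sign(sigma):
    """epsilon_sigma = (-1)^{|I(sigma)|}, where I(sigma) is the inversion set."""
    inversions = sum(1 for i in range(len(sigma))
                       for j in range(i + 1, len(sigma))
                       if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det_leibniz(A):
    """det(A) = sum over sigma in S_n of eps_sigma * a_{sigma(1),1} ... a_{sigma(n),n}."""
    n = A.rows
    total = 0
    for sigma in permutations(range(n)):      # sigma in one-line notation
        term = sign(sigma)
        for col in range(n):
            term *= A[sigma[col], col]
        total += term
    return total

A = Matrix([[2, 1, 0], [1, 3, 1], [0, 1, 4]])
assert det_leibniz(A) == A.det()              # both give 18
print(det_leibniz(A))
```

Note that summing over all of $S_n$ costs on the order of $n \cdot n!$ operations, so this formula is a theoretical tool; in practice one computes determinants by row reduction or cofactor expansion.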

Corollary 1.52. Let $A \in M_n(F)$. For all $j = 1, \dots, n$,
\[ \det(A) = \sum_{i=1}^n (-1)^{i+j} a_{ij} \det(A_{ij}) \]
Note that this time we are summing over $i$ instead of $j$. This is the cofactor expansion down column $j$ of $A$.

Proposition 1.53. For $A, B \in M_n(F)$, $\det(AB) = \det(A)\det(B)$.

Proposition 1.54. $A$ is invertible if and only if $\det(A) \ne 0$.

Example 1.55. A matrix $A \in M_n(\mathbb{R})$ is called orthogonal if $AA^T = I_n$. If $A$ is orthogonal, then $\det(A) = \pm 1$.

Example 1.56. A matrix $A \in M_n(\mathbb{C})$ is called nilpotent if $A^k = 0$ for some positive integer $k$ (note that we consider $0$ here to be the $n \times n$ zero matrix). If $A$ is nilpotent, then $\det(A) = 0$.

The adjoint of a matrix $A \in M_n(F)$ is an $n \times n$ matrix denoted by $\mathrm{adj}(A) \in M_n(F)$, whose entries are given by $\mathrm{adj}(A)_{ij} = (-1)^{i+j} \det(A_{ji})$. It is the transpose of the matrix of cofactors.

Proposition 1.57. Let $x \in F^n$. Then $(\mathrm{adj}(A)\,x)_j = \det [A, x]_j$, where
\[ [A, x]_j := [a_1, \dots, a_{j-1}, x, a_{j+1}, \dots, a_n] \]
Also, $(x^T \mathrm{adj}(A))_i = \det \begin{bmatrix} A \\ x^T \end{bmatrix}_i$, where $\begin{bmatrix} A \\ x^T \end{bmatrix}_i$ is the matrix obtained from $A$ by replacing its $i$-th row $A_i$ with $x^T$:
\[ \begin{bmatrix} A \\ x^T \end{bmatrix}_i = \begin{bmatrix} A_1 \\ \vdots \\ A_{i-1} \\ x^T \\ A_{i+1} \\ \vdots \\ A_n \end{bmatrix} \]

Proposition 1.58. Let $A \in M_n(F)$. Then $\mathrm{adj}(A)A = \det(A)I_n = A\,\mathrm{adj}(A)$. If $A$ is invertible, then
\[ A^{-1} = \frac{1}{\det(A)}\, \mathrm{adj}(A) \]

Proposition 1.59 (Cramer's Rule). Let $A \in M_n(F)$ be invertible and let $b \in F^n$. The $j$-th component of the unique solution to $Ax = b$ is given by
\[ x_j = \frac{\det [A, b]_j}{\det(A)} \quad \text{for all } 1 \le j \le n \]
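
Cramer's rule is only a few lines of code. A minimal SymPy sketch with an illustrative invertible system:

```python
from sympy import Matrix

def cramer_solve(A, b):
    """Solve Ax = b via x_j = det([A, b]_j) / det(A), for A invertible."""
    d = A.det()
    assert d != 0, "A must be invertible"
    xs = []
    for j in range(A.cols):
        Aj = A.copy()
        Aj[:, j] = b            # [A, b]_j: replace column j by b
        xs.append(Aj.det() / d)
    return Matrix(xs)

A = Matrix([[2, 1, 1], [1, 3, 2], [1, 0, 0]])
b = Matrix([4, 5, 6])
x = cramer_solve(A, b)
assert A * x == b               # x = (6, 15, -23)
print(x.T)
```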

1.5 Diagonalizability

For this section we assume $V$ is an $n$-dimensional $F$-vector space, $T \in \mathcal{L}(V)$ and $A \in M_n(F)$. Define the characteristic polynomial of $A \in M_n(F)$ by $c_A(x) = \det(A - xI_n)$, and the trace of $A$ by $\mathrm{tr}(A) = \sum_{i=1}^n a_{ii}$.

An eigenvector of $T : V \to V$ is a non-zero vector $v \in V$ such that $T(v) = \lambda v$ for some $\lambda \in F$. The associated scalar $\lambda$ is called an eigenvalue of $T$. For a matrix $A \in M_n(F)$, the eigenvectors and eigenvalues of $A$ are defined to be those of the linear transformation $L_A : F^n \to F^n$. The $\lambda$-eigenspace of $T$ is $E_\lambda(T) := n(T - \lambda\, \mathrm{id}_V)$. The eigenspace is a subspace of $V$ consisting of $0$ together with all $\lambda$-eigenvectors. Correspondingly, the $\lambda$-eigenspace of $A$ is $E_\lambda(A) = n(A - \lambda I_n)$.

Proposition 1.60. For $A \in M_n(F)$, $c_A(x)$ is a polynomial of degree $n$ with leading coefficient $(-1)^n$. Its roots are the eigenvalues of $A$.

Similar matrices have the same rank, trace, determinant, characteristic polynomial and eigenvalues. Thus, for $T \in \mathcal{L}(V)$, we may define its rank, trace, determinant, characteristic polynomial and eigenvalues as those of $[T]_\beta$ for any basis $\beta$ of $V$.

For a polynomial $p(x) = \sum_{i=0}^k c_i x^i \in P(F)$, define $p(T) = \sum_{i=0}^k c_i T^i \in \mathcal{L}(V)$, where $T^0 = \mathrm{id}_V$, and $p(A) = \sum_{i=0}^k c_i A^i \in M_n(F)$, where $A^0 = I_n$.

Proposition 1.61. Let $T \in \mathcal{L}(V)$, where $V$ is an $n$-dimensional $F$-vector space with basis $\beta$. For any $p \in P(F)$:

1. $[p(T)]_\beta = p([T]_\beta)$
2. $\psi_\beta$ maps $n(p(T))$ isomorphically onto $n(p([T]_\beta))$; in particular, $\psi_\beta$ maps $E_\lambda(T)$ isomorphically onto $E_\lambda([T]_\beta)$
3. $\psi_\beta$ maps $\mathrm{im}(p(T))$ isomorphically onto $R(p([T]_\beta))$

Lemma 1.62. If $\lambda$ is an eigenvalue of $T$ and $p \in P(F)$, then $p(\lambda)$ is an eigenvalue of $p(T)$.

Theorem 1.63 (Cayley-Hamilton Theorem). For a linear transformation $T : V \to V$ with characteristic polynomial $c_T(x)$, we have $c_T(T) = 0$. Similarly, for a matrix $A \in M_n(F)$, $c_A(A) = 0$.

Remark 1.64. Note that the "proof" $c_A(A) = \det(A I_n - A) = \det(0) = 0$, while tempting, makes no mathematical sense. One way to see this is that $c_A(A) \in M_n(F)$, while $\det(0) \in F$. So this equality can only make sense in the unique case $n = 1$.

Lemma 1.65. $T : V \to V$ is invertible if and only if $0$ is not an eigenvalue of $T$. If $T$ is invertible, the eigenvalues of $T^{-1}$ are the inverses of those of $T$. That is, $\lambda$ is an eigenvalue of $T$ if and only if $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.
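
Theorem 1.63 can be checked experimentally by substituting a matrix into its own characteristic polynomial, with $A^0 = I_n$ as above. A SymPy sketch with an arbitrary $2 \times 2$ matrix:

```python
from sympy import Matrix, Poly, eye, symbols, zeros

x = symbols('x')
A = Matrix([[1, 2], [3, 4]])
n = A.rows

# c_A(x) = det(A - x I_n), the convention used in these notes:
cA = (A - x * eye(n)).det()
print(cA.expand())                        # x**2 - 5*x - 2

# Evaluate c_A at the matrix A, using powers of A with A^0 = I_n:
coeffs = Poly(cA, x).all_coeffs()[::-1]   # c_0, c_1, ..., c_n
value = zeros(n, n)
for i, c in enumerate(coeffs):
    value += c * A**i

assert value == zeros(n, n)               # Cayley-Hamilton: c_A(A) = 0
```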

Proposition 1.66. $T : V \to V$ is diagonalizable if and only if there exists a basis $\beta$ of $V$ consisting of eigenvectors of $T$. In fact, if $\beta = \{v_1, \dots, v_n\}$ is a basis of $V$, then $[T]_\beta = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ if and only if $T v_i = \lambda_i v_i$ for all $i = 1, \dots, n$. Furthermore, if there exists a basis $\beta$ of $F^n$ consisting of eigenvectors of $A$, then $P = [\mathrm{id}_{F^n}]^{\beta_0}_\beta$ is an invertible matrix such that $P^{-1}AP$ is diagonal (where $\beta_0$ is the standard basis of $F^n$).

We say that a polynomial $p(x) \in P(F)$ splits if $p(x) = c(x - a_1) \cdots (x - a_n)$ for some $c, a_1, \dots, a_n \in F$. Note that by the Fundamental Theorem of Algebra, all polynomials in $P(\mathbb{C})$ split.

Theorem 1.67. If $T : V \to V$ is diagonalizable, then $c_T(x)$ splits.

For $T \in \mathcal{L}(V)$, if the characteristic polynomial $c_T(x)$ splits, the algebraic multiplicity of an eigenvalue $\lambda$ of $T$ is given by its multiplicity as a root of $c_T(x)$. That is, if $\lambda_1, \dots, \lambda_k$ are the distinct eigenvalues of $T$ and $c_T(x)$ splits, then $c_T(x) = (-1)^n \prod_{i=1}^k (x - \lambda_i)^{m_i}$, where $m_i$ is the algebraic multiplicity of $\lambda_i$. The geometric multiplicity of an eigenvalue $\lambda_i$ of $T$ is the dimension of $E_{\lambda_i}(T)$.

Theorem 1.68. Let $\lambda$ be an eigenvalue of $T \in \mathcal{L}(V)$. If $d_\lambda = \dim(E_\lambda(T))$ and $m_\lambda$ is the algebraic multiplicity of $\lambda$, then $d_\lambda \le m_\lambda$. If $c_T$ splits, then $T$ is diagonalizable if and only if $d_{\lambda_i} = m_{\lambda_i}$ for every distinct eigenvalue $\lambda_i$ of $T$.

By combining these last few results we are able to give a complete answer to the question: when is a linear transformation $T : V \to V$ diagonalizable? We call this the Test for Diagonalization: let $T$ be a linear operator on an $n$-dimensional $F$-vector space $V$. Then $T$ is diagonalizable if and only if both of the following conditions hold:

1. The characteristic polynomial of $T$ splits.
2. For each eigenvalue $\lambda$ of $T$, $m_\lambda = n - \mathrm{rank}(T - \lambda\, \mathrm{id}_V)$.
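
The Test for Diagonalization is directly implementable. A SymPy sketch (assuming $F = \mathbb{C}$, so that by the Fundamental Theorem of Algebra condition 1 holds automatically; the example matrices are ours):

```python
from sympy import Matrix, eye, roots, symbols

x = symbols('x')

def is_diagonalizable(A):
    """Test for Diagonalization over F = C: check that
    m_lambda == n - rank(A - lambda*I) for every eigenvalue lambda."""
    n = A.rows
    cA = (A - x * eye(n)).det()
    rts = roots(cA, x)             # {eigenvalue: algebraic multiplicity m_lambda}
    assert sum(rts.values()) == n  # c_A splits over C
    return all(m == n - (A - lam * eye(n)).rank() for lam, m in rts.items())

D = Matrix([[2, 0, 0], [0, 5, 0], [0, 0, 5]])   # diagonalizable
J = Matrix([[5, 1, 0], [0, 5, 0], [0, 0, 2]])   # Jordan block: d_5 = 1 < m_5 = 2
print(is_diagonalizable(D), is_diagonalizable(J))   # True False
```

The matrix `J` fails exactly because its geometric multiplicity at $\lambda = 5$ is $3 - \mathrm{rank}(J - 5I) = 1$, which is strictly less than the algebraic multiplicity $m_5 = 2$.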