Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra. Brief lecture notes.

Let $A$ and $B$ be $n \times n$ matrices. We say that $A$ is similar to $B$ if there is an invertible $n \times n$ matrix $P$ such that $P^{-1}AP = B$. If $A$ is similar to $B$, we write $A \sim B$.

Remarks

If $A \sim B$, we can write, equivalently, $A = PBP^{-1}$ or $AP = PB$.

The matrix $P$ depends on $A$ and $B$. It is not unique for a given pair of similar matrices $A$ and $B$. To see this, take $A = B = I$: then $I \sim I$, since $P^{-1}IP = I$ for any invertible matrix $P$.

Theorem 4.21. Let $A$, $B$ and $C$ be $n \times n$ matrices.
a. $A \sim A$.
b. If $A \sim B$, then $B \sim A$.
c. If $A \sim B$ and $B \sim C$, then $A \sim C$.

This means that $\sim$ is an equivalence relation. The main problem is to find a good representative in each equivalence class.

The real meaning of $P^{-1}AP$ is that it is the matrix of the same linear transformation (given in the standard basis by the matrix $A$) in a different basis, namely the basis consisting of the columns of $P$. This explains much better why many properties are the same for $A$ and $P^{-1}AP$.

Theorem 4.22. Let $A$ and $B$ be $n \times n$ matrices with $A \sim B$. Then
a. $\det A = \det B$.
b. $A$ is invertible if and only if $B$ is invertible.
c. $A$ and $B$ have the same rank.
d. $A$ and $B$ have the same characteristic polynomial.
e. $A$ and $B$ have the same eigenvalues.
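Parts (a) and (d) of Theorem 4.22 can be checked on a small case. The following is a minimal Python sketch, not part of the original notes: the matrices `A` and `P` are arbitrary illustrative choices, and the helper names `matmul`, `det2` and `inv2` are made up for this sketch. Exact rational arithmetic is used so that the comparisons are exact. For a $2 \times 2$ matrix $M$ the characteristic polynomial is $\lambda^2 - \operatorname{tr}(M)\lambda + \det M$, so equal trace and equal determinant mean equal characteristic polynomial.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def inv2(M):
    """Inverse of an invertible 2x2 matrix, computed exactly."""
    d = Fraction(det2(M))
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]


# Illustrative matrices (assumptions, not taken from the notes).
A = [[1, 2], [3, 4]]
P = [[1, 1], [0, 1]]          # invertible, since det P = 1

# B = P^{-1} A P is similar to A by definition.
B = matmul(inv2(P), matmul(A, P))

trace_A = A[0][0] + A[1][1]
trace_B = B[0][0] + B[1][1]

print(B)                       # the matrix similar to A
print(det2(A) == det2(B))      # same determinant
print(trace_A == trace_B)      # same trace, hence same characteristic polynomial
```

Any other invertible `P` can be substituted; the two comparisons should still succeed, whereas the entries of `B` will change.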

Diagonalization

Definition. An $n \times n$ matrix $A$ is diagonalizable if there is a diagonal matrix $D$ such that $A$ is similar to $D$, that is, if there is an invertible matrix $P$ such that $P^{-1}AP = D$.

Note that the eigenvalues of $D$ are its diagonal elements, and these are the same eigenvalues as for $A$.

Theorem 4.23. Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors. More precisely, there exist an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP = D$ if and only if the columns of $P$ are $n$ linearly independent eigenvectors of $A$ and the diagonal entries of $D$ are the eigenvalues of $A$ corresponding to the eigenvectors in $P$, in the same order.

Theorem 4.25. If $A$ is an $n \times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable. (This holds since eigenvectors corresponding to distinct eigenvalues are linearly independent by Theorem 4.20.)

Theorem 4.24. Let $A$ be an $n \times n$ matrix and let $\lambda_1, \lambda_2, \dots, \lambda_k$ be distinct eigenvalues of $A$. If $B_i$ is a basis for the eigenspace $E_{\lambda_i}$, then $B = B_1 \cup B_2 \cup \dots \cup B_k$ (i.e., the total collection of basis vectors for all of the eigenspaces) is linearly independent.

Lemma 4.26. If $A$ is an $n \times n$ matrix, then the geometric multiplicity of each eigenvalue is less than or equal to its algebraic multiplicity.

Theorem 4.27 (The Diagonalization Theorem). Let $A$ be an $n \times n$ matrix whose distinct eigenvalues are $\lambda_1, \lambda_2, \dots, \lambda_k$. The following statements are equivalent:
a. $A$ is diagonalizable.
b. The union $B$ of the bases of the eigenspaces of $A$ (as in Theorem 4.24) contains $n$ vectors (which is equivalent to $\sum_{i=1}^{k} \dim E_{\lambda_i} = n$).
c. The algebraic multiplicity of each eigenvalue equals its geometric multiplicity, and all eigenvalues are real numbers (this condition is missing in the textbook!).

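In these theorems the eigenvalues are supposed to be real numbers, although for real matrices there may be some complex roots of the characteristic polynomial. (In fact, these theorems remain valid for vector spaces and matrices over $\mathbb{C}$; then, of course, one does not need the condition that the eigenvalues be all real.)

Theorem 4.27 and Theorem 4.23 actually give a method to decide whether $A$ is diagonalizable and, if so, to find $P$ such that $P^{-1}AP$ is diagonal: the columns of $P$ are the vectors of bases of the eigenspaces.

Example. For
\[ A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 1 \end{bmatrix} \]
the characteristic polynomial is
\[ \det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 2 & 2 \\ 2 & 1-\lambda & 2 \\ 2 & 2 & 1-\lambda \end{vmatrix} = (1-\lambda)^3 + 8 + 8 - 4(1-\lambda) - 4(1-\lambda) - 4(1-\lambda) = \dots = -(\lambda - 5)(\lambda + 1)^2. \]
Thus, the eigenvalues are $5$ and $-1$.

Eigenspace $E_{-1}$: $(A - (-1)I)\vec{x} = \vec{0}$;
\[ \begin{bmatrix} 2 & 2 & 2 \\ 2 & 2 & 2 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}; \qquad x_1 = -x_2 - x_3, \]
where $x_2$, $x_3$ are free variables;
\[ E_{-1} = \left\{ \begin{bmatrix} -s-t \\ s \\ t \end{bmatrix} \;\middle|\; s, t \in \mathbb{R} \right\}; \qquad \text{a basis of } E_{-1}: \ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}. \]

Eigenspace $E_5$: $(A - 5I)\vec{x} = \vec{0}$;
\[ \begin{bmatrix} -4 & 2 & 2 \\ 2 & -4 & 2 \\ 2 & 2 & -4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}; \]
solving this system gives $x_1 = x_2 = x_3$, where $x_3$ is a free variable;
\[ E_5 = \left\{ \begin{bmatrix} t \\ t \\ t \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\}; \qquad \text{a basis of } E_5: \ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}. \]

Together the dimensions add up to $3$, so $B_5 \cup B_{-1}$ is a basis of $\mathbb{R}^3$, and therefore $A$ is diagonalizable. Let
\[ P = \begin{bmatrix} 1 & -1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}; \qquad \text{then} \quad P^{-1}AP = \begin{bmatrix} 5 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}. \]

(Note that if we arrange the eigenvectors in a different order, then the eigenvalues on the diagonal must be arranged accordingly: let
\[ Q = \begin{bmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}; \qquad \text{then} \quad Q^{-1}AQ = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 5 \end{bmatrix}.) \]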
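The diagonalization in the example above can be verified without computing $P^{-1}$: the relation $P^{-1}AP = D$ is equivalent to $AP = PD$ together with $\det P \neq 0$. The following is a minimal Python sketch of that check, assuming $A$ is the $3 \times 3$ matrix with $1$ on the diagonal and $2$ elsewhere, the columns of $P$ are the eigenvectors $(1,1,1)$, $(-1,1,0)$, $(-1,0,1)$, and $D = \operatorname{diag}(5,-1,-1)$. The helper names `matmul` and `det3` are made up for this sketch.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


A = [[1, 2, 2],
     [2, 1, 2],
     [2, 2, 1]]

# Columns of P: a basis vector of E_5 followed by two basis vectors of E_{-1}.
P = [[1, -1, -1],
     [1,  1,  0],
     [1,  0,  1]]

# Eigenvalues in the same order as the columns of P.
D = [[5,  0,  0],
     [0, -1,  0],
     [0,  0, -1]]

AP = matmul(A, P)
PD = matmul(P, D)

print(det3(P))     # nonzero, so P is invertible
print(AP == PD)    # True exactly when each column of P is an eigenvector
```

Reordering the columns of `P` requires reordering the diagonal of `D` in the same way; otherwise the comparison `AP == PD` fails.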

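Example. For the upper triangular matrix
\[ A = \begin{bmatrix} 3 & 20 & 29 \\ 0 & \text{[…]} & 82 \\ 0 & 0 & 7 \end{bmatrix} \]
the eigenvalues are the diagonal entries $3$, […], and $7$. Since they are distinct, the matrix is diagonalizable. (To find a matrix $P$ such that
\[ P^{-1}AP = \begin{bmatrix} 3 & 0 & 0 \\ 0 & \text{[…]} & 0 \\ 0 & 0 & 7 \end{bmatrix}, \]
one still needs to solve the linear systems $(A - \lambda I)\vec{x} = \vec{0}$ for each eigenvalue $\lambda$.)

Example. For
\[ A = \begin{bmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{bmatrix} \]
the only eigenvalue is $3$, of algebraic multiplicity $3$. Eigenspace $E_3$:
\[ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \vec{x} = \vec{0}; \]
the matrix has rank $2$, so $\dim E_3 = 3 - 2 = 1$. The geometric multiplicity is less than the algebraic multiplicity, so $A$ is not diagonalizable.

Example. Use diagonalization to find $A^{100}$ for
\[ A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}. \]
The eigenvalues are $-1$ and $3$.

Eigenspace $E_3$: $\begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix} \vec{x} = \vec{0}$; $x_1 = x_2$; basis $\left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}$.

Eigenspace $E_{-1}$: $\begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix} \vec{x} = \vec{0}$; $x_1 = -x_2$; basis $\left\{ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$.

Let $P = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$; then $P^{-1}AP = D = \begin{bmatrix} -1 & 0 \\ 0 & 3 \end{bmatrix}$. Now $A = PDP^{-1}$, so the inner factors $P^{-1}P$ cancel in the product:
\[ A^{100} = (PDP^{-1})^{100} = PDP^{-1}\,PDP^{-1} \cdots PDP^{-1} = PD^{100}P^{-1}. \]
Since $P^{-1} = \begin{bmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \end{bmatrix}$ and $D^{100} = \begin{bmatrix} (-1)^{100} & 0 \\ 0 & 3^{100} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 3^{100} \end{bmatrix}$, we get
\[ A^{100} = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 3^{100} \end{bmatrix} \begin{bmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \end{bmatrix} = \begin{bmatrix} 1 & 3^{100} \\ -1 & 3^{100} \end{bmatrix} \begin{bmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 3^{100} + 1 & 3^{100} - 1 \\ 3^{100} - 1 & 3^{100} + 1 \end{bmatrix}. \]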
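The closed form obtained by diagonalization can be compared with a direct computation of the power. The following is a minimal Python sketch, assuming $A$ is the $2 \times 2$ matrix with rows $(1, 2)$ and $(2, 1)$ and that the diagonalization yields $A^{100} = \tfrac{1}{2}\begin{bmatrix} 3^{100}+1 & 3^{100}-1 \\ 3^{100}-1 & 3^{100}+1 \end{bmatrix}$. The helper names `matmul` and `mat_pow` are made up for this sketch; Python integers have arbitrary precision, so the comparison is exact.

```python
def matmul(X, Y):
    """Product of two 2x2 integer matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_pow(M, n):
    """M raised to the n-th power (n >= 0) by repeated squaring."""
    result = [[1, 0], [0, 1]]      # identity matrix
    base = M
    while n > 0:
        if n & 1:                  # current binary digit of n is 1
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result


A = [[1, 2], [2, 1]]

# Closed form from A^100 = P D^100 P^{-1}; 3**100 is odd, so both
# 3**100 + 1 and 3**100 - 1 are even and the divisions are exact.
t = 3 ** 100
closed_form = [[(t + 1) // 2, (t - 1) // 2],
               [(t - 1) // 2, (t + 1) // 2]]

direct = mat_pow(A, 100)

print(direct == closed_form)
```

The diagonalization route needs only the scalar power $3^{100}$, whereas the direct route multiplies matrices repeatedly; that difference is the point of the example.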

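Orthogonality in $\mathbb{R}^n$

We introduce the dot product of vectors in $\mathbb{R}^n$ by setting $\vec{u} \cdot \vec{v} = \vec{u}^T \vec{v}$; that is, if
\[ \vec{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix} \quad \text{and} \quad \vec{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}, \]
then
\[ \vec{u} \cdot \vec{v} = \vec{u}^T \vec{v} = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + u_2 v_2 + \dots + u_n v_n. \]

The dot product is frequently called the scalar product or inner product; we shall use the latter term in a slightly more general context.

Notice the following properties of the dot product, which can be easily checked directly or follow immediately from the properties of matrix multiplication. They hold for arbitrary vectors $\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^n$ and an arbitrary scalar $\lambda$.

$\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}$ (commutativity).

$\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$

$\vec{u} \cdot (\lambda \vec{v}) = \lambda (\vec{u} \cdot \vec{v})$

(The last two properties are referred to as linearity of the dot product.)

$\vec{u} \cdot \vec{u} = u_1^2 + \dots + u_n^2$ and therefore $\vec{u} \cdot \vec{u} \geq 0$. Moreover, if $\vec{u} \cdot \vec{u} = 0$ then $\vec{u} = \vec{0}$.

We define the length (or norm) $\|\vec{v}\|$ of a vector $\vec{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}$ by
\[ \|\vec{v}\| = \sqrt{\vec{v} \cdot \vec{v}} = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}. \]

Orthogonal and Orthonormal Sets of Vectors

A set of vectors $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$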

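in $\mathbb{R}^n$ is called an orthogonal set if all pairs of distinct vectors in the set are orthogonal, that is, if
\[ \vec{v}_i \cdot \vec{v}_j = 0 \quad \text{whenever } i \neq j, \text{ for } i, j = 1, 2, \dots, k. \]

The standard basis $\vec{e}_1, \vec{e}_2, \dots, \vec{e}_n$ in $\mathbb{R}^n$ is an orthogonal set, as is any subset of it. As the first example illustrates, there are many other possibilities.

Example 5.1. Show that $\{\vec{v}_1, \vec{v}_2, \vec{v}_3\}$ is an orthogonal set in $\mathbb{R}^3$ if
\[ \vec{v}_1 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \vec{v}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \vec{v}_3 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}. \]

Solution. We must show that every pair of vectors from this set is orthogonal. This is true, since
\[ \vec{v}_1 \cdot \vec{v}_2 = 2(0) + 1(1) + (-1)(1) = 0, \]
\[ \vec{v}_2 \cdot \vec{v}_3 = 0(1) + 1(-1) + (1)(1) = 0, \]
\[ \vec{v}_1 \cdot \vec{v}_3 = 2(1) + 1(-1) + (-1)(1) = 0. \]

Theorem 5.1. If $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$, then these vectors are linearly independent.

Proof. If $c_1, c_2, \dots, c_k$ are scalars such that $c_1 \vec{v}_1 + c_2 \vec{v}_2 + \dots + c_k \vec{v}_k = \vec{0}$, then
\[ (c_1 \vec{v}_1 + c_2 \vec{v}_2 + \dots + c_k \vec{v}_k) \cdot \vec{v}_i = \vec{0} \cdot \vec{v}_i = 0 \]
or, equivalently,
\[ c_1 (\vec{v}_1 \cdot \vec{v}_i) + \dots + c_i (\vec{v}_i \cdot \vec{v}_i) + \dots + c_k (\vec{v}_k \cdot \vec{v}_i) = 0. \qquad (1) \]
Since $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ is an orthogonal set, all of the dot products in equation (1) are zero, except $\vec{v}_i \cdot \vec{v}_i$. Thus, equation (1) reduces to
\[ c_i (\vec{v}_i \cdot \vec{v}_i) = 0. \]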

Now, $\vec{v}_i \cdot \vec{v}_i \neq 0$ because $\vec{v}_i \neq \vec{0}$ by hypothesis. So we must have $c_i = 0$. The fact that this is true for all $i = 1, \dots, k$ implies that $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ is a linearly independent set.

Remark. By Theorem 5.1, a set of nonzero vectors that is orthogonal is automatically linearly independent. For example, we can immediately deduce that the three vectors in Example 5.1 are linearly independent. Contrast this approach with the work needed to establish their linear independence directly!

An orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ is a basis of $W$ that is an orthogonal set.

Example 5.2. The vectors
\[ \vec{v}_1 = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \vec{v}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \vec{v}_3 = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \]
from Example 5.1 are orthogonal and, hence, linearly independent. Since any three linearly independent vectors in $\mathbb{R}^3$ form a basis for $\mathbb{R}^3$, by the Fundamental Theorem of Invertible Matrices, it follows that $\vec{v}_1, \vec{v}_2, \vec{v}_3$ is an orthogonal basis for $\mathbb{R}^3$.

Theorem 5.2. Let $\{\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k\}$ be an orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ and let $\vec{w}$ be any vector in $W$. Then the unique scalars $c_1, c_2, \dots, c_k$ such that
\[ \vec{w} = c_1 \vec{v}_1 + c_2 \vec{v}_2 + \dots + c_k \vec{v}_k \]
are given by
\[ c_i = \frac{\vec{w} \cdot \vec{v}_i}{\vec{v}_i \cdot \vec{v}_i} \quad \text{for } i = 1, \dots, k. \]

Proof. Since $\vec{v}_1, \vec{v}_2, \dots, \vec{v}_k$ is a basis for $W$, we know that there are unique scalars $c_1, c_2, \dots, c_k$ such that $\vec{w} = c_1 \vec{v}_1 + c_2 \vec{v}_2 + \dots + c_k \vec{v}_k$ (from Theorem 3.29). To establish the formula for $c_i$, we take the dot product of this linear combination with $\vec{v}_i$ to obtain
\[ \vec{w} \cdot \vec{v}_i = (c_1 \vec{v}_1 + c_2 \vec{v}_2 + \dots + c_k \vec{v}_k) \cdot \vec{v}_i = c_1 (\vec{v}_1 \cdot \vec{v}_i) + \dots + c_i (\vec{v}_i \cdot \vec{v}_i) + \dots + c_k (\vec{v}_k \cdot \vec{v}_i) \]

$= c_i (\vec{v}_i \cdot \vec{v}_i)$, since $\vec{v}_j \cdot \vec{v}_i = 0$ for $j \neq i$. Since $\vec{v}_i \neq \vec{0}$, we have $\vec{v}_i \cdot \vec{v}_i \neq 0$. Dividing by $\vec{v}_i \cdot \vec{v}_i$, we obtain the desired result.

A unit vector is a vector of unit length. Notice that if $\vec{v} \neq \vec{0}$ then $\vec{u} = \dfrac{\vec{v}}{\|\vec{v}\|}$ is a unit vector collinear with $\vec{v}$ (directed along the same line): $\vec{v} = \|\vec{v}\| \, \vec{u}$.

A set of vectors in $\mathbb{R}^n$ is an orthonormal set if it is an orthogonal set of unit vectors. An orthonormal basis for a subspace $W$ of $\mathbb{R}^n$ is a basis of $W$ that is an orthonormal set.

Theorem 5.3. Let $\{\vec{q}_1, \vec{q}_2, \dots, \vec{q}_k\}$ be an orthonormal basis for a subspace $W$ of $\mathbb{R}^n$ and let $\vec{w}$ be any vector in $W$. Then
\[ \vec{w} = (\vec{w} \cdot \vec{q}_1) \vec{q}_1 + (\vec{w} \cdot \vec{q}_2) \vec{q}_2 + \dots + (\vec{w} \cdot \vec{q}_k) \vec{q}_k \]
and this representation is unique.

Theorem 5.4. The columns of an $m \times n$ matrix $Q$ form an orthonormal set if and only if $Q^T Q = I_n$.

Proof. We need to show that
\[ (Q^T Q)_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases} \]
Let $\vec{q}_i$ denote the $i$th column of $Q$ (and, hence, the $i$th row of $Q^T$). Since the $(i, j)$ entry of $Q^T Q$ is the dot product of the $i$th row of $Q^T$ and the $j$th column of $Q$, it follows that
\[ (Q^T Q)_{ij} = \vec{q}_i \cdot \vec{q}_j \qquad (2) \]
by the definition of matrix multiplication. Now the columns of $Q$ form an orthonormal set if and only if
\[ \vec{q}_i \cdot \vec{q}_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j, \end{cases} \]
which, by equation (2), holds if and only if
\[ (Q^T Q)_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j. \end{cases} \]

This completes the proof.

If the matrix $Q$ in Theorem 5.4 is a square matrix, it has a special name. An $n \times n$ matrix $Q$ whose columns form an orthonormal set is called an orthogonal matrix. The most important fact about orthogonal matrices is given by the next theorem.

Theorem 5.5. A square matrix $Q$ is orthogonal if and only if $Q^{-1} = Q^T$.

Proof. By Theorem 5.4, $Q$ is orthogonal if and only if $Q^T Q = I$. This is true if and only if $Q$ is invertible and $Q^{-1} = Q^T$, by Theorem 3.3.

Example. Each of the following matrices is orthogonal:
\[ \text{[…]}, \quad \text{[…]}, \quad \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}, \quad \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}. \]

Theorem 5.6. Let $Q$ be an $n \times n$ matrix. The following statements are equivalent:
a. $Q$ is orthogonal.
b. $\|Q\vec{x}\| = \|\vec{x}\|$ for every $\vec{x}$ in $\mathbb{R}^n$.
c. $Q\vec{x} \cdot Q\vec{y} = \vec{x} \cdot \vec{y}$ for every $\vec{x}$ and $\vec{y}$ in $\mathbb{R}^n$.

Theorem 5.7. If $Q$ is an orthogonal matrix, then its rows form an orthonormal set.

Theorem 5.8. Let $Q$ be an orthogonal matrix.
a. $Q^{-1}$ is orthogonal.
b. $\det Q = \pm 1$.
c. If $\lambda$ is an eigenvalue of $Q$, then $|\lambda| = 1$.
d. If $Q_1$ and $Q_2$ are orthogonal $n \times n$ matrices, then so is $Q_1 Q_2$.
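The statements of this section can be illustrated numerically. The following is a minimal Python sketch, assuming the vectors of Example 5.1 are $(2, 1, -1)$, $(0, 1, 1)$ and $(1, -1, 1)$; the test vector `w` is an arbitrary illustrative choice and the helper name `dot` is made up for this sketch. It checks pairwise orthogonality, computes the coordinates of `w` by the formula of Theorem 5.2 in exact arithmetic, and then checks the criterion $Q^T Q = I$ of Theorem 5.4 for the matrix $Q$ whose columns are the normalized vectors (in floating point, so only up to rounding).

```python
from fractions import Fraction
import math


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


# Vectors assumed from Example 5.1.
basis = [(2, 1, -1), (0, 1, 1), (1, -1, 1)]

# Orthogonal set: every pair of distinct vectors has dot product 0.
pairwise_orthogonal = all(dot(basis[i], basis[j]) == 0
                          for i in range(3) for j in range(i + 1, 3))

# Theorem 5.2: c_i = (w . v_i) / (v_i . v_i), computed exactly.
w = (1, 2, 3)                       # illustrative vector (an assumption)
coeffs = [Fraction(dot(w, v), dot(v, v)) for v in basis]

# Rebuild w from its coordinates as a consistency check.
rebuilt = tuple(sum(c * v[k] for c, v in zip(coeffs, basis))
                for k in range(3))

# Normalize each vector; these are the columns q_1, q_2, q_3 of Q.
cols = [[x / math.sqrt(dot(v, v)) for x in v] for v in basis]

# Entry (i, j) of Q^T Q is q_i . q_j; compare with the identity matrix.
gram = [[dot(cols[i], cols[j]) for j in range(3)] for i in range(3)]
is_orthogonal = all(
    math.isclose(gram[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(3) for j in range(3))

print(pairwise_orthogonal)
print(coeffs)
print(rebuilt == w)
print(is_orthogonal)
```

No linear system is solved to find the coordinates: each one comes from two dot products, which is the practical advantage of an orthogonal basis.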