Section 6.1 - Inner Products and Norms

Definition. Let $V$ be a vector space over $F \in \{\mathbb{R}, \mathbb{C}\}$. An inner product on $V$ is a function that assigns, to every ordered pair of vectors $x$ and $y$ in $V$, a scalar in $F$, denoted $\langle x, y \rangle$, such that for all $x$, $y$, and $z$ in $V$ and all $c \in F$, the following hold:
1. $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$
2. $\langle cx, y \rangle = c \langle x, y \rangle$
3. $\langle x, y \rangle = \overline{\langle y, x \rangle}$, where the bar denotes complex conjugation.
4. $\langle x, x \rangle > 0$ if $x \neq 0$.

Note that if $z$ is a complex number, then the statement $z \geq 0$ means that $z$ is real and non-negative. Notice that if $F = \mathbb{R}$, (3) is just $\langle x, y \rangle = \langle y, x \rangle$.

Definition. Let $A \in M_{m \times n}(F)$. We define the conjugate transpose or adjoint of $A$ to be the $n \times m$ matrix $A^*$ such that $(A^*)_{i,j} = \overline{(A)_{j,i}}$ for all $i, j$.

Theorem 6.1. Let $V$ be an inner product space. Then for $x, y, z \in V$ and $c \in F$,
1. $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$
2. $\langle x, cy \rangle = \bar{c} \langle x, y \rangle$
3. $\langle x, 0 \rangle = \langle 0, x \rangle = 0$
4. $\langle x, x \rangle = 0$ if and only if $x = 0$
5. If $\langle x, y \rangle = \langle x, z \rangle$ for all $x \in V$, then $y = z$.

Proof. (Also see handout by Dan Hadley.)
1. $\langle x, y + z \rangle = \overline{\langle y + z, x \rangle} = \overline{\langle y, x \rangle + \langle z, x \rangle} = \overline{\langle y, x \rangle} + \overline{\langle z, x \rangle} = \langle x, y \rangle + \langle x, z \rangle$
2. $\langle x, cy \rangle = \overline{\langle cy, x \rangle} = \overline{c \langle y, x \rangle} = \bar{c}\,\overline{\langle y, x \rangle} = \bar{c} \langle x, y \rangle$
3. $\langle x, 0 \rangle = \langle x, 0x \rangle = \bar{0}\langle x, x \rangle = 0$, and $\langle 0, x \rangle = \langle 0x, x \rangle = 0\langle x, x \rangle = 0$.
4. If $x = 0$ then by (3) of this theorem, $\langle x, x \rangle = 0$. If $\langle x, x \rangle = 0$ then by (4) of the definition, it must be that $x = 0$.
5. $\langle x, y \rangle - \langle x, z \rangle = 0$ for all $x \in V$, so $\langle x, y - z \rangle = 0$ for all $x \in V$. Thus $\langle y - z, y - z \rangle = 0$ and we have $y - z = 0$ by (4). So $y = z$.

Definition. Let $V$ be an inner product space. For $x \in V$, we define the norm or length of $x$ by $\|x\| = \sqrt{\langle x, x \rangle}$.

Theorem 6.2. Let $V$ be an inner product space over $F$. Then for all $x, y \in V$ and $c \in F$, the following are true.
1. $\|cx\| = |c|\,\|x\|$.
2. $\|x\| = 0$ if and only if $x = 0$. In any case, $\|x\| \geq 0$.
3. (Cauchy-Schwarz Inequality) $|\langle x, y \rangle| \leq \|x\|\,\|y\|$.
4. (Triangle Inequality) $\|x + y\| \leq \|x\| + \|y\|$.
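As a quick numerical illustration (ours, not part of the notes), the following Python/NumPy sketch checks the axioms and Theorem 6.1 for the standard inner product $\langle x, y \rangle = \sum_i x_i \bar{y}_i$ on $\mathbb{C}^n$; the helper name `ip` is introduced here for the check.

```python
import numpy as np

def ip(x, y):
    """Standard inner product <x, y> = sum_i x_i * conj(y_i) on C^n
    (linear in the first slot, conjugate-linear in the second,
    matching the convention used in these notes)."""
    return np.sum(x * np.conj(y))

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2 - 1j, 1j])
c = 2 + 3j

# Axiom 2 and Theorem 6.1(2): linear in first slot, conjugate-linear in second.
assert np.isclose(ip(c * x, y), c * ip(x, y))
assert np.isclose(ip(x, c * y), np.conj(c) * ip(x, y))
# Axiom 3: conjugate symmetry.
assert np.isclose(ip(x, y), np.conj(ip(y, x)))
# Norm: ||x|| = sqrt(<x, x>) is real and non-negative.
assert np.isclose(np.sqrt(ip(x, x).real), np.linalg.norm(x))
```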
Proof. (Also see notes by Chris Lynd.)
1. $\|cx\|^2 = \langle cx, cx \rangle = c\bar{c}\langle x, x \rangle = |c|^2\|x\|^2$.
2. $\|x\| = 0$ iff $\sqrt{\langle x, x \rangle} = 0$ iff $\langle x, x \rangle = 0$ iff $x = 0$. If $x \neq 0$ then $\langle x, x \rangle > 0$ and so $\sqrt{\langle x, x \rangle} > 0$.
3. If $y = 0$ the result is true. So assume $y \neq 0$. Finished in class.
4. Done in class.

Definition. Let $V$ be an inner product space. Vectors $x, y \in V$ are orthogonal (or perpendicular) if $\langle x, y \rangle = 0$. A subset $S \subseteq V$ is called orthogonal if $\langle x, y \rangle = 0$ for all distinct $x, y \in S$. A vector $x \in V$ is a unit vector if $\|x\| = 1$, and a subset $S \subseteq V$ is orthonormal if $S$ is orthogonal and $\|x\| = 1$ for all $x \in S$.

Section 6.2

Definition. Let $V$ be an inner product space. Then $S \subseteq V$ is an orthonormal basis of $V$ if it is an ordered basis and orthonormal.

Theorem 6.3. Let $V$ be an inner product space and $S = \{v_1, v_2, \ldots, v_k\}$ be an orthogonal subset of $V$ such that $v_i \neq 0$ for all $i$. If $y \in \mathrm{Span}(S)$, then
$$y = \sum_{i=1}^{k} \frac{\langle y, v_i \rangle}{\|v_i\|^2}\, v_i.$$

Proof. Let $y = \sum_{i=1}^{k} a_i v_i$. Then
$$\langle y, v_j \rangle = \Big\langle \sum_{i=1}^{k} a_i v_i, v_j \Big\rangle = \sum_{i=1}^{k} a_i \langle v_i, v_j \rangle = a_j \langle v_j, v_j \rangle = a_j \|v_j\|^2.$$
So $a_j = \dfrac{\langle y, v_j \rangle}{\|v_j\|^2}$.

Corollary 1. If $S$ also is orthonormal then $y = \sum_{i=1}^{k} \langle y, v_i \rangle v_i$.

Corollary 2. If $S$ also is orthogonal and all vectors in $S$ are non-zero then $S$ is linearly independent.

Proof. Suppose $\sum_{i=1}^{k} a_i v_i = 0$. Then for all $j$, $a_j = \dfrac{\langle 0, v_j \rangle}{\|v_j\|^2} = 0$. So $S$ is linearly independent.

Theorem 6.4. (Gram-Schmidt) Let $V$ be an inner product space and $S = \{w_1, w_2, \ldots, w_n\} \subseteq V$ be a linearly independent set. Define $S' = \{v_1, v_2, \ldots, v_n\}$ where $v_1 = w_1$ and, for $2 \leq k \leq n$,
$$v_k = w_k - \sum_{i=1}^{k-1} \frac{\langle w_k, v_i \rangle}{\|v_i\|^2}\, v_i.$$
Then $S'$ is an orthogonal set of non-zero vectors and $\mathrm{Span}(S') = \mathrm{Span}(S)$.
Proof. Base case: $n = 1$. $S = \{w_1\}$, $S' = \{v_1\}$ with $v_1 = w_1$, and the claim is immediate. Now let $n > 1$ and $S_n = \{w_1, w_2, \ldots, w_n\}$. For $S_{n-1} = \{w_1, w_2, \ldots, w_{n-1}\}$ and $S'_{n-1} = \{v_1, v_2, \ldots, v_{n-1}\}$, we have by induction that $\mathrm{span}(S'_{n-1}) = \mathrm{span}(S_{n-1})$ and $S'_{n-1}$ is orthogonal. To show $S'_n$ is orthogonal, we just have to show that $\langle v_n, v_j \rangle = 0$ for all $j \in [n-1]$. Since
$$v_n = w_n - \sum_{k=1}^{n-1} \frac{\langle w_n, v_k \rangle}{\|v_k\|^2}\, v_k,$$
we have
$$\langle v_n, v_j \rangle = \langle w_n, v_j \rangle - \sum_{k=1}^{n-1} \frac{\langle w_n, v_k \rangle}{\|v_k\|^2} \langle v_k, v_j \rangle = \langle w_n, v_j \rangle - \frac{\langle w_n, v_j \rangle}{\|v_j\|^2} \langle v_j, v_j \rangle = 0.$$
(Note also that $v_n \neq 0$: if $v_n = 0$, then $w_n \in \mathrm{span}(S'_{n-1}) = \mathrm{span}(S_{n-1})$, contradicting the linear independence of $S_n$.)

We now show $\mathrm{span}(S_n) = \mathrm{span}(S'_n)$. We know $\dim(\mathrm{span}(S_n)) = n$, since $S_n$ is linearly independent. We know $\dim(\mathrm{span}(S'_n)) = n$, by Corollary 2 to Theorem 6.3. We know $\mathrm{span}(S_{n-1}) = \mathrm{span}(S'_{n-1})$. So we just need to show $v_n \in \mathrm{span}(S_n)$. Now
$$v_n = w_n - \sum_{j=1}^{n-1} a_j v_j$$
for some constants $a_1, a_2, \ldots, a_{n-1}$. For all $j \in [n-1]$, $v_j \in \mathrm{span}(S_{n-1})$ since $\mathrm{span}(S'_{n-1}) = \mathrm{span}(S_{n-1})$ and $v_j \in S'_{n-1}$. Therefore $v_n \in \mathrm{span}(S_n)$.

Theorem 6.5. Let $V$ be a non-zero finite-dimensional inner product space. Then $V$ has an orthonormal basis $\beta$. Furthermore, if $\beta = \{v_1, v_2, \ldots, v_n\}$ and $x \in V$, then
$$x = \sum_{i=1}^{n} \langle x, v_i \rangle v_i.$$

Proof. Start with a basis of $V$. Apply Gram-Schmidt (Theorem 6.4) to get an orthogonal set $\beta'$. Produce $\beta$ from $\beta'$ by normalizing $\beta'$; that is, multiply each $x \in \beta'$ by $1/\|x\|$. By Corollary 2 to Theorem 6.3, $\beta$ is linearly independent, and since it has $n$ vectors, it must be a basis of $V$. By Corollary 1 to Theorem 6.3, if $x \in V$, then $x = \sum_{i=1}^{n} \langle x, v_i \rangle v_i$.

Corollary 1. Let $V$ be a finite-dimensional inner product space with an orthonormal basis $\beta = \{v_1, v_2, \ldots, v_n\}$. Let $T$ be a linear operator on $V$, and $A = [T]_\beta$. Then for any $i, j$, $(A)_{i,j} = \langle T(v_j), v_i \rangle$.

Proof. By Theorem 6.5, for $x \in V$,
$$x = \sum_{i=1}^{n} \langle x, v_i \rangle v_i.$$
So
$$T(v_j) = \sum_{i=1}^{n} \langle T(v_j), v_i \rangle v_i.$$
Hence
$$[T(v_j)]_\beta = (\langle T(v_j), v_1 \rangle, \langle T(v_j), v_2 \rangle, \ldots, \langle T(v_j), v_n \rangle)^t.$$
So $(A)_{i,j} = \langle T(v_j), v_i \rangle$.
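To make the recursion of Theorem 6.4 concrete, here is a minimal NumPy sketch (an illustration of ours, not part of the notes); `gram_schmidt` is a hypothetical helper that applies the theorem's formula to the rows of a matrix and returns the orthogonal, unnormalized $v_k$.

```python
import numpy as np

def gram_schmidt(W):
    """Classical Gram-Schmidt as in Theorem 6.4: the rows of W are the
    linearly independent w_1, ..., w_n; returns orthogonal v_1, ..., v_n
    (not yet normalized, matching the theorem's statement)."""
    V = []
    for w in W:
        v = w.astype(complex)
        for u in V:
            # subtract the projection <w, u>/||u||^2 * u
            v -= (np.vdot(u, w) / np.vdot(u, u)) * u
        V.append(v)
    return np.array(V)

W = np.array([[1.0, 1, 0], [1, 0, 1], [0, 1, 1]])
V = gram_schmidt(W)
# the Gram matrix is diagonal: the v_i are pairwise orthogonal
print(np.round(V @ V.conj().T, 10))
```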
Definition. Let $S$ be a non-empty subset of an inner product space $V$. We define $S^\perp$ (read "$S$ perp") to be the set of all vectors in $V$ that are orthogonal to every vector in $S$; that is, $S^\perp = \{x \in V : \langle x, y \rangle = 0 \text{ for all } y \in S\}$. $S^\perp$ is called the orthogonal complement of $S$.

Theorem 6.6. Let $W$ be a finite-dimensional subspace of an inner product space $V$, and let $y \in V$. Then there exist unique vectors $u \in W$, $z \in W^\perp$ such that $y = u + z$. Furthermore, if $\{v_1, v_2, \ldots, v_k\}$ is an orthonormal basis for $W$, then
$$u = \sum_{i=1}^{k} \langle y, v_i \rangle v_i.$$

Proof. Let $y \in V$. Let $\{v_1, v_2, \ldots, v_k\}$ be an orthonormal basis for $W$; this exists by Theorem 6.5. Let $u = \sum_{i=1}^{k} \langle y, v_i \rangle v_i$. We will show $y - u \in W^\perp$. It suffices to show that $\langle y - u, v_i \rangle = 0$ for all $i \in [k]$:
$$\langle y - u, v_i \rangle = \Big\langle y - \sum_{j=1}^{k} \langle y, v_j \rangle v_j, v_i \Big\rangle = \langle y, v_i \rangle - \sum_{j=1}^{k} \langle y, v_j \rangle \langle v_j, v_i \rangle = \langle y, v_i \rangle - \langle y, v_i \rangle = 0.$$
Thus $z = y - u \in W^\perp$.

Next, suppose $x \in W \cap W^\perp$. Then $x \in W$ and $x \in W^\perp$. So $\langle x, x \rangle = 0$, and we have that $x = 0$.

Now suppose $r \in W$ and $s \in W^\perp$ are such that $y = r + s$. Then $r + s = u + z$, so $r - u = z - s$. This shows that $r - u$ and $z - s$ are both in $W \cap W^\perp = \{0\}$, so it must be that $r - u = 0$ and $z - s = 0$, which implies that $r = u$ and $z = s$, and we see that the representation of $y$ is unique.

Corollary 1. In the notation of Theorem 6.6, the vector $u$ is the unique vector in $W$ that is closest to $y$. That is, for any $x \in W$, $\|y - x\| \geq \|y - u\|$, and we get equality in the previous inequality if and only if $x = u$.

Proof. Let $y \in V$, $u = \sum_{i=1}^{k} \langle y, v_i \rangle v_i$, and $z = y - u \in W^\perp$. Let $x \in W$. Then $u - x \in W$, so $\langle z, u - x \rangle = 0$ since $z \in W^\perp$. By Exercise 6.1, number 10, if $a$ is orthogonal to $b$ then $\|a + b\|^2 = \|a\|^2 + \|b\|^2$. So we have
$$\|y - x\|^2 = \|(u + z) - x\|^2 = \|(u - x) + z\|^2 = \|u - x\|^2 + \|z\|^2 \geq \|z\|^2 = \|y - u\|^2.$$
Now suppose $\|y - x\| = \|y - u\|$. Then $\|u - x\|^2 + \|z\|^2 = \|z\|^2$, and thus $\|u - x\|^2 = 0$, $\|u - x\| = 0$, $u - x = 0$, and so $u = x$.
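The formula for $u$ in Theorem 6.6 is easy to compute with. The sketch below is ours (the helper name `project` and the data are hypothetical): it builds $u$ from an orthonormal basis of $W$ and checks that the residual $z = y - u$ lies in $W^\perp$.

```python
import numpy as np

def project(y, B):
    """Orthogonal projection of y onto W = span of the orthonormal
    rows of B, via u = sum_i <y, v_i> v_i from Theorem 6.6."""
    return sum(np.vdot(v, y) * v for v in B)

# W = span{(1,0,0), (0,1,0)} in R^3; the closest point to y drops
# the third coordinate, and z = y - u is orthogonal to W.
B = np.array([[1.0, 0, 0], [0, 1, 0]])
y = np.array([3.0, 4, 5])
u = project(y, B)
z = y - u
print(u)                     # [3. 4. 0.]
print(np.round(B @ z, 10))   # [0. 0.]  (z is in W-perp)
```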
Theorem 6.7. Suppose that $S = \{v_1, v_2, \ldots, v_k\}$ is an orthonormal set in an $n$-dimensional inner product space $V$. Then
(a) $S$ can be extended to an orthonormal basis $\{v_1, \ldots, v_k, v_{k+1}, \ldots, v_n\}$.
(b) If $W = \mathrm{Span}(S)$, then $S_1 = \{v_{k+1}, v_{k+2}, \ldots, v_n\}$ is an orthonormal basis for $W^\perp$.
(c) If $W$ is any subspace of $V$, then $\dim(V) = \dim(W) + \dim(W^\perp)$.

Proof. (a) Extend $S$ to an ordered basis $S' = \{v_1, v_2, \ldots, v_k, w_{k+1}, w_{k+2}, \ldots, w_n\}$. Apply Gram-Schmidt to $S'$; the first $k$ vectors do not change, and the resulting set spans $V$. Then normalize to obtain $\beta = \{v_1, \ldots, v_k, v_{k+1}, \ldots, v_n\}$.
(b) $S_1 = \{v_{k+1}, v_{k+2}, \ldots, v_n\}$ is an orthonormal set, hence linearly independent. Also, it is a subset of $W^\perp$. It must span $W^\perp$ since if $x \in W^\perp$, then
$$x = \sum_{i=1}^{n} \langle x, v_i \rangle v_i = \sum_{i=k+1}^{n} \langle x, v_i \rangle v_i.$$
(c) $\dim(V) = n = k + (n - k) = \dim(W) + \dim(W^\perp)$.

Section 6.3:

Theorem 6.8. Let $V$ be a finite-dimensional inner product space over $F$, and let $g : V \to F$ be a linear functional. Then there exists a unique vector $y \in V$ such that $g(x) = \langle x, y \rangle$ for all $x \in V$.

Proof. Let $\beta = \{v_1, v_2, \ldots, v_n\}$ be an orthonormal basis for $V$ and let
$$y = \sum_{i=1}^{n} \overline{g(v_i)}\, v_i.$$
Then for $1 \leq j \leq n$,
$$\langle v_j, y \rangle = \Big\langle v_j, \sum_{i=1}^{n} \overline{g(v_i)}\, v_i \Big\rangle = \sum_{i=1}^{n} g(v_i) \langle v_j, v_i \rangle = g(v_j) \langle v_j, v_j \rangle = g(v_j),$$
and since $g$ and $\langle \cdot, y \rangle$ are linear and agree on the basis $\beta$, we have $g(x) = \langle x, y \rangle$ for all $x \in V$. To show $y$ is unique, suppose $g(x) = \langle x, y' \rangle$ for all $x$. Then $\langle x, y \rangle = \langle x, y' \rangle$ for all $x$. By Theorem 6.1(5), we have $y = y'$.

Example. (2b) Let $V = \mathbb{C}^2$, $g(z_1, z_2) = z_1 - 2z_2$. Then $V$ is an inner product space with the standard inner product $\langle (x_1, x_2), (y_1, y_2) \rangle = x_1\bar{y}_1 + x_2\bar{y}_2$, and $g$ is a linear functional on $V$. Find a vector $y \in V$ such that $g(x) = \langle x, y \rangle$ for all $x \in V$.

Sol: We need to find $(y_1, y_2) \in \mathbb{C}^2$ such that
$$g(z_1, z_2) = \langle (z_1, z_2), (y_1, y_2) \rangle \quad \text{for all } (z_1, z_2) \in \mathbb{C}^2.$$
That is:
$$z_1 - 2z_2 = z_1\bar{y}_1 + z_2\bar{y}_2. \tag{1}$$
Using the standard ordered basis $\{(1, 0), (0, 1)\}$ for $\mathbb{C}^2$, the proof of Theorem 6.8 gives $y = \sum_{i=1}^{n} \overline{g(v_i)}\, v_i$. So
$$(y_1, y_2) = \overline{g(1, 0)}\,(1, 0) + \overline{g(0, 1)}\,(0, 1) = \bar{1}\,(1, 0) + \overline{(-2)}\,(0, 1) = (1, 0) - 2(0, 1) = (1, -2).$$
Check (1): LHS $= z_1 - 2z_2$, and for $y_1 = 1$ and $y_2 = -2$, we have RHS $= z_1 - 2z_2$.
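A small NumPy sketch (ours, not from the notes) of the construction in the proof of Theorem 6.8, applied to the example above: the Riesz vector is assembled as $y = \sum_i \overline{g(v_i)}\,v_i$ over the standard orthonormal basis.

```python
import numpy as np

# Riesz vector for g(z1, z2) = z1 - 2*z2 on C^2, following the proof of
# Theorem 6.8: y = sum_i conj(g(v_i)) v_i over an orthonormal basis.
g = lambda z: z[0] - 2 * z[1]
basis = np.eye(2, dtype=complex)           # standard orthonormal basis
y = sum(np.conj(g(v)) * v for v in basis)
print(y)                                   # [ 1.+0.j -2.+0.j]

# check g(x) = <x, y> = sum_i x_i * conj(y_i) on a sample x
x = np.array([1 + 1j, 2 - 3j])
assert np.isclose(g(x), np.sum(x * np.conj(y)))
```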
Theorem 6.9. Let $V$ be a finite-dimensional inner product space and let $T$ be a linear operator on $V$. There exists a unique operator $T^* : V \to V$ such that $\langle T(x), y \rangle = \langle x, T^*(y) \rangle$ for all $x, y \in V$. Furthermore, $T^*$ is linear.

Proof. Let $y \in V$. Define $g : V \to F$ by $g(x) = \langle T(x), y \rangle$ for all $x \in V$.

Claim: $g$ is linear. Indeed, $g(ax + z) = \langle T(ax + z), y \rangle = \langle aT(x) + T(z), y \rangle = a\langle T(x), y \rangle + \langle T(z), y \rangle = ag(x) + g(z)$.

By Theorem 6.8, there is a unique $y' \in V$ such that $g(x) = \langle x, y' \rangle$. So we have $\langle T(x), y \rangle = \langle x, y' \rangle$ for all $x \in V$. We define $T^* : V \to V$ by $T^*(y) = y'$. So
$$\langle T(x), y \rangle = \langle x, T^*(y) \rangle \quad \text{for all } x \in V.$$

Claim: $T^*$ is linear. For $cy + z \in V$, $T^*(cy + z)$ equals the unique $y'$ such that $\langle T(x), cy + z \rangle = \langle x, T^*(cy + z) \rangle$ for all $x$. But
$$\langle T(x), cy + z \rangle = \bar{c}\langle T(x), y \rangle + \langle T(x), z \rangle = \bar{c}\langle x, T^*(y) \rangle + \langle x, T^*(z) \rangle = \langle x, cT^*(y) + T^*(z) \rangle.$$
Since $y'$ is unique, $T^*(cy + z) = cT^*(y) + T^*(z)$.

Claim: $T^*$ is unique. Let $U : V \to V$ be linear such that
$$\langle T(x), y \rangle = \langle x, U(y) \rangle \quad \text{for all } x, y \in V.$$
Then
$$\langle x, T^*(y) \rangle = \langle x, U(y) \rangle \text{ for all } x, y \in V \implies T^*(y) = U(y) \text{ for all } y \in V.$$
So $T^* = U$.

Definition. $T^*$ is called the adjoint of the linear operator $T$ and is defined to be the unique operator on $V$ satisfying $\langle T(x), y \rangle = \langle x, T^*(y) \rangle$ for all $x, y \in V$. For $A \in M_n(F)$, we have the earlier definition of $A^*$, the adjoint of $A$: the conjugate transpose.

Fact. $\langle x, T(y) \rangle = \langle T^*(x), y \rangle$ for all $x, y \in V$.

Proof. $\langle x, T(y) \rangle = \overline{\langle T(y), x \rangle} = \overline{\langle y, T^*(x) \rangle} = \langle T^*(x), y \rangle$.

Theorem 6.10. Let $V$ be a finite-dimensional inner product space and let $\beta$ be an orthonormal basis for $V$. If $T$ is a linear operator on $V$, then $[T^*]_\beta = [T]_\beta^*$.

Proof. Let $A = [T]_\beta$ and $B = [T^*]_\beta$ with $\beta = \{v_1, v_2, \ldots, v_n\}$. By the corollary to Theorem 6.5,
$$(B)_{i,j} = \langle T^*(v_j), v_i \rangle = \overline{\langle v_i, T^*(v_j) \rangle} = \overline{\langle T(v_i), v_j \rangle} = \overline{(A)_{j,i}} = (A^*)_{i,j}.$$
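In matrix form, the defining property in Theorem 6.9 can be checked numerically. The sketch below is ours: it realizes $A^*$ as the conjugate transpose and verifies $\langle Ax, y \rangle = \langle x, A^*y \rangle$ for random complex data.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_star = A.conj().T                        # conjugate transpose = adjoint

# Theorem 6.9's defining property <Ax, y> = <x, A*y>, with
# <u, v> = sum_i u_i conj(v_i) as in these notes.
ip = lambda u, v: np.sum(u * np.conj(v))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(ip(A @ x, y), ip(x, A_star @ y))
```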
Corollary 2. Let $A$ be an $n \times n$ matrix. Then $(L_A)^* = L_{A^*}$.

Proof. Use $\beta$, the standard ordered basis. By Theorem 2.16,
$$[L_A]_\beta = A \tag{2}$$
and
$$[L_{A^*}]_\beta = A^*. \tag{3}$$
By Theorem 6.10, $[(L_A)^*]_\beta = [L_A]_\beta^*$, which equals $A^*$ by (2), and this equals $[L_{A^*}]_\beta$ by (3). Therefore $[(L_A)^*]_\beta = [L_{A^*}]_\beta$, so $(L_A)^* = L_{A^*}$.

Theorem 6.11. Let $V$ be an inner product space, and $T, U$ linear operators on $V$. Then
1. $(T + U)^* = T^* + U^*$.
2. $(cT)^* = \bar{c}T^*$ for all $c \in F$.
3. $(TU)^* = U^*T^*$ (composition).
4. $(T^*)^* = T$.
5. $I^* = I$.

Proof.
1. By definition,
$$\langle (T + U)(x), y \rangle = \langle x, (T + U)^*(y) \rangle,$$
and
$$\langle (T + U)(x), y \rangle = \langle T(x), y \rangle + \langle U(x), y \rangle = \langle x, T^*(y) \rangle + \langle x, U^*(y) \rangle = \langle x, (T^* + U^*)(y) \rangle.$$
And $(T + U)^*$ is unique, so it must equal $T^* + U^*$.
2. By definition,
$$\langle cT(x), y \rangle = \langle x, (cT)^*(y) \rangle,$$
and
$$\langle cT(x), y \rangle = c\langle T(x), y \rangle = c\langle x, T^*(y) \rangle = \langle x, \bar{c}T^*(y) \rangle.$$
3. By definition,
$$\langle TU(x), y \rangle = \langle x, (TU)^*(y) \rangle,$$
and
$$\langle TU(x), y \rangle = \langle T(U(x)), y \rangle = \langle U(x), T^*(y) \rangle = \langle x, U^*(T^*(y)) \rangle = \langle x, (U^*T^*)(y) \rangle.$$
4. $\langle T^*(x), y \rangle = \langle x, (T^*)^*(y) \rangle$ by definition, and $\langle T^*(x), y \rangle = \langle x, T(y) \rangle$ by the Fact. So $(T^*)^* = T$ by uniqueness.
5. $\langle I(x), y \rangle = \langle x, I(y) \rangle = \langle x, y \rangle$ for all $x, y \in V$. Therefore $I^*(y) = y$ for all $y \in V$, and we have $I^* = I$.

Corollary 1. Let $A$ and $B$ be $n \times n$ matrices. Then
1. $(A + B)^* = A^* + B^*$.
2. $(cA)^* = \bar{c}A^*$ for all $c \in F$.
3. $(AB)^* = B^*A^*$.
4. $(A^*)^* = A$.
5. $I^* = I$.

Proof. Use Theorem 6.11 and the Corollary to Theorem 6.10. Or, use the computations below.

Example. (Exercise 5b) Let $A$ and $B$ be $m \times n$ matrices and $C$ an $n \times p$ matrix. Then
1. $(A + B)^* = A^* + B^*$.
2. $(cA)^* = \bar{c}A^*$ for all $c \in F$.
3. $(AC)^* = C^*A^*$.
4. $(A^*)^* = A$.
5. $I^* = I$.

Proof.
1. $((A + B)^*)_{i,j} = \overline{(A + B)_{j,i}} = \overline{(A)_{j,i} + (B)_{j,i}} = \overline{(A)_{j,i}} + \overline{(B)_{j,i}}$, and
$$(A^* + B^*)_{i,j} = (A^*)_{i,j} + (B^*)_{i,j} = \overline{(A)_{j,i}} + \overline{(B)_{j,i}}.$$
2. Let $c \in F$. $((cA)^*)_{i,j} = \overline{(cA)_{j,i}} = \overline{c(A)_{j,i}} = \bar{c}\,\overline{(A)_{j,i}}$, and
$$(\bar{c}A^*)_{i,j} = \bar{c}(A^*)_{i,j} = \bar{c}\,\overline{(A)_{j,i}}.$$
3.
$$((AC)^*)_{i,j} = \overline{(AC)_{j,i}} = \overline{\sum_{k=1}^{n} (A)_{j,k}(C)_{k,i}} = \sum_{k=1}^{n} \overline{(A)_{j,k}}\,\overline{(C)_{k,i}} = \sum_{k=1}^{n} (A^*)_{k,j}(C^*)_{i,k} = \sum_{k=1}^{n} (C^*)_{i,k}(A^*)_{k,j} = (C^*A^*)_{i,j}.$$
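These identities are also easy to sanity-check numerically. The sketch below is ours (with `star` a hypothetical helper for the conjugate transpose); it verifies items 1-4 on random complex matrices.

```python
import numpy as np

# Numerical check of the identities in Exercise 5b, with * realized as
# the conjugate transpose .conj().T.
rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
B = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
C = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
c = 1 - 2j
star = lambda M: M.conj().T

assert np.allclose(star(A + B), star(A) + star(B))    # (A + B)* = A* + B*
assert np.allclose(star(c * A), np.conj(c) * star(A)) # (cA)* = c-bar A*
assert np.allclose(star(A @ C), star(C) @ star(A))    # (AC)* = C* A*
assert np.allclose(star(star(A)), A)                  # (A*)* = A
```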
Fall 2007 - The following was not covered.

For $x, y \in F^n$, let $\langle x, y \rangle_n$ denote the standard inner product of $x$ and $y$ in $F^n$. Recall that if $x$ and $y$ are regarded as column vectors, then $\langle x, y \rangle_n = y^*x$.

Lemma 1. Let $A \in M_{m \times n}(F)$, $x \in F^n$, and $y \in F^m$. Then $\langle Ax, y \rangle_m = \langle x, A^*y \rangle_n$.

Proof. $\langle Ax, y \rangle_m = y^*(Ax) = (y^*A)x = (A^*y)^*x = \langle x, A^*y \rangle_n$.

Lemma 2. Let $A \in M_{m \times n}(F)$. Then $\mathrm{rank}(A^*A) = \mathrm{rank}(A)$.

Proof. $A^*A$ is an $n \times n$ matrix. By the Dimension Theorem, $\mathrm{rank}(A^*A) + \mathrm{nullity}(A^*A) = n$. We also have $\mathrm{rank}(A) + \mathrm{nullity}(A) = n$. We will show that the null space of $A$ equals the null space of $A^*A$; that is, $A^*Ax = 0$ if and only if $Ax = 0$. If $Ax = 0$, then clearly $A^*Ax = 0$. Conversely, if $A^*Ax = 0$, then by Lemma 1,
$$0 = \langle x, A^*Ax \rangle_n = \langle Ax, Ax \rangle_m,$$
so $Ax = 0$.

Corollary 1. If $A$ is an $m \times n$ matrix such that $\mathrm{rank}(A) = n$, then $A^*A$ is invertible.

Theorem 6.12. Let $A \in M_{m \times n}(F)$ and $y \in F^m$. Then there exists $x_0 \in F^n$ such that $(A^*A)x_0 = A^*y$ and $\|Ax_0 - y\| \leq \|Ax - y\|$ for all $x \in F^n$. Furthermore, if $\mathrm{rank}(A) = n$, then $x_0 = (A^*A)^{-1}A^*y$.

Proof. Define $W = \{Ax : x \in F^n\} = R(L_A)$. By the corollary to Theorem 6.6, there is a unique vector $u = Ax_0$ in $W$ that is closest to $y$. Then $\|Ax_0 - y\| \leq \|Ax - y\|$ for all $x \in F^n$. Also by Theorem 6.6, $z = y - u$ is in $W^\perp$. So $Ax_0 - y = -z$ is in $W^\perp$, and hence
$$\langle Ax, Ax_0 - y \rangle_m = 0 \quad \text{for all } x \in F^n.$$
By Lemma 1, $\langle x, A^*(Ax_0 - y) \rangle_n = 0$ for all $x \in F^n$. So $A^*(Ax_0 - y) = 0$, and we see that $x_0$ is a solution for $x$ in $A^*Ax = A^*y$. If, in addition, we know that $\mathrm{rank}(A) = n$, then by Lemma 2 we have $\mathrm{rank}(A^*A) = n$, so $A^*A$ is invertible and $x_0 = (A^*A)^{-1}A^*y$.
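Theorem 6.12 is the least-squares theorem: $x_0$ solves the normal equations $A^*Ax = A^*y$. A minimal NumPy sketch (ours; the data are made up for illustration):

```python
import numpy as np

# Least squares via the normal equations of Theorem 6.12:
# x0 = (A* A)^{-1} A* y when rank(A) = n.
A = np.array([[1.0, 0], [1, 1], [1, 2]])    # rank 2, so A*A is invertible
y = np.array([6.0, 0, 0])

x0 = np.linalg.solve(A.conj().T @ A, A.conj().T @ y)
print(x0)                                   # [ 5. -3.]

# Ax0 is the point of W = R(L_A) closest to y; the residual lies in W-perp.
residual = A @ x0 - y
print(np.round(A.conj().T @ residual, 10))  # [0. 0.]
```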
Fall 2007 - not covered until here.

Section 6.4:

Lemma. Let $T$ be a linear operator on a finite-dimensional inner product space $V$. If $T$ has an eigenvector, then so does $T^*$.

Proof. Let $v$ be an eigenvector of $T$ corresponding to the eigenvalue $\lambda$. For all $x \in V$, we have
$$0 = \langle 0, x \rangle = \langle (T - \lambda I)(v), x \rangle = \langle v, (T - \lambda I)^*(x) \rangle = \langle v, (T^* - \bar{\lambda}I)(x) \rangle.$$
So $v$ is orthogonal to $(T^* - \bar{\lambda}I)(x)$ for all $x$. Thus $R(T^* - \bar{\lambda}I) \neq V$, and so the nullity of $T^* - \bar{\lambda}I$ is not $0$: there exists $x \neq 0$ such that $(T^* - \bar{\lambda}I)(x) = 0$. Thus $x$ is an eigenvector of $T^*$ corresponding to the eigenvalue $\bar{\lambda}$.

Theorem 6.14. (Schur) Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Suppose that the characteristic polynomial of $T$ splits. Then there exists an orthonormal basis $\beta$ for $V$ such that the matrix $[T]_\beta$ is upper triangular.

Proof. The proof is by mathematical induction on the dimension $n$ of $V$. The result is immediate if $n = 1$. So suppose that the result is true for linear operators on $(n-1)$-dimensional inner product spaces whose characteristic polynomials split. By the lemma, we can assume that $T^*$ has a unit eigenvector $z$. Suppose that $T^*(z) = \lambda z$ and that $W = \mathrm{span}(\{z\})$. We show that $W^\perp$ is $T$-invariant. If $y \in W^\perp$ and $x = cz \in W$, then
$$\langle T(y), x \rangle = \langle T(y), cz \rangle = \langle y, T^*(cz) \rangle = \langle y, cT^*(z) \rangle = \langle y, c\lambda z \rangle = \overline{c\lambda}\langle y, z \rangle = \overline{c\lambda}(0) = 0.$$
So $T(y) \in W^\perp$. By Theorem 5.21, the characteristic polynomial of $T_{W^\perp}$ divides the characteristic polynomial of $T$ and hence splits. By Theorem 6.7(c), $\dim(W^\perp) = n - 1$, so we may apply the induction hypothesis to $T_{W^\perp}$ and obtain an orthonormal basis $\gamma$ of $W^\perp$ such that $[T_{W^\perp}]_\gamma$ is upper triangular. Clearly, $\beta = \gamma \cup \{z\}$ is an orthonormal basis for $V$ such that $[T]_\beta$ is upper triangular.

Definition. Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. We say that $T$ is normal if $TT^* = T^*T$. An $n \times n$ real or complex matrix $A$ is normal if $AA^* = A^*A$.

Theorem 6.15. Let $V$ be an inner product space, and let $T$ be a normal operator on $V$. Then the following statements are true.
(a) $\|T(x)\| = \|T^*(x)\|$ for all $x \in V$.
(b) $T - cI$ is normal for every $c \in F$.
(c) If $x$ is an eigenvector of $T$, then $x$ is also an eigenvector of $T^*$. In fact, if $T(x) = \lambda x$, then $T^*(x) = \bar{\lambda}x$.
(d) If $\lambda_1$ and $\lambda_2$ are distinct eigenvalues of $T$ with corresponding eigenvectors $x_1$ and $x_2$, then $x_1$ and $x_2$ are orthogonal.

Proof. (a) For any $x \in V$, we have
$$\|T(x)\|^2 = \langle T(x), T(x) \rangle = \langle T^*T(x), x \rangle = \langle TT^*(x), x \rangle = \langle T^*(x), T^*(x) \rangle = \|T^*(x)\|^2.$$
(b)
$$(T - cI)(T - cI)^* = (T - cI)(T^* - \bar{c}I) = TT^* - \bar{c}T - cT^* + c\bar{c}I = T^*T - \bar{c}T - cT^* + \bar{c}cI = (T^* - \bar{c}I)(T - cI) = (T - cI)^*(T - cI).$$
(c) Suppose that $T(x) = \lambda x$ for some $x \in V$. Let $U = T - \lambda I$. Then $U(x) = 0$, and $U$ is normal by (b). Thus (a) implies that
$$0 = \|U(x)\| = \|U^*(x)\| = \|(T^* - \bar{\lambda}I)(x)\| = \|T^*(x) - \bar{\lambda}x\|.$$
Hence $T^*(x) = \bar{\lambda}x$. So $x$ is an eigenvector of $T^*$.
(d) Let $\lambda_1$ and $\lambda_2$ be distinct eigenvalues of $T$ with corresponding eigenvectors $x_1$ and $x_2$. Then, using (c), we have
$$\lambda_1\langle x_1, x_2 \rangle = \langle \lambda_1 x_1, x_2 \rangle = \langle T(x_1), x_2 \rangle = \langle x_1, T^*(x_2) \rangle = \langle x_1, \bar{\lambda}_2 x_2 \rangle = \lambda_2\langle x_1, x_2 \rangle.$$
Since $\lambda_1 \neq \lambda_2$, we conclude that $\langle x_1, x_2 \rangle = 0$.

Definition. Let $T$ be a linear operator on an inner product space $V$. We say that $T$ is self-adjoint (Hermitian) if $T^* = T$. An $n \times n$ real or complex matrix $A$ is self-adjoint (Hermitian) if $A^* = A$.

Theorem 6.16. Let $T$ be a linear operator on a finite-dimensional complex inner product space $V$. Then $T$ is normal if and only if there exists an orthonormal basis for $V$ consisting of eigenvectors of $T$.

Proof. The characteristic polynomial of $T$ splits over $\mathbb{C}$. By Schur's Theorem there is an orthonormal basis $\beta = \{v_1, v_2, \ldots, v_n\}$ such that $A = [T]_\beta$ is upper triangular.

($\Rightarrow$) Assume $T$ is normal. Since $A$ is upper triangular, $T(v_1) = A_{1,1}v_1$; therefore $v_1$ is an eigenvector with associated eigenvalue $A_{1,1}$. Assume $v_1, \ldots, v_{k-1}$ are all eigenvectors of $T$. Claim: $v_k$ is also an eigenvector. Suppose $\lambda_1, \ldots, \lambda_{k-1}$ are the corresponding eigenvalues. By Theorem 6.15, $T(v_j) = \lambda_j v_j$ implies $T^*(v_j) = \bar{\lambda}_j v_j$. Also, since $A$ is upper triangular,
$$T(v_k) = A_{1,k}v_1 + A_{2,k}v_2 + \cdots + A_{k,k}v_k.$$
We also know, by the Corollary to Theorem 6.5, that $A_{i,j} = \langle T(v_j), v_i \rangle$. So, for $j \in [k-1]$,
$$A_{j,k} = \langle T(v_k), v_j \rangle = \langle v_k, T^*(v_j) \rangle = \langle v_k, \bar{\lambda}_j v_j \rangle = \lambda_j\langle v_k, v_j \rangle = 0.$$
So $T(v_k) = A_{k,k}v_k$, and we have that $v_k$ is an eigenvector of $T$. By induction, $\beta$ is a set of eigenvectors of $T$.

($\Leftarrow$) If $\beta$ is an orthonormal basis of eigenvectors, then $T$ is diagonalizable by Theorem 5.1, and $D = [T]_\beta$ is diagonal. Hence $[T^*]_\beta = [T]_\beta^* = D^*$ is also diagonal. Diagonal matrices commute, so
$$[TT^*]_\beta = [T]_\beta[T^*]_\beta = DD^* = D^*D = [T^*]_\beta[T]_\beta = [T^*T]_\beta,$$
and hence $TT^* = T^*T$.

Lemma. Let $T$ be a self-adjoint operator on a finite-dimensional inner product space $V$. Then
(a) Every eigenvalue of $T$ is real.
(b) Suppose that $V$ is a real inner product space. Then the characteristic polynomial of $T$ splits.

Proof. (a) Let $\lambda$ be an eigenvalue of $T$, so $T(x) = \lambda x$ for some $x \neq 0$. By Theorem 6.15(c) (a self-adjoint operator is normal),
$$\lambda x = T(x) = T^*(x) = \bar{\lambda}x \implies (\lambda - \bar{\lambda})x = 0.$$
But $x \neq 0$, so $\lambda - \bar{\lambda} = 0$, and we have $\lambda = \bar{\lambda}$, so $\lambda$ is real.

(b) Let $\dim(V) = n$ and let $\beta$ be an orthonormal basis for $V$. Set $A = [T]_\beta$. Then
$$A = [T]_\beta = [T^*]_\beta = [T]_\beta^* = A^*,$$
so $A$ is self-adjoint. Define $T_A : \mathbb{C}^n \to \mathbb{C}^n$ by $T_A(x) = Ax$. Notice that $[T_A]_\gamma = A$, where $\gamma$ is the standard ordered basis, which is orthonormal. So $T_A$ is self-adjoint, and by (a) its eigenvalues are real. We know that over $\mathbb{C}$ the characteristic polynomial of $T_A$ factors into linear factors $t - \lambda$, and since each $\lambda$ is real, it also factors over $\mathbb{R}$. But $T_A$, $A$, and $T$ all have the same characteristic polynomial.

Theorem 6.17. Let $T$ be a linear operator on a finite-dimensional real inner product space $V$. Then $T$ is self-adjoint if and only if there exists an orthonormal basis $\beta$ for $V$ consisting of eigenvectors of $T$.

Proof. ($\Rightarrow$) Assume $T$ is self-adjoint. By the lemma, the characteristic polynomial of $T$ splits. Now by Schur's Theorem, there exists an orthonormal basis $\beta$ for $V$ such that $A = [T]_\beta$ is upper triangular. But
$$A^* = [T]_\beta^* = [T^*]_\beta = [T]_\beta = A.$$
So $A$ and $A^*$ are both upper triangular; thus $A$ is diagonal, and we see that $\beta$ is a set of eigenvectors of $T$.

($\Leftarrow$) Assume there is an orthonormal basis $\beta$ of $V$ consisting of eigenvectors of $T$. Then $D = [T]_\beta$ is diagonal with the eigenvalues on the diagonal, and $D^*$ is diagonal and equal to $D$ since it is real. But then
$$[T^*]_\beta = D^* = D = [T]_\beta.$$
So $T^* = T$.
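Theorems 6.16 and 6.17 in matrix form say that normal (complex) and symmetric (real) matrices are diagonalized by an orthonormal basis of eigenvectors. A short NumPy check of ours for a real symmetric matrix, where `np.linalg.eigh` returns the eigenvectors as the columns of an orthogonal matrix:

```python
import numpy as np

# Theorem 6.17 in matrix form: a real symmetric (self-adjoint) matrix
# has an orthonormal basis of eigenvectors.
A = np.array([[2.0, 1, 0], [1, 2, 1], [0, 1, 2]])
assert np.allclose(A, A.T)                 # self-adjoint

eigvals, Q = np.linalg.eigh(A)
print(np.round(Q.T @ Q, 10))               # identity: columns orthonormal
print(np.round(Q.T @ A @ Q, 10))           # diagonal matrix of eigenvalues
```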
Fall 2007 - The following was covered but will not be included on the final.

Section 6.5:

Definition. Let $T$ be a linear operator on a finite-dimensional inner product space $V$ (over $F$). If $\|T(x)\| = \|x\|$ for all $x \in V$, we call $T$ a unitary operator if $F = \mathbb{C}$ and an orthogonal operator if $F = \mathbb{R}$.

Theorem 6.18. Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Then the following statements are equivalent.
1. $TT^* = T^*T = I$.
2. $\langle T(x), T(y) \rangle = \langle x, y \rangle$ for all $x, y \in V$.
3. If $\beta$ is an orthonormal basis for $V$, then $T(\beta)$ is an orthonormal basis for $V$.
4. There exists an orthonormal basis $\beta$ for $V$ such that $T(\beta)$ is an orthonormal basis for $V$.
5. $\|T(x)\| = \|x\|$ for all $x \in V$.

Proof. (1) $\Rightarrow$ (2): $\langle T(x), T(y) \rangle = \langle x, T^*T(y) \rangle = \langle x, y \rangle$.

(2) $\Rightarrow$ (3): Let $v_i, v_j \in \beta$ with $i \neq j$. Then $0 = \langle v_i, v_j \rangle = \langle T(v_i), T(v_j) \rangle$, so $T(\beta)$ is orthogonal. By Corollary 2 to Theorem 6.3, any orthogonal subset of non-zero vectors is linearly independent, and since $T(\beta)$ has $n$ vectors, it must be a basis of $V$. Also, $1 = \|v_i\|^2 = \langle v_i, v_i \rangle = \langle T(v_i), T(v_i) \rangle$. So $T(\beta)$ is an orthonormal basis of $V$.

(3) $\Rightarrow$ (4): By Gram-Schmidt, there is an orthonormal basis $\beta$ for $V$. By (3), $T(\beta)$ is orthonormal.

(4) $\Rightarrow$ (5): Let $\beta = \{v_1, v_2, \ldots, v_n\}$ be an orthonormal basis for $V$ such that $T(\beta)$ is orthonormal. Let
$$x = a_1v_1 + a_2v_2 + \cdots + a_nv_n.$$
Then
$$\|x\|^2 = |a_1|^2\langle v_1, v_1 \rangle + |a_2|^2\langle v_2, v_2 \rangle + \cdots + |a_n|^2\langle v_n, v_n \rangle = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2,$$
and
$$\|T(x)\|^2 = |a_1|^2\langle T(v_1), T(v_1) \rangle + \cdots + |a_n|^2\langle T(v_n), T(v_n) \rangle = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2.$$
Therefore $\|T(x)\| = \|x\|$.

(5) $\Rightarrow$ (1): We are given $\|T(x)\| = \|x\|$ for all $x$. We know $\langle x, x \rangle = 0$ if and only if $x = 0$, and $\langle T(x), T(x) \rangle = 0$ if and only if $T(x) = 0$. Therefore $T(x) = 0$ if and only if $x = 0$. So $N(T) = \{0\}$ and therefore $T$ is invertible. We have
$$\langle x, x \rangle = \langle T(x), T(x) \rangle = \langle x, T^*T(x) \rangle,$$
so $\langle x, (T^*T - I)(x) \rangle = 0$ for all $x$. Since $T^*T - I$ is self-adjoint, the lemma below gives $T^*T - I = T_0$. Therefore $T^*T(x) = x$ for all $x$, which implies that $T^*T = I$. But since $T$ is invertible, it must be that $T^* = T^{-1}$, and we have $TT^* = T^*T = I$.

Lemma. Let $U$ be a self-adjoint operator on a finite-dimensional inner product space $V$. If $\langle x, U(x) \rangle = 0$ for all $x \in V$, then $U = T_0$. (Where $T_0(x) = 0$ for all $x$.)

Proof. $U = U^*$ is normal. If $F = \mathbb{C}$, by Theorem 6.16 there is an orthonormal basis $\beta$ of eigenvectors of $U$. If $F = \mathbb{R}$, by Theorem 6.17 there is an orthonormal basis $\beta$ of eigenvectors of $U$. Let $x \in \beta$. Then $U(x) = \lambda x$ for some $\lambda$, and
$$0 = \langle x, U(x) \rangle = \langle x, \lambda x \rangle = \bar{\lambda}\langle x, x \rangle.$$
Therefore $\bar{\lambda} = 0$, so $\lambda = 0$ and $U(x) = 0$ for all $x \in \beta$. Hence $U = T_0$.
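A numerical illustration of Theorem 6.18 (ours, not from the notes): a unitary matrix $Q$, obtained here by QR-orthonormalization of a random complex matrix, satisfies conditions (1), (2), and (5) simultaneously.

```python
import numpy as np

# Theorem 6.18 for matrices: a unitary Q (Q*Q = I) preserves inner
# products and norms. QR factorization is a standard way to produce one.
rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(M)

assert np.allclose(Q.conj().T @ Q, np.eye(3))                # condition (1)
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
ip = lambda u, v: np.sum(u * np.conj(v))
assert np.isclose(ip(Q @ x, Q @ y), ip(x, y))                # condition (2)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))  # condition (5)
```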
Corollary 1. Let $T$ be a linear operator on a finite-dimensional real inner product space $V$. Then $V$ has an orthonormal basis of eigenvectors of $T$ with corresponding eigenvalues of absolute value 1 if and only if $T$ is both self-adjoint and orthogonal.

Proof. ($\Rightarrow$) Suppose $V$ has an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ such that $T(v_i) = \lambda_iv_i$ and $|\lambda_i| = 1$ for all $i$. Then by Theorem 6.17, $T$ is self-adjoint. We'll show $TT^* = T^*T = I$; then by Theorem 6.18, $\|T(x)\| = \|x\|$ for all $x \in V$, so $T$ is orthogonal. Since the eigenvalues are real with $|\lambda_i| = 1$,
$$T^*T(v_i) = T(T(v_i)) = T(\lambda_iv_i) = \lambda_iT(v_i) = \lambda_i^2v_i = v_i.$$
So $T^*T = I$. Similarly, $TT^* = I$.

($\Leftarrow$) Assume $T$ is self-adjoint. Then by Theorem 6.17, $V$ has an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ such that $T(v_i) = \lambda_iv_i$ for all $i$. If $T$ is also orthogonal, we have
$$|\lambda_i|\,\|v_i\| = \|\lambda_iv_i\| = \|T(v_i)\| = \|v_i\| \implies |\lambda_i| = 1.$$

Corollary 2. Let $T$ be a linear operator on a finite-dimensional complex inner product space $V$. Then $V$ has an orthonormal basis of eigenvectors of $T$ with corresponding eigenvalues of absolute value 1 if and only if $T$ is unitary.

Definition. A square matrix $A$ is called an orthogonal matrix if $A^tA = AA^t = I$, and unitary if $A^*A = AA^* = I$. We say $B$ is unitarily equivalent to $D$ if there exists a unitary matrix $Q$ such that $D = Q^*BQ$.

Theorem 6.19. Let $A$ be a complex $n \times n$ matrix. Then $A$ is normal if and only if $A$ is unitarily equivalent to a diagonal matrix.

Proof. ($\Rightarrow$) Assume $A$ is normal. There is an orthonormal basis $\beta = \{v_1, v_2, \ldots, v_n\}$ for $F^n$ consisting of eigenvectors of $A$, by Theorem 6.16. So $A$ is similar to a diagonal matrix $D$ by Theorem 5.1, where the matrix $S$ with column $i$ equal to $v_i$ is the invertible matrix of similarity: $S^{-1}AS = D$. Since the columns of $S$ are orthonormal, $S$ is unitary.

($\Leftarrow$) Suppose $A = P^*DP$ where $P$ is unitary and $D$ is diagonal. Then $A^* = P^*D^*P$, so $A^*A = P^*D^*PP^*DP = P^*D^*DP$, and
$$AA^* = (P^*DP)(P^*D^*P) = P^*DPP^*D^*P = P^*DD^*P.$$
But $D^*D = DD^*$, so $A^*A = AA^*$.

Theorem 6.20. Let $A$ be a real $n \times n$ matrix. Then $A$ is symmetric if and only if $A$ is orthogonally equivalent to a real diagonal matrix.

Theorem 6.21. Let $A \in M_n(F)$ be a matrix whose characteristic polynomial splits over $F$.
1. If $F = \mathbb{C}$, then $A$ is unitarily equivalent to a complex upper triangular matrix.
2. If $F = \mathbb{R}$, then $A$ is orthogonally equivalent to a real upper triangular matrix.

Proof. (1) By Schur's Theorem there is an orthonormal basis $\beta = \{v_1, v_2, \ldots, v_n\}$ such that $[L_A]_\beta = N$, where $N$ is a complex upper triangular matrix. Let $\beta'$ be the standard ordered basis; then $[L_A]_{\beta'} = A$. Let $Q = [I]_{\beta}^{\beta'}$, the change-of-coordinates matrix whose columns are the $v_i$. Then $N = Q^{-1}AQ$. We know that $Q$ is unitary since its columns are an orthonormal set of vectors, so $Q^*Q = I$.

Section 6.6

Definition. If $V = W_1 \oplus W_2$, then a linear operator $T$ on $V$ is the projection on $W_1$ along $W_2$ if, whenever $x = x_1 + x_2$ with $x_1 \in W_1$ and $x_2 \in W_2$, we have $T(x) = x_1$. In this case, $R(T) = W_1 = \{x \in V : T(x) = x\}$ and $N(T) = W_2$. We refer to $T$ as a projection. Let $V$ be an inner product space, and let $T : V \to V$ be a projection. We say that $T$ is an orthogonal projection if $R(T)^\perp = N(T)$ and $N(T)^\perp = R(T)$.
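Theorem 6.19 in action (a sketch of ours): a rotation matrix is normal but not symmetric, and with distinct eigenvalues the unit eigenvectors returned by `np.linalg.eig` are automatically orthogonal, so the eigenvector matrix is unitary.

```python
import numpy as np

# Theorem 6.19: a normal matrix is unitarily equivalent to a diagonal
# matrix. A 90-degree rotation is normal with eigenvalues +/- i.
A = np.array([[0.0, -1.0], [1.0, 0.0]])
assert np.allclose(A @ A.T, A.T @ A)       # normal

eigvals, P = np.linalg.eig(A)
print(np.round(P.conj().T @ P, 10))        # identity: P is unitary
print(np.round(P.conj().T @ A @ P, 10))    # diag(i, -i)
```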
Theorem 6.24. Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. Then $T$ is an orthogonal projection if and only if $T$ has an adjoint $T^*$ and $T^2 = T = T^*$.

Compare Theorem 6.24 to Theorem 6.9, where $V$ is finite-dimensional: here $V$ need not be finite-dimensional, so the existence of the adjoint is part of the hypothesis.

Theorem 6.25. (The Spectral Theorem) Suppose that $T$ is a linear operator on a finite-dimensional inner product space $V$ over $F$ with the distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$. Assume that $T$ is normal if $F = \mathbb{C}$ and that $T$ is self-adjoint if $F = \mathbb{R}$. For each $i$ ($1 \leq i \leq k$), let $W_i$ be the eigenspace of $T$ corresponding to the eigenvalue $\lambda_i$, and let $T_i$ be the orthogonal projection of $V$ on $W_i$. Then the following statements are true.
(a) $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$.
(b) If $W_i'$ denotes the direct sum of the subspaces $W_j$ for $j \neq i$, then $W_i^\perp = W_i'$.
(c) $T_iT_j = \delta_{i,j}T_i$ for $1 \leq i, j \leq k$.
(d) $I = T_1 + T_2 + \cdots + T_k$.
(e) $T = \lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k$.

Proof. Assume $F = \mathbb{C}$.
(a) $T$ is normal. By Theorem 6.16 there exists an orthonormal basis of eigenvectors of $T$. By Theorem 5.10, $V = W_1 \oplus W_2 \oplus \cdots \oplus W_k$.
(b) Let $x \in W_j$ and $y \in W_i$ with $j \neq i$. Then $\langle x, y \rangle = 0$ by Theorem 6.15(d), and so $W_i' \subseteq W_i^\perp$. But from (a),
$$\dim(W_i') = \sum_{j \neq i} \dim(W_j) = \dim(V) - \dim(W_i).$$
By Theorem 6.7(c), we know also that $\dim(W_i^\perp) = \dim(V) - \dim(W_i)$. Hence $W_i^\perp = W_i'$.
(c) $T_i$ is the orthogonal projection of $V$ on $W_i$, with null space $N(T_i) = W_i^\perp = W_i'$ by (b). Write $x \in V$ as $x = w_1 + w_2 + \cdots + w_k$ with $w_i \in W_i$, so that $T_i(x) = w_i$. Then
$$T_iT_j(x) = T_i(w_j) = \delta_{i,j}w_j = \delta_{i,j}T_i(x).$$
(d) With $x$ as in (c),
$$(T_1 + T_2 + \cdots + T_k)(x) = T_1(x) + T_2(x) + \cdots + T_k(x) = w_1 + w_2 + \cdots + w_k = x,$$
so $I = T_1 + T_2 + \cdots + T_k$.
(e) By (d), $x = T_1(x) + T_2(x) + \cdots + T_k(x)$, so $T(x) = T(T_1(x)) + T(T_2(x)) + \cdots + T(T_k(x))$. For all $i$, $T_i(x) \in W_i$, so $T(T_i(x)) = \lambda_iT_i(x)$. Hence
$$T(x) = \lambda_1T_1(x) + \lambda_2T_2(x) + \cdots + \lambda_kT_k(x) = (\lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k)(x).$$
Definition. The set $\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$ of eigenvalues of $T$ is called the spectrum of $T$, the sum $I = T_1 + T_2 + \cdots + T_k$ is called the resolution of the identity operator induced by $T$, and the sum $T = \lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k$ is called the spectral decomposition of $T$.

Corollary 1. If $F = \mathbb{C}$, then $T$ is normal if and only if $T^* = g(T)$ for some polynomial $g$.

Corollary 2. If $F = \mathbb{C}$, then $T$ is unitary if and only if $T$ is normal and $|\lambda| = 1$ for every eigenvalue $\lambda$ of $T$.

Corollary 3. If $F = \mathbb{C}$ and $T$ is normal, then $T$ is self-adjoint if and only if every eigenvalue of $T$ is real.

Corollary 4. Let $T$ be as in the spectral theorem with spectral decomposition $T = \lambda_1T_1 + \lambda_2T_2 + \cdots + \lambda_kT_k$. Then each $T_j$ is a polynomial in $T$.
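To close, a NumPy sketch (ours, not from the notes) of the spectral decomposition of Theorem 6.25 for a small self-adjoint matrix: each $T_i$ is realized as $Q_iQ_i^*$, where the columns of $Q_i$ are an orthonormal basis of $W_i$, and the resolution of the identity and $T = \sum_i \lambda_iT_i$ are verified.

```python
import numpy as np

# Spectral decomposition T = sum_i lambda_i T_i from Theorem 6.25,
# for a self-adjoint matrix: T_i = Q_i Q_i* with the columns of Q_i
# an orthonormal basis of the eigenspace W_i.
A = np.array([[2.0, 1], [1, 2]])           # eigenvalues 1 and 3
eigvals, Q = np.linalg.eigh(A)

projections = []
for lam in np.unique(np.round(eigvals, 10)):
    cols = Q[:, np.isclose(eigvals, lam)]  # orthonormal basis of W_i
    projections.append((lam, cols @ cols.conj().T))

# resolution of the identity and the spectral decomposition
assert np.allclose(sum(P for _, P in projections), np.eye(2))
assert np.allclose(sum(lam * P for lam, P in projections), A)
for lam, P in projections:
    # each T_i is an orthogonal projection: T_i^2 = T_i = T_i*
    assert np.allclose(P @ P, P) and np.allclose(P, P.conj().T)
```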