LECTURE NOTES FOR 416, INNER PRODUCTS AND SPECTRAL THEOREMS

CHARLES REZK

Date: May 2, 2012.

Real inner product. Let $V$ be a vector space over $\mathbb{R}$. A (real) inner product is a function $\langle -, - \rangle \colon V \times V \to \mathbb{R}$ such that
- $\langle x, y \rangle = \langle y, x \rangle$ for all $x, y \in V$,
- $\langle c_1 x_1 + c_2 x_2, y \rangle = c_1 \langle x_1, y \rangle + c_2 \langle x_2, y \rangle$ for all $x_1, x_2, y \in V$ and $c_1, c_2 \in \mathbb{R}$,
- $\langle x, x \rangle \geq 0$, with $\langle x, x \rangle = 0$ iff $x = 0$.
That is, the pairing is symmetric, linear in the first variable (and therefore bilinear, by symmetry), and positive definite.

Example (Standard dot product). Given a column vector $v \in \mathbb{R}^{n \times 1}$, let $v^t \in \mathbb{R}^{1 \times n}$ be its transpose. Then define $\langle x, y \rangle = y^t x$. Check that this is just the usual dot product, so that $\langle x, y \rangle = \sum_i x_i y_i$.

Example. Let $P \in \mathbb{R}^{n \times n}$ be an invertible matrix. Define $\langle x, y \rangle_P = y^t P^t P x$. This is an inner product, usually different from the dot product. Fact: every inner product on $\mathbb{R}^{n \times 1}$ is of this form for some $P$. (See the discussion of isometries below.)

Example. For $f, g \in \mathcal{P}$ (the space of polynomials with real coefficients), define $\langle f, g \rangle = \int_{-1}^{1} f(t) g(t)\, dt$. This is an inner product. (Exercise.) Symmetry and bilinearity are clear. It is also clear that $\langle f, f \rangle = \int_{-1}^{1} f(t)^2\, dt \geq 0$; if the integral equals $0$, then it must be the case that $f(t)^2 = 0$ for all $-1 \leq t \leq 1$, and therefore $f = 0$ since $f$ is a polynomial.
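For concreteness, here is a minimal NumPy sketch checking the last two examples numerically; the particular matrix $P$ and the polynomials $f$, $g$ are arbitrary illustrative choices, not part of the notes.

    import numpy as np

    # Check <x, y>_P = y^t P^t P x for a (generically invertible) random P,
    # and the polynomial pairing <f, g> = integral_{-1}^{1} f(t) g(t) dt.
    rng = np.random.default_rng(0)
    n = 4
    P = rng.standard_normal((n, n))

    def ip_P(x, y):
        """<x, y>_P = y^t P^t P x, i.e. the dot product of Px and Py."""
        return (P @ y) @ (P @ x)

    x, y = rng.standard_normal(n), rng.standard_normal(n)
    print(np.isclose(ip_P(x, y), ip_P(y, x)))   # symmetry
    print(ip_P(x, x) > 0)                       # positive definiteness (x != 0)

    # Polynomial inner product, via an antiderivative of f*g.
    from numpy.polynomial import Polynomial
    f = Polynomial([1, 0, -1])    # 1 - t^2
    g = Polynomial([0, 1])        # t
    h = (f * g).integ()           # antiderivative of f(t) g(t)
    print(h(1) - h(-1))           # <f, g> = 0, since f(t) g(t) is odd on [-1, 1]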

Complex inner product. Let $V$ be a vector space over $\mathbb{C}$. A (complex, or Hermitian) inner product is a function $\langle -, - \rangle \colon V \times V \to \mathbb{C}$ satisfying the same axioms, except that the first axiom is replaced by "skew-symmetry": $\langle x, y \rangle = \overline{\langle y, x \rangle}$. Observe that this implies that although a complex inner product is $\mathbb{C}$-linear in the first variable, it is conjugate linear in the second variable:
$\langle x, c_1 y_1 + c_2 y_2 \rangle = \overline{c_1} \langle x, y_1 \rangle + \overline{c_2} \langle x, y_2 \rangle.$
Note also that if we take $x = y$, then skew-symmetry gives $\langle x, x \rangle = \overline{\langle x, x \rangle}$, and so $\langle x, x \rangle \in \mathbb{R}$ for all $x \in V$, so the positive definite axiom still makes sense.

Example (standard Hermitian inner product). Given $A \in \mathbb{C}^{m \times n}$, let $A^* = \overline{A}^t$, the conjugate transpose of $A$, also called the adjoint of $A$; if $A = (a_{ij})$, then the $ij$-entry of $A^*$ is $\overline{a_{ji}}$. Define $\langle x, y \rangle = y^* x = \sum_k x_k \overline{y_k}$.

We can think of the real inner product as a kind of special case of the complex inner product; the formulas for $\mathbb{C}$ also apply in the real case. Below, when I speak of an "inner product space", it can be either a real or a complex inner product space.

Over either $\mathbb{R}$ or $\mathbb{C}$, we can define $\|x\| = \sqrt{\langle x, x \rangle}$, called the length. In the case of $V = \mathbb{R}^{n \times 1}$, the standard inner product has a geometric interpretation familiar from vector calculus: the inner product knows about lengths and angles. (Note that $\|x\| = \sqrt{\sum_i x_i^2}$ in this case.) This is added structure coming from the inner product; vector spaces without an inner product don't know anything about lengths or angles.

Exercise. If $V$ is a complex inner product space, then the function $\langle x, y \rangle_{\mathbb{R}} := \operatorname{Re} \langle x, y \rangle$ is a real inner product on $V$ viewed as a real vector space. (We will not use this real inner product.)

Subspaces of inner product spaces.

Proposition. If $V$ has inner product $\langle -, - \rangle$ and $W \subseteq V$ is a subspace, then the restriction of $\langle -, - \rangle$ to $W$ is an inner product on $W$.

Proof. (Exercise.) Straightforward using the definitions.

Orthogonality. Say that $x, y \in V$ are orthogonal if $\langle x, y \rangle = 0$. Given $S \subseteq V$, the orthogonal complement of $S$ is
$S^\perp = \{\, v \in V \mid \langle v, s \rangle = 0 \text{ for all } s \in S \,\}.$

Proposition. $S^\perp$ is a subspace of $V$.

Proof. (Exercise.) If $x_1, x_2 \in S^\perp$, then $\langle c_1 x_1 + c_2 x_2, s \rangle = c_1 \langle x_1, s \rangle + c_2 \langle x_2, s \rangle$, which is $0$ for all $s \in S$ by definition.

A set $S$ is orthonormal if for all $x, y \in S$ we have $\langle x, x \rangle = 1$, and $\langle x, y \rangle = 0$ if $x \neq y$.
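As a small numerical sketch (with arbitrary example vectors), the standard Hermitian inner product and its skew-symmetry can be checked directly; note that NumPy's vdot conjugates its first argument, so $\langle x, y \rangle = y^* x$ is vdot(y, x) in this convention.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 3
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    def ip(x, y):
        """<x, y> = y^* x = sum_k x_k conj(y_k); linear in x, conjugate-linear in y."""
        return np.vdot(y, x)   # np.vdot conjugates its first argument

    print(np.isclose(ip(x, y), np.conj(ip(y, x))))   # skew-symmetry
    print(np.isclose(ip(x, x).imag, 0.0))            # <x, x> is real
    print(ip(x, x).real >= 0)                        # ... and nonnegative

    # The standard basis of C^{n x 1} is orthonormal for this inner product.
    e = np.eye(n, dtype=complex)
    gram = [[ip(e[:, i], e[:, j]) for j in range(n)] for i in range(n)]
    print(np.allclose(gram, np.eye(n)))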

Proposition. Any orthonormal subset $S \subseteq V$ is linearly independent.

Proof. (Exercise.) If $\{x_1, \dots, x_n\}$ is an orthonormal set, consider $v = c_1 x_1 + \dots + c_n x_n$ for scalars $c_i$. Compute $\langle v, x_k \rangle = c_k$, since $\langle x_k, x_k \rangle = 1$ and $\langle x_i, x_k \rangle = 0$ if $i \neq k$. Thus $v = 0$ implies $c_k = 0$ for all $k$.

If $u_1, \dots, u_n$ is an orthonormal basis of $V$, then
$x = \sum_j \langle x, u_j \rangle\, u_j.$

Gram-Schmidt.

Proposition. Every finite dimensional inner product space has an orthonormal basis.

The proof is the Gram-Schmidt algorithm, which takes a basis $v_1, \dots, v_n$ and inductively produces an orthonormal basis $u_1, \dots, u_n$. The process is:
$u_1 = \frac{v_1}{\|v_1\|}, \qquad u_2 = \frac{v_2 - \langle v_2, u_1 \rangle u_1}{\|v_2 - \langle v_2, u_1 \rangle u_1\|}, \qquad u_3 = \frac{v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2}{\|v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2\|}, \quad \dots$

To prove that this works, we first have to show that the construction is well-defined, which amounts to showing that we never divide by zero in the above formulas. We prove this by induction on $k$, by showing (inductively) that $\mathrm{Span}(u_1, \dots, u_{k-1}) = \mathrm{Span}(v_1, \dots, v_{k-1})$. This implies that $v_k \notin \mathrm{Span}(u_1, \dots, u_{k-1})$, and therefore $v_k \neq \sum_{j=1}^{k-1} \langle v_k, u_j \rangle u_j$. Now that we know the vectors $u_1, \dots, u_n$ are defined, it is clear that $\|u_k\| = 1$, and it is straightforward to show (again by induction on $k$) that $\langle u_i, u_k \rangle = 0$ for $i < k$, so they form an orthonormal basis. (A short numerical sketch of this process is given below, after the discussion of isometries.)

Later we will need the following:

Proposition. If $V$ is a finite dimensional inner product space and $v \in V$, then $v^\perp$ is complementary to $\mathrm{Span}(v)$.

Proof. Assume $v \neq 0$. Choose a basis $v_1, \dots, v_n$ of $V$ with $v_1 = v$. Use Gram-Schmidt to replace it with an orthonormal basis $u_1, \dots, u_n$, with $u_1 = v / \|v\|$. Verify that $v^\perp \supseteq \mathrm{Span}(u_2, \dots, u_n)$, and therefore for dimension reasons $v^\perp = \mathrm{Span}(u_2, \dots, u_n)$. Thus $V$ is a direct sum of $\mathrm{Span}(v) = \mathrm{Span}(u_1)$ and $v^\perp = \mathrm{Span}(u_2, \dots, u_n)$.

Exercise. If $W$ is a subspace of a finite dimensional inner product space $V$, then $(W^\perp)^\perp = W$.

Isometry. If $V$ and $W$ are inner product spaces, an isometry is an isomorphism $T \colon V \to W$ of vector spaces such that $\langle Tx, Ty \rangle_W = \langle x, y \rangle_V$.

Proposition. For any (real or complex) $n$-dimensional inner product space $V$, there exists an isometry between $V$ and $\mathbb{R}^{n \times 1}$ (if real) or $\mathbb{C}^{n \times 1}$ (if complex) with the standard inner product.

Proof. (Exercise.) Choose an orthonormal basis $u_1, \dots, u_n$ of $V$, and define $T(x_1, \dots, x_n) = \sum_k x_k u_k$.
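The Gram-Schmidt process above translates directly into code. Here is a minimal NumPy sketch for the standard dot product on $\mathbb{R}^{n \times 1}$; the helper name and the example basis are illustrative choices, and for a complex inner product space one would use the Hermitian pairing instead.

    import numpy as np

    def gram_schmidt(vectors):
        """Given linearly independent vectors v_1, ..., v_n, return an
        orthonormal list u_1, ..., u_n with the same span, following the
        formulas in the notes."""
        basis = []
        for v in vectors:
            w = v.astype(float)
            for u in basis:
                w = w - np.dot(v, u) * u   # v_k minus sum_j <v_k, u_j> u_j
            norm = np.linalg.norm(w)
            if norm < 1e-12:
                raise ValueError("input vectors are linearly dependent")
            basis.append(w / norm)
        return basis

    # Example: orthonormalize a basis of R^3.
    v1, v2, v3 = np.array([1., 1., 0.]), np.array([1., 0., 1.]), np.array([0., 1., 1.])
    u1, u2, u3 = gram_schmidt([v1, v2, v3])
    print(np.round([np.dot(u1, u2), np.dot(u1, u3), np.dot(u2, u3)], 12))  # all ~0
    print(np.round([np.dot(u1, u1), np.dot(u2, u2), np.dot(u3, u3)], 12))  # all ~1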

Spectral theorems. Recall that for $A \in \mathbb{C}^{m \times n}$ we define the adjoint $A^* = \overline{A}^t \in \mathbb{C}^{n \times m}$. Note that $(AB)^* = B^* A^*$, and that $(A^*)^* = A$.

Theorem. Let $A \in \mathbb{C}^{n \times n}$.
(1) If $AA^* = A^*A$, there exists an orthonormal basis of eigenvectors of $A$, which is also an orthonormal basis of eigenvectors for $A^*$.
(2) If $A = A^*$ ("self-adjoint", or "Hermitian"), then there exists an orthonormal basis of eigenvectors of $A$, and all the eigenvalues are real (i.e., $\lambda_k \in \mathbb{R}$).
(3) If $A = -A^*$ ("skew-adjoint", or "skew-Hermitian"), then there exists an orthonormal basis of eigenvectors of $A$, and all the eigenvalues are imaginary (i.e., $\lambda_k \in i\mathbb{R}$).
(4) If $A^* = A^{-1}$ ("unitary"), then there exists an orthonormal basis of eigenvectors of $A$, and all eigenvalues have norm one (i.e., $|\lambda_k|^2 = \lambda_k \overline{\lambda_k} = 1$).

Statements (2)-(4) are special cases of (1), since in each case $A$ commutes with $A^*$. To derive the statements about eigenvalues from (1), note in general that since $(A^* w)^* = w^* A^{**} = w^* A$, we have
$\langle Av, w \rangle = w^* A v = (A^* w)^* v = \langle v, A^* w \rangle.$
Suppose $v$ is a common eigenvector of $A$ and $A^*$, say $Av = \lambda v$ and $A^* v = \mu v$, with $v \neq 0$. Then $\langle Av, v \rangle = \langle \lambda v, v \rangle = \lambda \langle v, v \rangle$ is equal to $\langle v, A^* v \rangle = \langle v, \mu v \rangle = \overline{\mu} \langle v, v \rangle$, and so $\lambda = \overline{\mu}$. In case (2), this gives $\lambda = \overline{\lambda}$. In case (3), this gives $\lambda = -\overline{\lambda}$. In case (4), this gives $\lambda^{-1} = \overline{\lambda}$.

Theorem. Let $A \in \mathbb{R}^{n \times n}$. (The cases are numbered to match the complex theorem above.)
(2) If $A^t = A$ ("self-adjoint" or "symmetric"), then there exists an orthonormal basis of eigenvectors of $A$ in $\mathbb{R}^{n \times 1}$, and all the eigenvalues are real.
(3) If $A^t = -A$ ("skew-adjoint" or "skew-symmetric"), then all eigenvalues of $A$ are imaginary (i.e., $\lambda_k \in i\mathbb{R}$), and come in conjugate pairs $\pm ai$ when they are not $0$. There are no real eigenvectors (except those with eigenvalue $0$); however, in $\mathbb{C}^{n \times 1}$ there is an orthonormal basis of eigenvectors.
(4) If $A^t = A^{-1}$ ("orthogonal"), then all eigenvalues of $A$ are complex of norm one (i.e., $|\lambda_k| = 1$), and come in conjugate pairs $\lambda, \overline{\lambda}$ when they are not $1$ or $-1$.

The real forms are easily derived from the complex ones: real (2) is a special case of complex (2), etc. When the eigenvalues are not real, we do not get real eigenvectors; however, the non-real eigenvalues must always come in conjugate pairs.

Examples. Symmetric. The real matrix $A = \begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}$ has eigenvalues $-2, 3$, with corresponding eigenvectors $(1, -2)$ and $(2, 1)$, which are orthogonal, and so can be normalized to an orthonormal basis of eigenvectors $\frac{1}{\sqrt{5}}(1, -2)$ and $\frac{1}{\sqrt{5}}(2, 1)$.

Exercise. If $A$ is a real symmetric $n \times n$ matrix, consider $f \colon \mathbb{R}^{n \times 1} \to \mathbb{R}$ defined by $f(x) = x^t A x$. Show that the maximum and minimum values attained by $f$ on the unit sphere $\{\, x \in \mathbb{R}^{n \times 1} \mid \|x\| = 1 \,\}$ are exactly the maximal and minimal eigenvalues of $A$. (Hint: change to a coordinate system with axes parallel to the eigenvectors. A more fun proof is to use Lagrange multipliers; because $A$ is symmetric, the Lagrange multiplier equations will exactly become the eigenvector equation $(A - \lambda I)x = 0$.)
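A quick numerical sanity check of the symmetric example, and an illustration of the exercise (the random sampling of the unit circle below is only an illustration, not a proof):

    import numpy as np

    # The 2x2 symmetric example: eigenvalues and orthonormal eigenvectors.
    A = np.array([[2., 2.],
                  [2., -1.]])
    evals, evecs = np.linalg.eigh(A)         # eigh is for symmetric/Hermitian matrices
    print(evals)                             # [-2.  3.]
    print(np.round(evecs.T @ evecs, 12))     # identity: the columns are orthonormal

    # Illustrate the exercise: min and max of f(x) = x^t A x on the unit sphere.
    rng = np.random.default_rng(2)
    xs = rng.standard_normal((100000, 2))
    xs /= np.linalg.norm(xs, axis=1, keepdims=True)    # points on the unit circle
    f = np.einsum('ij,jk,ik->i', xs, A, xs)            # f(x) = x^t A x for each sample
    print(f.min(), f.max())                            # approximately -2 and 3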

Skew-symmetric. If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is a real skew-symmetric matrix, then $a = d = 0$ and $c = -b$. Thus the eigenvalues are $\lambda = \pm bi$.

Orthogonal. The real matrix $R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ is orthogonal, since $R_\theta^t = R_{-\theta} = R_\theta^{-1}$. The spectral theorem for $n \times n$ orthogonal matrices implies that there is an orthonormal basis of $\mathbb{R}^{n \times 1}$ with respect to which $A$ has block form consisting of $1 \times 1$ and $2 \times 2$ blocks on the diagonal. The $1 \times 1$ blocks have the form $(\pm 1)$, corresponding to real eigenvalues, and the $2 \times 2$ blocks have the form $R_\theta$, corresponding to eigenvalue pairs $e^{\pm i\theta}$. For instance, any $3 \times 3$ orthogonal matrix has an orthonormal basis $u_1, u_2, u_3$ with respect to which it has block form $\begin{bmatrix} \pm 1 & 0 \\ 0 & R_\theta \end{bmatrix}$. If the real eigenvalue is $1$, then it is a rotation around the axis $u_1$; if the real eigenvalue is $-1$, then it is a combination of a rotation with a reflection through the plane of rotation.
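A small numerical illustration of this block form: build a $3 \times 3$ orthogonal matrix as $\mathrm{diag}(1, R_\theta)$ in some orthonormal basis and check its eigenvalues. The change-of-basis matrix $Q$ below is an arbitrary random choice.

    import numpy as np

    theta = 0.7
    block = np.array([[1., 0., 0.],
                      [0., np.cos(theta), -np.sin(theta)],
                      [0., np.sin(theta),  np.cos(theta)]])

    rng = np.random.default_rng(3)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a random orthogonal matrix
    A = Q @ block @ Q.T                                # rotation about the axis Q[:, 0]

    print(np.allclose(A.T @ A, np.eye(3)))             # A is orthogonal
    lam = np.linalg.eigvals(A)
    print(np.round(np.abs(lam), 12))                   # every eigenvalue has |lambda| = 1
    print(np.round(np.sort_complex(lam), 6))           # 1 and cos(theta) +- i sin(theta)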

Proof. We have sketched how to derive every form of the spectral theorem from complex case (1), so we only need this basic case. Recall the following facts:
- Any operator $T \colon V \to V$ on a (non-trivial) finite dimensional complex vector space has an eigenvalue.
- If operators $S, T \colon V \to V$ commute ($ST = TS$), then each $T$-eigenspace $E_T(\lambda)$ is invariant under $S$.
- If $V$ is a finite dimensional inner product space and $v \neq 0$, then $v^\perp$ and $\mathrm{Span}(v)$ are complementary subspaces.

The hypothesis is that $A, A^* \in \mathbb{C}^{n \times n}$ commute. Let $\lambda$ be an eigenvalue of $A$. Since $E_A(\lambda)$ is invariant under $A^*$, the operator $A^*|_{E_A(\lambda)}$ must therefore have an eigenvector $v$, which will be a common eigenvector of $A$ and $A^*$. Assuming $Av = \lambda v$ and $A^* v = \mu v$, compute
$\lambda \langle v, v \rangle = \langle Av, v \rangle = \langle v, A^* v \rangle = \langle v, \mu v \rangle = \overline{\mu} \langle v, v \rangle,$
so $\lambda = \overline{\mu}$, i.e., $\mu = \overline{\lambda}$.

Let $v^\perp = \{\, x \in \mathbb{C}^{n \times 1} \mid \langle x, v \rangle = 0 \,\}$, the orthogonal complement of $v$ in $\mathbb{C}^{n \times 1}$.

Claim. $v^\perp$ is invariant under both $A$ and $A^*$.

This is a verification. Assuming $w \in v^\perp$, show that $Aw, A^* w \in v^\perp$. For instance,
$\langle Aw, v \rangle = \langle w, A^* v \rangle = \langle w, \overline{\lambda} v \rangle = \lambda \langle w, v \rangle = 0,$
so $Aw \in v^\perp$; likewise,
$\langle A^* w, v \rangle = \overline{\langle v, A^* w \rangle} = \overline{\langle Av, w \rangle} = \overline{\lambda \langle v, w \rangle} = \overline{\lambda}\, \langle w, v \rangle = 0,$
so $A^* w \in v^\perp$.

I would like to inductively apply this to the restrictions of $A$ and $A^*$ to linear operators acting on $v^\perp$, which has dimension one less than $\mathbb{C}^{n \times 1}$. We need the following.

Lemma. Let $A, A^* \in \mathbb{C}^{n \times n}$ be such that $AA^* = A^*A$, and let $W \subseteq \mathbb{C}^{n \times 1}$ be a non-trivial subspace invariant under both $A$ and $A^*$. Then there exists a common eigenvector $v \in W$ of $A$ and $A^*$; the subspace $W \cap v^\perp = \{\, w \in W \mid \langle w, v \rangle = 0 \,\}$ is itself invariant under $A$ and $A^*$; and $W$ is a direct sum of $\mathrm{Span}(v)$ and $W \cap v^\perp$.

Proof. The argument is the same. If $W$ is non-zero, then $A|_W$ (the restriction of $A$ to a map $W \to W$) has an eigenvalue $\lambda$. Since $A^*$ commutes with $A$, the eigenspace $E_A(\lambda)$ is $A^*$-invariant, and hence so is $E_{A|_W}(\lambda) = W \cap E_A(\lambda)$. Since $A|_W$ has $\lambda$ as an eigenvalue, the subspace $E_{A|_W}(\lambda)$ is non-zero; since it is $A^*$-invariant, it contains an $A^*$-eigenvector $v$ (which is then a common eigenvector of $A$ and $A^*$). We have already seen that $v^\perp$ must be invariant under both $A$ and $A^*$. Therefore $W \cap v^\perp$ is also invariant under both $A$ and $A^*$. It remains to show that $W$ is a direct sum of $\mathrm{Span}(v)$ and $W \cap v^\perp$. Since $W$ is itself a finite dimensional inner product space, $W \cap v^\perp$ is the orthogonal complement of $v$ relative to $W$, so the claim follows.

To prove (1), use the lemma inductively. Thus, use the lemma to find a common eigenvector $v_1 \in W_0 = \mathbb{C}^{n \times 1}$ of $A$ and $A^*$, and set $W_1 = v_1^\perp$, which is guaranteed to be invariant under $A$ and $A^*$. Then use the lemma to find a common eigenvector $v_2 \in W_1$ of $A$ and $A^*$, and set $W_2 = W_1 \cap v_2^\perp$, etc. By construction, each new eigenvector is orthogonal to the previous ones, and we can normalize them to length one by setting $u_k = v_k / \|v_k\|$.

Orthogonal and unitary matrices. Let $U \in \mathbb{C}^{n \times n}$ have columns $u_1, \dots, u_n$. Then $U$ is unitary ($U^* U = U^{-1} U = I$) if and only if $u_j^* u_i = \delta_{ij}$. That is, $U$ is unitary if and only if its columns are an orthonormal basis of $\mathbb{C}^{n \times 1}$. The analogous statement holds for $U \in \mathbb{R}^{n \times n}$, which is orthogonal ($U^t U = U^{-1} U = I$) if and only if $u_j^t u_i = \delta_{ij}$; i.e., $U$ is orthogonal if and only if its columns are an orthonormal basis of $\mathbb{R}^{n \times 1}$.

The spectral theorem implies, for $A \in \mathbb{C}^{n \times n}$, that
- $A$ is Hermitian if and only if it can be written $A = UDU^*$ for some unitary matrix $U$ and diagonal matrix $D$ with real entries;
- $A$ is skew-Hermitian if and only if it can be written $A = UDU^*$ for some unitary matrix $U$ and diagonal matrix $D$ with imaginary entries;
- $A$ is unitary if and only if it can be written $A = UDU^*$ for some unitary matrix $U$ and diagonal matrix $D$ with diagonal entries in $\mathbb{C}$ of unit norm.

The spectral theorem implies, for $A \in \mathbb{R}^{n \times n}$, that $A$ is symmetric if and only if it can be written $A = UDU^t$ for some orthogonal matrix $U$ and diagonal matrix $D$ with real entries.

Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL
E-mail address: rezk@math.uiuc.edu