Computational Methods CMSC/AMSC/MAPL 460. Eigenvalues and Eigenvectors. Ramani Duraiswami, Dept. of Computer Science



Eigenvalues of a Matrix

Definition: An $N \times N$ matrix $A$ has an eigenvector $x$ (nonzero) with corresponding eigenvalue $\lambda$ if

$$Ax = \lambda x.$$

This means $Ax - \lambda x = 0$, that is, $(A - \lambda I)x = 0$.

If a matrix-vector product gives the zero vector, then either the vector is zero or the matrix has zero determinant (is singular). Since $x$ is nonzero, here this means $\det(A - \lambda I) = 0$.

Left and Right Eigenvectors

A right eigenvector of a matrix $A$ satisfies $Ax = \lambda x$.

For an $N \times N$ matrix we can also define a left matrix product $y^T A = v^T$. So if we have $y^T A = \lambda y^T$, then $y$ is a left eigenvector of $A$.

If $A$ is symmetric, $A = A^T$, then transposing $Ax = \lambda x$ gives

$$(Ax)^T = x^T A^T = x^T A = (\lambda x)^T = \lambda x^T.$$

So the left and right eigenvectors of a symmetric matrix are the same.
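
Symmetric Matrices

A matrix is symmetric if its transpose is equal to itself: $A$ is symmetric if $A^T = A$. The complex analogue is a Hermitian matrix, for which $A^H = A$.

The eigenvalues of a real symmetric (or complex Hermitian) matrix are real, and eigenvectors belonging to distinct eigenvalues are orthogonal. For a real symmetric matrix the eigenvectors can also be chosen real.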
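
Characteristic Equation

$Ax = \lambda x$ can be written as $(A - \lambda I)x = 0$, which holds for some $x \neq 0$, so $(A - \lambda I)$ is singular and

$$\det(A - \lambda I) = 0.$$

This is called the characteristic equation, and $\det(A - \lambda I)$ is the characteristic polynomial. If $A$ is $n \times n$ the polynomial is of degree $n$, and so $A$ has $n$ eigenvalues, counting multiplicities.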
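
Example

$$A = \begin{pmatrix} 2 & -3 \\ -1 & 4 \end{pmatrix}$$

$$\det(A - \lambda I) = 0 \;\Rightarrow\; (2 - \lambda)(4 - \lambda) - (-3)(-1) = 0$$

$$\lambda^2 - 6\lambda + 5 = (\lambda - 5)(\lambda - 1) = 0$$

Hence the two eigenvalues are 1 and 5.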

Example (continued)

Once we have the eigenvalues, the eigenvectors can be obtained by substituting back into $(A - \lambda I)x = 0$. This gives the eigenvectors $(1 \;\; {-1})^T$ for $\lambda = 5$ and $(1 \;\; 1/3)^T$ for $\lambda = 1$.

Note that we can scale the eigenvectors any way we want.

Determinants are not used for finding the eigenvalues of large matrices.
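
As a quick numerical check of this example, the following Python sketch solves the quadratic characteristic equation of a $2 \times 2$ matrix and tests the two eigenpairs. The matrix entries are an assumption, chosen to agree with the eigenvalues 1 and 5 and the eigenvectors quoted above. The helper names `eigenvalues_2x2` and `residual` are illustrative only.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from lambda^2 - trace*lambda + det = 0."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues; this sketch handles real ones only")
    root = math.sqrt(disc)
    return (trace - root) / 2.0, (trace + root) / 2.0

def residual(A, lam, x):
    """Components of A x - lam x; all (nearly) zero when (lam, x) is an eigenpair."""
    return [sum(A[i][j] * x[j] for j in range(2)) - lam * x[i] for i in range(2)]

# Assumed example matrix, consistent with the eigenpairs quoted in the slides.
A = [[2.0, -3.0], [-1.0, 4.0]]
lam_small, lam_large = eigenvalues_2x2(A[0][0], A[0][1], A[1][0], A[1][1])
print(lam_small, lam_large)
print(residual(A, lam_small, [1.0, 1.0 / 3.0]))
print(residual(A, lam_large, [1.0, -1.0]))
```

The closed-form quadratic only works for $2 \times 2$ matrices, which is one reason larger problems need the iterative methods that follow.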

Positive Definite Matrices

A complex matrix $A$ is positive definite if for every nonzero complex vector $x$ the quadratic form $x^H A x$ is real and

$$x^H A x > 0,$$

where $x^H$ denotes the conjugate transpose of $x$ (i.e., change the sign of the imaginary part of each component of $x$ and then transpose).

Eigenvalues of Positive Definite Matrices

If $A$ is positive definite and $\lambda$ and $x$ are an eigenvalue/eigenvector pair, then

$$Ax = \lambda x \;\Rightarrow\; x^H A x = \lambda\, x^H x.$$

Since $x^H A x$ and $x^H x$ are both real and positive, it follows that $\lambda$ is real and positive.

Properties of Positive Definite Matrices

If $A$ is a positive definite matrix then:
- $A$ is nonsingular.
- The inverse of $A$ is positive definite.
- Gaussian elimination can be performed on $A$ without pivoting.
- The eigenvalues of $A$ are positive.
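
For a real symmetric matrix, one way to probe these properties is to run Gaussian elimination without pivoting and look at the pivots: the matrix is positive definite exactly when every pivot is positive. The Python sketch below is a minimal illustration under that assumption (real symmetric input). The test matrix and the function name `pivots_without_pivoting` are illustrative choices, not taken from the slides.

```python
def pivots_without_pivoting(A):
    """Pivots of Gaussian elimination with no row exchanges.

    Raises ZeroDivisionError if a zero pivot is met, which cannot happen
    for a positive definite matrix.
    """
    n = len(A)
    U = [row[:] for row in A]          # work on a copy
    pivots = []
    for k in range(n):
        p = U[k][k]
        if p == 0.0:
            raise ZeroDivisionError("zero pivot: elimination needs pivoting")
        pivots.append(p)
        for i in range(k + 1, n):
            factor = U[i][k] / p
            for j in range(k, n):
                U[i][j] -= factor * U[k][j]
    return pivots

# Illustrative real symmetric matrix (an assumption for this sketch).
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
pivots = pivots_without_pivoting(A)
print(pivots, all(p > 0 for p in pivots))
```

The product of the pivots equals the determinant, so positive pivots also confirm that the matrix is nonsingular.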

Hermitian Matrices

A square matrix for which $A = A^H$ is said to be a Hermitian matrix. If $A$ is real and Hermitian it is said to be symmetric, and $A = A^T$.

- Every positive definite matrix is Hermitian. The converse is false: a Hermitian matrix can have zero or negative eigenvalues.
- Every eigenvalue of a Hermitian matrix is real.
- Eigenvectors of a Hermitian matrix that belong to different eigenvalues are orthogonal to each other, i.e., their scalar product is zero.

Eigen Decomposition

Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of the $n \times n$ matrix $A$ and $x_1, x_2, \dots, x_n$ the corresponding eigenvectors.

Let $\Lambda$ be the diagonal matrix with $\lambda_1, \lambda_2, \dots, \lambda_n$ on the main diagonal.

Let $X$ be the $n \times n$ matrix whose $j$th column is $x_j$.

Then $AX = X\Lambda$, and so we have the eigen decomposition of $A$:

$$A = X \Lambda X^{-1}.$$

This requires $X$ to be invertible, thus the eigenvectors of $A$ must be linearly independent.

Powers of Matrices

If $A = X \Lambda X^{-1}$ then

$$A^2 = (X \Lambda X^{-1})(X \Lambda X^{-1}) = X \Lambda (X^{-1} X) \Lambda X^{-1} = X \Lambda^2 X^{-1}.$$

Hence we have

$$A^p = X \Lambda^p X^{-1}.$$

Thus $A^p$ has the same eigenvectors as $A$, and its eigenvalues are $\lambda_1^p, \lambda_2^p, \dots, \lambda_n^p$.

We can use these results as the basis of an iterative algorithm for finding the eigenvalues of a matrix.
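
The statement that $A^p$ keeps the eigenvectors of $A$ and raises each eigenvalue to the $p$th power can be checked directly: applying $A$ to an eigenvector $p$ times should give $\lambda^p x$. The Python sketch below does this for an illustrative $2 \times 2$ matrix with eigenpair $\lambda = 5$, $x = (1, -1)^T$. The matrix, the vector and the helper `matvec` are assumptions made for this example.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[2.0, -3.0], [-1.0, 4.0]]   # illustrative matrix (assumption)
x = [1.0, -1.0]                  # eigenvector of A with eigenvalue 5
lam = 5.0
p = 3

y = x
for _ in range(p):
    y = matvec(A, y)             # after the loop, y = A^p x

expected = [lam ** p * xi for xi in x]   # lambda^p x
print(y, expected)
```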
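
The Power Method

Label the eigenvalues in order of decreasing absolute value, and assume there is a single dominant eigenvalue, so $|\lambda_1| > |\lambda_2| \ge \dots \ge |\lambda_n|$.

Consider the iteration formula

$$y_{k+1} = A y_k,$$

where we start with some initial $y_0$, so that

$$y_k = A^k y_0.$$

Then $y_k$ converges in direction to the eigenvector $x_1$ corresponding to the eigenvalue $\lambda_1$, provided $y_0$ has a nonzero component along $x_1$.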
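
Proof

We know that $A^k = X \Lambda^k X^{-1}$, so

$$y_k = A^k y_0 = X \Lambda^k X^{-1} y_0.$$

Now we have

$$\Lambda^k = \begin{pmatrix} \lambda_1^k & & & \\ & \lambda_2^k & & \\ & & \ddots & \\ & & & \lambda_n^k \end{pmatrix} = \lambda_1^k \begin{pmatrix} 1 & & & \\ & (\lambda_2/\lambda_1)^k & & \\ & & \ddots & \\ & & & (\lambda_n/\lambda_1)^k \end{pmatrix}.$$

The diagonal terms other than the first get smaller in absolute value as $k$ increases, since $\lambda_1$ is the dominant eigenvalue.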
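
Proof (continued)

Writing $c = X^{-1} y_0$ with components $c_1, \dots, c_n$, we have

$$y_k = X \Lambda^k X^{-1} y_0 = \lambda_1^k \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} 1 & & & \\ & (\lambda_2/\lambda_1)^k & & \\ & & \ddots & \\ & & & (\lambda_n/\lambda_1)^k \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} \approx \lambda_1^k c_1 x_1 \quad \text{for large } k.$$

Since $\lambda_1^k c_1 x_1$ is just a constant times $x_1$, we have the required result (assuming $c_1 \neq 0$).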
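
Example

Let A = [2 -12; 1 -5] and y_0 = [1 1]'. Each iterate is written as its last component times a vector whose last component is 1 (entries rounded to four decimals):

    y_1 =  -4 [2.5000 1]'
    y_2 =  10 [2.8000 1]'
    y_3 = -22 [2.9091 1]'
    y_4 =  46 [2.9565 1]'
    y_5 = -94 [2.9787 1]'
    y_6 = 190 [2.9895 1]'

The iteration is converging on a scalar multiple of [3 1]', which is the correct dominant eigenvector.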
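
The unscaled iteration in this example can be reproduced with a few lines of Python. This is a minimal sketch that reads the example matrix as A = [2 -12; 1 -5] with starting vector [1 1]'. The helper `matvec` and the variable names are illustrative.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[2.0, -12.0], [1.0, -5.0]]
y = [1.0, 1.0]                    # starting vector y_0

history = []
for k in range(1, 7):
    y = matvec(A, y)              # y_k = A y_{k-1}
    history.append(list(y))
    # The ratio of the components shows the direction settling towards [3 1]'.
    print(k, y, y[0] / y[1])
```

The entries of the iterates roughly double in size at every step, which is the growth that the scaled version of the iteration is designed to avoid.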

Rayleigh Quotient

Note that once we have the eigenvector $x$, the corresponding eigenvalue can be obtained from the Rayleigh quotient

    dot(Ax,x)/dot(x,x)

where dot(a,b) is the scalar product of vectors a and b, defined by

$$\mathrm{dot}(a,b) = a_1 b_1 + a_2 b_2 + \dots + a_n b_n.$$

So for our example, $\lambda_1 = -2$.
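
A minimal Python sketch of the Rayleigh quotient for real vectors follows, applied to the power method example with A = [2 -12; 1 -5] and the eigenvector [3 1]'. The function names are illustrative, and the real-valued dot product is an assumption (complex vectors would need a conjugate).

```python
def dot(a, b):
    """Scalar product a_1 b_1 + ... + a_n b_n of two real vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]

def rayleigh_quotient(A, x):
    """Eigenvalue estimate dot(Ax, x) / dot(x, x) for a nonzero vector x."""
    return dot(matvec(A, x), x) / dot(x, x)

A = [[2.0, -12.0], [1.0, -5.0]]
x = [3.0, 1.0]                      # dominant eigenvector from the example
print(rayleigh_quotient(A, x))      # -2.0
```

Scaling x by any nonzero constant leaves the quotient unchanged, so an unnormalised eigenvector estimate can be used directly.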

Scaling

The factor $\lambda_1^k$ can cause problems, as it may become very large (or, if $|\lambda_1| < 1$, very small) as the iteration progresses.

To avoid this problem we scale the iteration formula:

$$y_{k+1} = \frac{A y_k}{r_{k+1}},$$

where $r_{k+1}$ is the component of $A y_k$ with largest absolute value.
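
Example with Scaling

Let A = [2 -12; 1 -5] and y_0 = [1 1]' (entries rounded to four decimals).

    Ay_0 = [-10 -4]'         so r_1 = -10     and y_1 = [1 0.4000]'
    Ay_1 = [-2.8 -1]'        so r_2 = -2.8    and y_2 = [1 0.3571]'
    Ay_2 = [-2.2857 -0.7857]' so r_3 = -2.2857 and y_3 = [1 0.3438]'
    Ay_3 = [-2.1250 -0.7188]' so r_4 = -2.1250 and y_4 = [1 0.3382]'
    Ay_4 = [-2.0588 -0.6912]' so r_5 = -2.0588 and y_5 = [1 0.3357]'
    Ay_5 = [-2.0286 -0.6786]' so r_6 = -2.0286 and y_6 = [1 0.3345]'

r is converging to the correct eigenvalue, -2, and y to the eigenvector [1 1/3]'.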

Scaling Factor

At step $k+1$, the scaling factor $r_{k+1}$ is the component with largest absolute value of $A y_k$.

When $k$ is sufficiently large, $A y_k \approx \lambda_1 y_k$.

The component with largest absolute value in $\lambda_1 y_k$ is $\lambda_1$ (since $y_k$ was scaled in the previous step to have largest component 1).

Hence $r_{k+1} \to \lambda_1$ as $k \to \infty$.
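
MATLAB Code

    function [lambda, y] = powermethod(A, y, n)
    % Scaled power method: n iterations starting from the vector y.
    for i = 1:n
        y = A*y;
        [c, j] = max(abs(y));   % j indexes the component of largest magnitude
        lambda = y(j);          % scaling factor, which tends to lambda_1
        y = y/lambda;
    end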
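
For readers without MATLAB, the same scaled iteration can be sketched in plain Python. This is a hedged translation of the routine above, not a replacement for it: the name `power_method`, the list-of-rows matrix layout and the fixed iteration count are assumptions of the sketch, and it does no checking for a zero scaling factor.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def power_method(A, y, n_iter):
    """Scaled power method: returns (eigenvalue estimate, scaled vector)."""
    lam = None
    for _ in range(n_iter):
        y = matvec(A, y)
        # Index of the component with largest absolute value.
        j = max(range(len(y)), key=lambda i: abs(y[i]))
        lam = y[j]                    # scaling factor r, keeps its sign
        y = [v / lam for v in y]      # largest component becomes 1
    return lam, y

A = [[2.0, -12.0], [1.0, -5.0]]
lam6, y6 = power_method(A, [1.0, 1.0], 6)     # six steps, as in the example
lam60, y60 = power_method(A, [1.0, 1.0], 60)  # many more steps
print(lam6, y6)
print(lam60, y60)
```

After six steps the estimate is still about 1.4% away from -2. Running longer drives it towards -2 and the vector towards [1 1/3]'.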

Convergence

The Power Method relies on us being able to ignore terms of the form $(\lambda_j/\lambda_1)^k$ when $k$ is large enough.

Thus, the convergence of the Power Method depends on $|\lambda_2/\lambda_1|$.

If $|\lambda_2/\lambda_1| = 1$ the method will not converge.

If $|\lambda_2/\lambda_1|$ is close to 1 the method will converge slowly.
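
Because the neglected terms shrink like $|\lambda_2/\lambda_1|^k$, a rough iteration count for a target error follows from solving $|\lambda_2/\lambda_1|^k \le \text{tol}$ for $k$. The Python sketch below does this. It ignores the constant factors in the error, so it is an order-of-magnitude guide only, and the function name `iterations_needed` is illustrative.

```python
import math

def iterations_needed(ratio, tol):
    """Smallest k with ratio**k <= tol, where ratio = |lambda_2 / lambda_1|."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("need 0 < ratio < 1 for the power method to converge")
    if not 0.0 < tol < 1.0:
        raise ValueError("tol must lie strictly between 0 and 1")
    return math.ceil(math.log(tol) / math.log(ratio))

# Ratio 1/2, as in the example with eigenvalues -2 and -1.
fast = iterations_needed(0.5, 1e-6)
# A ratio close to 1 needs far more iterations for the same tolerance.
slow = iterations_needed(0.99, 1e-6)
print(fast, slow)
```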

Orthonormal Vectors

A set $S$ of nonzero vectors is orthonormal if, for every pair of distinct vectors $x$ and $y$ in $S$, we have dot(x,y) = 0 (orthogonality), and for every $x$ in $S$ we have $\|x\|_2 = 1$ (length is 1).
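
The definition translates directly into a numerical test. The Python sketch below checks a list of real vectors for unit length and pairwise orthogonality up to a tolerance. The tolerance value and the name `is_orthonormal` are assumptions of this example.

```python
import math

def dot(x, y):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(x, y))

def is_orthonormal(vectors, tol=1e-12):
    """True if every vector has unit length and distinct vectors are orthogonal."""
    for i, x in enumerate(vectors):
        if abs(dot(x, x) - 1.0) > tol:        # length must be 1
            return False
        for y in vectors[i + 1:]:
            if abs(dot(x, y)) > tol:          # distinct vectors must be orthogonal
                return False
    return True

s = 1.0 / math.sqrt(2.0)
good = [[s, s], [s, -s]]          # orthonormal pair
bad = [[1.0, 0.0], [1.0, 1.0]]    # neither orthogonal nor unit length
print(is_orthonormal(good), is_orthonormal(bad))
```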

The QR Algorithm

The QR algorithm for finding eigenvalues is based on the QR factorisation, which represents a matrix $A$ as

$$A = QR,$$

where $Q$ is a matrix whose columns are orthonormal, and $R$ is an upper triangular matrix.

Note that $Q^H Q = I$ and $Q^{-1} = Q^H$. $Q$ is termed a unitary matrix.

QR Algorithm without Shifts

    A_0 = A
    for k = 0, 1, 2, ...
        Q_k R_k = A_k        (QR factorisation of A_k)
        A_{k+1} = R_k Q_k
    end

Since $R_k = Q_k^{-1} A_k$, we have

$$A_{k+1} = R_k Q_k = Q_k^{-1} A_k Q_k,$$

so $A_k$ and $A_{k+1}$ are similar and have the same eigenvalues.

When the eigenvalues have distinct absolute values, $A_{k+1}$ tends to an upper triangular matrix with the same eigenvalues as $A$. These eigenvalues lie along the main diagonal of $A_{k+1}$.
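
The unshifted iteration can be sketched in plain Python as follows. The QR factorisation here uses the modified Gram-Schmidt process, which is a stand-in chosen for brevity. It assumes a real, nonsingular matrix and is not how library routines such as MATLAB's `qr` are implemented. The function names and the test matrix (the power method example, with eigenvalues -2 and -1) are illustrative.

```python
import math

def qr_gram_schmidt(A):
    """QR factorisation of a real nonsingular matrix by modified Gram-Schmidt."""
    n = len(A)
    Q = [[0.0] * n for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(n)]            # column j of A
        for k in range(j):
            R[k][j] = sum(Q[i][k] * v[i] for i in range(n))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        for i in range(n):
            Q[i][j] = v[i] / R[j][j]
    return Q, R

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def qr_algorithm(A, iterations):
    """Repeat A_{k+1} = R_k Q_k, where Q_k R_k = A_k."""
    Ak = [row[:] for row in A]
    for _ in range(iterations):
        Q, R = qr_gram_schmidt(Ak)
        Ak = matmul(R, Q)
    return Ak

A = [[2.0, -12.0], [1.0, -5.0]]
Ak = qr_algorithm(A, 60)
diagonal = sorted(Ak[i][i] for i in range(2))
print(Ak)
print(diagonal)
```

The entry below the diagonal shrinks roughly like $|\lambda_2/\lambda_1|^k$, so without shifts this iteration converges at about the same rate as the power method.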

QR Algorithm with Shift

    A_0 = A
    for k = 0, 1, 2, ...
        s = A_k(n,n)
        Q_k R_k = A_k - sI
        A_{k+1} = R_k Q_k + sI
    end

Since

$$A_{k+1} = R_k Q_k + sI = Q_k^{-1}(A_k - sI)Q_k + sI = Q_k^{-1} A_k Q_k,$$

once again $A_k$ and $A_{k+1}$ are similar and so have the same eigenvalues.

The shift operation subtracts $s$ from each eigenvalue of $A_k$ before the factorisation, and speeds up convergence.

MATLAB Code for QR Algorithm

Let A be an n x n matrix.

    n = size(A,1);
    I = eye(n,n);
    s = A(n,n); [Q,R] = qr(A - s*I); A = R*Q + s*I

Use the up arrow key in MATLAB to repeat the last line, or put a loop around it.

Deflation

The eigenvalue at A(n,n) will converge first.

Then we set s = A(n-1,n-1) and continue the iteration until the eigenvalue at A(n-1,n-1) converges.

Then set s = A(n-2,n-2) and continue the iteration until the eigenvalue at A(n-2,n-2) converges, and so on.

This process is called deflation.
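
The shifted iteration and deflation can be combined into one routine, sketched here in plain Python. Several choices are assumptions of this sketch and not taken from the slides: the QR factorisation uses Givens rotations (which stay well behaved when the shifted matrix is nearly singular), deflation is done by discarding the last row and column once the off-diagonal entries of the last row fall below a tolerance, and the matrix is assumed real with real eigenvalues. The function names, tolerance, iteration cap and symmetric test matrix are illustrative.

```python
import math

def qr_givens(A):
    """QR factorisation of a real square matrix using Givens rotations."""
    n = len(A)
    R = [row[:] for row in A]
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        for i in range(j + 1, n):
            r = math.hypot(R[j][j], R[i][j])
            if r == 0.0:
                continue                      # entry is already zero
            c, s = R[j][j] / r, R[i][j] / r
            for k in range(n):
                # Rotate rows j and i of R so that R[i][j] becomes zero.
                rj, ri = R[j][k], R[i][k]
                R[j][k], R[i][k] = c * rj + s * ri, -s * rj + c * ri
                # Apply the transposed rotation to columns j and i of Q.
                qj, qi = Q[k][j], Q[k][i]
                Q[k][j], Q[k][i] = c * qj + s * qi, -s * qj + c * qi
    return Q, R

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted_qr_eigenvalues(A, tol=1e-12, max_iter=500):
    """Eigenvalues by shifted QR with deflation (real eigenvalues assumed)."""
    Ak = [row[:] for row in A]
    eigenvalues = []
    for m in range(len(A), 1, -1):            # Ak is m x m at this point
        for _ in range(max_iter):
            if max(abs(Ak[m - 1][j]) for j in range(m - 1)) < tol:
                break                         # Ak(m,m) has converged
            s = Ak[m - 1][m - 1]              # shift s = Ak(m,m)
            shifted = [[Ak[i][j] - (s if i == j else 0.0) for j in range(m)]
                       for i in range(m)]
            Q, R = qr_givens(shifted)
            Ak = matmul(R, Q)
            for i in range(m):
                Ak[i][i] += s                 # A_{k+1} = R Q + s I
        else:
            raise RuntimeError("shifted QR iteration did not converge")
        eigenvalues.append(Ak[m - 1][m - 1])
        Ak = [row[:m - 1] for row in Ak[:m - 1]]   # deflate
    eigenvalues.append(Ak[0][0])
    return eigenvalues

# Illustrative symmetric matrix with eigenvalues 3 - sqrt(3), 3 and 3 + sqrt(3).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
eigs = sorted(shifted_qr_eigenvalues(A))
print(eigs)
```

Working on the smaller deflated block is one common way to realise the step described above. Keeping the full matrix and only moving the shift up the diagonal, as the slides describe, is an alternative.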
