Numerical Methods I: Eigenvalue Problems




Numerical Methods I: Eigenvalue Problems

Aleksandar Donev, Courant Institute, NYU (donev@courant.nyu.edu)
Course G63.2010.001 / G22.2420-001, Fall 2010
September 30th, 2010

A. Donev (Courant Institute), Lecture IV, 9/30/2010

Outline

1. Review of Linear Algebra: Eigenvalues
2. Conditioning of Eigenvalue Problems
3. Computing Eigenvalues and Eigenvectors
4. Methods based on QR factorizations
5. Conclusions

Review of Linear Algebra: Eigenvalues

Eigenvalue Decomposition

For a square matrix A ∈ C^(n×n), there exists at least one λ and a nonzero vector x such that

    Ax = λx  ⟺  (A - λI) x = 0.

Putting the eigenvectors x_j as columns in a matrix X, and the eigenvalues λ_j on the diagonal of a diagonal matrix Λ, we get

    AX = XΛ.

A matrix is non-defective or diagonalizable if there exist n linearly independent eigenvectors, i.e., if the matrix X is invertible:

    X^{-1} A X = Λ  ⟺  A = X Λ X^{-1}.

The transformation from A to Λ = X^{-1} A X is called a similarity transformation, and it preserves the eigenvalues.
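
The decomposition A = X Λ X^{-1} is easy to verify numerically. A minimal numpy sketch (the lecture itself uses MATLAB; the matrix below is an arbitrary illustrative choice):

```python
import numpy as np

# A small diagonalizable example matrix; any matrix with
# distinct eigenvalues would do. Chosen purely for illustration.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

lam, X = np.linalg.eig(A)      # columns of X are the eigenvectors x_j
Lam = np.diag(lam)             # eigenvalues on the diagonal of Lambda

# AX = X Lambda, and since X is invertible, A = X Lambda X^{-1}
assert np.allclose(A @ X, X @ Lam)
A_rebuilt = X @ Lam @ np.linalg.inv(X)
```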

Unitarily Diagonalizable Matrices

A matrix is unitarily diagonalizable if there exist n linearly independent orthogonal eigenvectors, i.e., if the matrix X can be chosen to be unitary (orthogonal), X ≡ U, where U^{-1} = U^*:

    A = U Λ U^*.

Note that unitary matrices generalize orthogonal matrices to the complex domain, so we use adjoints (conjugate transposes) instead of transposes throughout.

Theorem: A matrix is unitarily diagonalizable iff it is normal, i.e., it commutes with its adjoint: A^* A = A A^*.

Theorem: Hermitian (symmetric) matrices, A = A^*, are unitarily diagonalizable and have real eigenvalues.
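
Both theorems can be checked numerically. A numpy sketch (illustrative matrix, not from the slides): a real symmetric matrix is normal, and `eigh` returns real eigenvalues together with an orthogonal (unitary) eigenvector matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # real symmetric, hence Hermitian and normal
assert np.allclose(A @ A.T, A.T @ A)  # normality: A A* = A* A

lam, U = np.linalg.eigh(A)            # eigh: real eigenvalues, orthonormal eigenvectors
assert np.allclose(U.T @ U, np.eye(2))  # U is unitary (here: orthogonal)
A_rebuilt = U @ np.diag(lam) @ U.T      # A = U Lambda U*
```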

Left Eigenvectors

The usual eigenvectors are more precisely called right eigenvectors. There is also a left eigenvector y corresponding to a given eigenvalue λ:

    y^* A = λ y^*  ⟺  A^* y = λ̄ y.

Collecting the left eigenvectors as the rows of a matrix Y gives Y A = Λ Y.

For a matrix that is diagonalizable, observe that Y = X^{-1}, and so the left eigenvectors provide no new information. For unitarily diagonalizable matrices, Y = (X^{-1})^* = U, so the left and right eigenvectors coincide.

Non-diagonalizable Matrices

For matrices that are not diagonalizable, one can use Jordan form factorizations, or, more relevant to numerical mathematics, the Schur factorization (decomposition):

    A = U T U^*,

where T is upper-triangular. The eigenvalues are on the diagonal of T.

Note: Observe that A^* = (U T U^*)^* = U T^* U^*, so for Hermitian matrices T = T^* is real diagonal.

An important property / use of eigenvalues:

    A^n = (U T U^*)(U T U^*) ⋯ (U T U^*) = U T (U^* U) T (U^* U) ⋯ T U^*,

so that

    A^n = U T^n U^*.
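
numpy itself has no Schur routine (MATLAB's `schur`, used later in these slides, or scipy would provide one), but in the diagonalizable case the same telescoping argument gives A^n = X Λ^n X^{-1}, which is cheap to check (illustrative matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])   # illustrative diagonalizable matrix
lam, X = np.linalg.eig(A)

n = 5
# Telescoping product: A^n = X Lambda^n X^{-1}
A_pow = X @ np.diag(lam ** n) @ np.linalg.inv(X)
assert np.allclose(A_pow, np.linalg.matrix_power(A, n))
```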

Conditioning of Eigenvalue Problems

Sensitivity of Eigenvalues

Now consider a perturbation δA of a diagonalizable matrix A and see how perturbed the similar (diagonal) matrix becomes:

    X^{-1} (A + δA) X = Λ + δΛ  ⟹  δΛ = X^{-1} (δA) X,

so that

    ‖δΛ‖ ≤ ‖X^{-1}‖ ‖δA‖ ‖X‖ = κ(X) ‖δA‖.

Conclusion: The conditioning of the eigenvalue problem is related to the conditioning of the matrix of eigenvectors. If X is unitary then ‖X‖₂ = 1 (from now on we exclusively work with the 2-norm): unitarily diagonalizable matrices are always perfectly conditioned!

Warning: The absolute error in all eigenvalues is of the same order, meaning that the relative error will be very large for the smallest eigenvalues.
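
The bound ‖δΛ‖ ≤ κ(X) ‖δA‖ is essentially the Bauer-Fike theorem. A small numpy experiment (illustrative matrix and perturbation size, not from the slides) confirms it:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lam, X = np.linalg.eig(A)

dA = 1e-8 * rng.standard_normal((2, 2))   # small random perturbation
lam_pert = np.linalg.eig(A + dA)[0]

# sort both spectra so corresponding eigenvalues line up
dlam = np.sort(lam_pert) - np.sort(lam)
kappa_X = np.linalg.cond(X)               # condition number of the eigenvector matrix
# Bauer-Fike: eigenvalue perturbations are bounded by kappa(X) * ||dA||
assert np.max(np.abs(dlam)) <= kappa_X * np.linalg.norm(dA, 2) * 1.1
```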

Sensitivity of Individual Eigenvalues

To first order, the perturbation in a single eigenvalue with right eigenvector x and left eigenvector y is

    δλ = (y^* (δA) x) / (y^* x).

Recalling the Cauchy-Schwarz inequality, |y^* x| = ‖x‖ ‖y‖ |cos θ_xy|, so

    |δλ| ≤ (‖x‖ ‖δA‖ ‖y‖) / (‖x‖ ‖y‖ |cos θ_xy|) = ‖δA‖ / |cos θ_xy|.

Defining a conditioning number for a given eigenvalue,

    κ(λ, A) = sup_{δA} |δλ| / ‖δA‖ = 1 / |cos θ_xy|.

For unitarily diagonalizable matrices y = x and thus κ(λ, A) = 1: perfectly conditioned!
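
The quantity 1/|cos θ_xy| = ‖x‖ ‖y‖ / |y^* x| can be computed directly. A numpy sketch (the non-normal matrix is an illustrative choice; for real matrices, left eigenvectors of A are right eigenvectors of A^T):

```python
import numpy as np

# A non-normal matrix: its right and left eigenvectors differ,
# so individual eigenvalues can be badly conditioned.
A = np.array([[1.0, 100.0],
              [0.0,   2.0]])

lam, X = np.linalg.eig(A)      # right eigenvectors (columns)
lamL, Y = np.linalg.eig(A.T)   # left eigenvectors = right eigenvectors of A^T

# match the left eigenvector to the same eigenvalue as x = X[:, 0]
j = np.argmin(np.abs(lamL - lam[0]))
x, y = X[:, 0], Y[:, j]

# kappa(lambda, A) = ||x|| ||y|| / |y^* x| = 1 / |cos(theta_xy)|
kappa = np.linalg.norm(x) * np.linalg.norm(y) / abs(y @ x)
assert kappa > 10.0   # far from the perfectly conditioned value 1
```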

Sensitivity of Eigenvectors

A priori estimate: The conditioning number for the eigenvector itself depends on the separation between the eigenvalues:

    κ(x, A) = ( min_{λ_j ≠ λ} |λ - λ_j| )^{-1}.

This indicates that multiple (clustered) eigenvalues require care: even for Hermitian matrices, eigenvectors are hard to compute.

If there is a defective (non-diagonalizable) matrix with an eigenvalue for which the difference between the algebraic and geometric multiplicities is d > 0, then |δλ| ∼ ‖δA‖^{1/(1+d)}, which means the conditioning number is infinite: defective eigenvalues are very badly conditioned.

Computing Eigenvalues and Eigenvectors

The Need for Iterative Algorithms

The eigenvalues are roots of the characteristic polynomial of A, which is generally of degree n. According to Abel's theorem, there is no closed-form solution in radicals for n ≥ 5. All eigenvalue algorithms must be iterative! This is a fundamental difference from, for example, linear solvers.

There is an important distinction between iterative methods that:
- Compute all eigenvalues (similarity transformations).
- Compute only one or a few eigenvalues, typically the smallest or the largest one (power-like methods).

Bounds on eigenvalues are important, e.g., the Courant-Fischer theorem for the Rayleigh quotient of a Hermitian matrix:

    λ_min ≤ r_A(x) = (x^* A x) / (x^* x) ≤ λ_max.
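
The Rayleigh-quotient bounds are easy to test empirically. A numpy sketch (illustrative symmetric matrix and random test vectors):

```python
import numpy as np

# For a Hermitian (here: real symmetric) matrix, the Rayleigh quotient
# r_A(x) = (x* A x) / (x* x) always lies between the extreme eigenvalues.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam = np.linalg.eigvalsh(A)   # eigenvalues, sorted ascending

rng = np.random.default_rng(1)
for _ in range(100):
    x = rng.standard_normal(2)
    r = (x @ A @ x) / (x @ x)
    assert lam[0] - 1e-12 <= r <= lam[-1] + 1e-12
```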

The Power Method

Recall that for a diagonalizable matrix, A^n = X Λ^n X^{-1}. Assume

    |λ_1| > |λ_2| ≥ |λ_3| ≥ ⋯ ≥ |λ_n|,

and that the columns of X are normalized, ‖x_j‖ = 1.

Any initial guess vector q_0 can be represented in the linear basis formed by the eigenvectors:

    q_0 = X a.

Recall iterative methods for linear systems: multiply a vector with the matrix A many times:

    q_{k+1} = A q_k  ⟹  q_n = A^n q_0 = (X Λ^n X^{-1}) X a = X (Λ^n a).

Power Method (continued)

As n → ∞, the eigenvalue of largest modulus, λ_1, will dominate:

    Λ^n = λ_1^n Diag{1, (λ_2/λ_1)^n, …, (λ_n/λ_1)^n} → λ_1^n Diag{1, 0, …, 0},

    q_n = X (Λ^n a) → λ_1^n X (a_1, 0, …, 0)^T = λ_1^n a_1 x_1.

Therefore the normalized iterates converge to the dominant eigenvector (up to sign/phase):

    q_n / ‖q_n‖ → x_1.

The Rayleigh quotient converges to the eigenvalue:

    r_A(q_n) = (q_n^* A q_n) / (q_n^* q_n) → λ_1.

Power Method Implementation

Start with an initial guess q_0, and then iterate:

1. Compute the matrix-vector product and normalize it:

       q_k = A q_{k-1} / ‖A q_{k-1}‖.

2. Use the Rayleigh quotient to obtain an eigenvalue estimate:

       λ̂_k = q_k^* A q_k.

3. Test for convergence: evaluate the residual r_k = A q_k - λ̂_k q_k, and terminate if the error estimate is small enough:

       |λ_1 - λ̂_k| ≈ ‖r_k‖ / |cos θ_xy| < ε.
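
The three steps above can be sketched in a few lines of numpy (illustrative real symmetric matrix and starting vector; for complex matrices the Rayleigh quotient would use the conjugate transpose):

```python
import numpy as np

def power_method(A, q0, tol=1e-10, maxit=1000):
    """Power iteration: returns a Rayleigh-quotient eigenvalue estimate
    and a normalized eigenvector estimate."""
    q = q0 / np.linalg.norm(q0)
    for _ in range(maxit):
        z = A @ q                     # step 1: matrix-vector product
        q = z / np.linalg.norm(z)     #         ... and normalization
        lam = q @ A @ q               # step 2: Rayleigh quotient (real symmetric case)
        if np.linalg.norm(A @ q - lam * q) < tol:  # step 3: residual test
            break
    return lam, q

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])            # eigenvalues 4 and 2
lam, q = power_method(A, np.array([1.0, 0.3]))
assert np.isclose(lam, 4.0)
assert np.allclose(np.abs(q), np.abs(np.array([1.0, 1.0]) / np.sqrt(2)))
```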

Convergence Estimates

The normalized iterates converge to the eigenvector linearly:

    ‖q_k - (±x_1)‖ = O(|λ_2/λ_1|^k).

Typically (e.g., in the Hermitian case) the eigenvalue estimate converges quadratically:

    |λ̂_k - λ_1| = O(|λ_2/λ_1|^{2k}).

The power method is fast when the dominant eigenvalue is well-separated from the rest (even if it is degenerate). This conclusion is rather general for all iterative methods: convergence is good for well-separated eigenvalues, bad otherwise.

The power method is typically too slow to be used in practice, and there are more sophisticated alternatives (Lanczos/Arnoldi iteration).

Inverse Power Iteration

Observe that applying the power method to A^{-1} will find the largest of the λ_j^{-1}, i.e., the smallest eigenvalue (by modulus).

If we have an eigenvalue estimate μ ≈ λ, then doing the power method for the matrix (A - μI)^{-1} will give the eigenvalue closest to μ. Convergence will be faster if μ is much closer to λ than to the other eigenvalues.

Recall that in practice (A - μI)^{-1} q is computed by solving a linear system, not by matrix inversion (one can reuse an LU factorization)!

Finally, if we have an estimate of both the eigenvalue and the eigenvector, we can use Rayleigh Quotient Iteration (see homework).
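
A minimal numpy sketch of shifted inverse iteration (illustrative matrix and shift; as noted above, production code would factorize A - μI once and reuse the factorization, whereas this sketch simply calls a dense solve each step):

```python
import numpy as np

def shifted_inverse_iteration(A, mu, q0, maxit=50):
    """Power method applied to (A - mu I)^{-1}: converges to the
    eigenvalue of A closest to the shift mu."""
    n = A.shape[0]
    M = A - mu * np.eye(n)
    q = q0 / np.linalg.norm(q0)
    for _ in range(maxit):
        z = np.linalg.solve(M, q)   # apply (A - mu I)^{-1} via a linear solve
        q = z / np.linalg.norm(z)
    return q @ A @ q, q             # Rayleigh quotient, eigenvector estimate

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])          # eigenvalues 4 and 2
lam, q = shifted_inverse_iteration(A, mu=1.9, q0=np.array([1.0, 0.0]))
assert np.isclose(lam, 2.0)         # 2 is the eigenvalue closest to 1.9
```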

Methods based on QR factorizations

Estimating All Eigenvalues / Eigenvectors

Iterative methods akin to the power method are not suitable for estimating all eigenvalues. Basic idea: build a sequence of matrices A_k that all share eigenvalues with A via similarity transformations, A_{k+1} = P^{-1} A_k P, starting from A_1 = A.

A numerically stable and good way to do this is to use the QR factorization:

    A_k = Q_{k+1} R_{k+1},

    A_{k+1} = Q_{k+1}^{-1} A_k Q_{k+1} = (Q_{k+1}^{-1} Q_{k+1}) R_{k+1} Q_{k+1} = R_{k+1} Q_{k+1}.

Note that the fact that the Q's are orthogonal is crucial to keep the conditioning from getting worse.
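
The basic (unshifted) iteration "factor, then multiply in reverse order" is a few lines of numpy. A sketch on an illustrative symmetric matrix, where the iterates converge to a diagonal matrix of eigenvalues:

```python
import numpy as np

# Basic QR iteration: factor A_k = Q R, then set A_{k+1} = R Q.
# Each step is a similarity transformation (R Q = Q^{-1} A_k Q),
# so all iterates share the eigenvalues of A.
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])   # symmetric, eigenvalues 4 and 2
Ak = A.copy()
for _ in range(50):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q

# For this well-separated symmetric example, Ak becomes diagonal
# with the eigenvalues on the diagonal.
assert np.allclose(sorted(np.diag(Ak)), [2.0, 4.0], atol=1e-6)
assert abs(Ak[1, 0]) < 1e-6
```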

The Basic QR Method

The behavior of the QR iteration can be understood most transparently as follows [following Trefethen and Bau]:

Observation: The range of the matrix power A^k converges to the space spanned by the eigenvectors of A, with the eigenvectors corresponding to the largest eigenvalues dominating as k → ∞ (so this basis is ill-conditioned).

Recall: The columns of Q in A = QR form an orthonormal basis for the range of A.

Idea: Form a well-conditioned basis for the eigenspace of A by factorizing A^k = Q̂_k R̂_k and then calculating

    A_k = Q̂_k^{-1} A Q̂_k = Q̂_k^* A Q̂_k.

It is not too hard to show that this produces the same sequence of matrices A_k as the QR algorithm.

Why the QR Algorithm Works

Summary: The columns q_i of Q̂_k converge to the eigenvectors, and A_k = Q̂_k^* A Q̂_k.

We can recognize the above as a matrix of Rayleigh quotients, which for diagonalizable matrices satisfies

    (A_k)_{ij} = q_i^* A q_j → λ_i δ_ij = { λ_i if i = j; 0 if i ≠ j },

showing that (under suitable assumptions) A_k → Λ. It can also be shown that

    Q̂_k = Q_1 Q_2 ⋯ Q_k → X.

More on the QR Algorithm

The convergence of the basic QR algorithm is closely related to that of the power method: it is only fast if all eigenvalues are well-separated.

For more general (non-diagonalizable) matrices in complex arithmetic, the algorithm converges to the Schur decomposition A = U T U^*: A_k → T and Q̂_k → U.

It is possible to implement the algorithm entirely using real arithmetic (no complex numbers). There are several key improvements to the basic method that make it work in practice: Hessenberg form for faster QR factorization, and shifts and deflation for acceleration.

There are other sophisticated algorithms as well, such as the divide-and-conquer algorithm, and the best are implemented in the LAPACK library (used by MATLAB).

Conclusions

Eigenvalues in MATLAB

In MATLAB, sophisticated variants of the QR algorithm (from the LAPACK library) are implemented in the function eig:

    Lambda = eig(A)
    [X, Lambda] = eig(A)

For large or sparse matrices, iterative methods based on the Arnoldi iteration (ARPACK library) can be used to obtain a few of the largest/smallest/closest-to-μ eigenvalues:

    Lambda = eigs(A, n_eigs)
    [X, Lambda] = eigs(A, n_eigs)

The Schur decomposition is provided by [U, T] = schur(A).

Conclusions/Summary

- Eigenvalues are well-conditioned for unitarily diagonalizable matrices (which includes Hermitian matrices), but ill-conditioned for nearly non-diagonalizable matrices.
- Eigenvectors are well-conditioned only when the eigenvalues are well-separated.
- Eigenvalue algorithms are always iterative.
- The power method and its variants can be used to find the largest or smallest eigenvalue, and they converge fast if there is a large separation between the target eigenvalue and the nearby ones.
- Estimating all eigenvalues and/or eigenvectors can be done by combining the power method with QR factorizations.
- MATLAB has high-quality implementations of sophisticated variants of these algorithms.