Notes on Symmetric Matrices




CPSC 536N: Randomized Algorithms                                         2011-12 Term 2

Notes on Symmetric Matrices

Prof. Nick Harvey
University of British Columbia

1 Symmetric Matrices

We review some basic results concerning symmetric matrices. All matrices that we discuss are over the real numbers.

Let $A$ be a real, symmetric matrix of size $d \times d$ and let $I$ denote the $d \times d$ identity matrix. Perhaps the most important and useful property of symmetric matrices is that their eigenvalues behave very nicely.

Definition 1. Let $U$ be a $d \times d$ matrix. The matrix $U$ is called an orthogonal matrix if $U^T U = I$. This implies that $U U^T = I$, by uniqueness of inverses.

Fact 2 (Spectral Theorem). Let $A$ be any $d \times d$ symmetric matrix. There exists an orthogonal matrix $U$ and a (real) diagonal matrix $D$ such that $A = U D U^T$. This is called a spectral decomposition of $A$. Let $u_i$ be the $i$th column of $U$ and let $\lambda_i$ denote the $i$th diagonal entry of $D$. Then $\{u_1, \dots, u_d\}$ is an orthonormal basis consisting of eigenvectors of $A$, and $\lambda_i$ is the eigenvalue corresponding to $u_i$. We can also write

$$ A = \sum_{i=1}^{d} \lambda_i u_i u_i^T. \qquad (1) $$

The eigenvalues $\lambda_i$ are uniquely determined by $A$, up to reordering.

Caution. The product of two symmetric matrices is usually not symmetric.

1.1 Positive semi-definite matrices

Definition 3. Let $A$ be any $d \times d$ symmetric matrix. The matrix $A$ is called positive semi-definite if all of its eigenvalues are non-negative. This is denoted $A \succeq 0$, where here $0$ denotes the zero matrix. The matrix $A$ is called positive definite if all of its eigenvalues are strictly positive. This is denoted $A \succ 0$.

There are many equivalent characterizations of positive semi-definiteness. One useful condition is

$$ A \succeq 0 \iff x^T A x \ge 0 \quad \forall x \in \mathbb{R}^d. \qquad (2) $$

Similarly,

$$ A \succ 0 \iff x^T A x > 0 \quad \forall x \in \mathbb{R}^d,\ x \neq 0. $$

Another condition is: $A \succeq 0$ if and only if there exists a matrix $V$ such that $A = V V^T$.

The positive semi-definite condition can be used to define a partial ordering on all symmetric matrices. This is called the Löwner ordering or the positive semi-definite ordering. For any two symmetric matrices $A$ and $B$, we write $A \succeq B$ if $A - B \succeq 0$.
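As a quick numerical illustration of Fact 2 and equation (1), here is a minimal sketch assuming NumPy is available; the example matrix is chosen only for illustration and is not from the notes.

```python
import numpy as np

# A small symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# np.linalg.eigh is the routine specialized to symmetric (Hermitian) matrices:
# it returns real eigenvalues in ascending order and orthonormal eigenvectors
# as the columns of U.
eigvals, U = np.linalg.eigh(A)
D = np.diag(eigvals)

print(np.allclose(U.T @ U, np.eye(3)))   # U is orthogonal: U^T U = I
print(np.allclose(U @ D @ U.T, A))       # spectral decomposition: A = U D U^T

# Equation (1): A as a sum of rank-one terms lambda_i * u_i u_i^T.
A_sum = sum(lam * np.outer(U[:, i], U[:, i]) for i, lam in enumerate(eigvals))
print(np.allclose(A_sum, A))             # True
```

Unlike the general-purpose `np.linalg.eig`, the symmetric routine `eigh` guarantees real eigenvalues and an orthonormal set of eigenvectors, which is exactly the structure promised by the Spectral Theorem.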

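Definition 3 and the Löwner ordering can likewise be checked numerically. The sketch below assumes NumPy; the helper names `is_psd` and `loewner_geq` are ours, introduced only for this example.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """Definition 3: all eigenvalues non-negative (up to a numerical tolerance)."""
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))

def loewner_geq(A, B, tol=1e-10):
    """Loewner ordering: A >= B means A - B is positive semi-definite."""
    return is_psd(A - B, tol)

rng = np.random.default_rng(0)

# Any matrix of the form V V^T is positive semi-definite.
V = rng.standard_normal((4, 2))
A = V @ V.T
print(is_psd(A))                          # True

# (A + I) - A = I >= 0, so A + I dominates A in the Loewner ordering.
print(loewner_geq(A + np.eye(4), A))      # True
print(loewner_geq(A, A + np.eye(4)))      # False
```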
Fact 4. $A \succeq B$ if and only if $v^T A v \ge v^T B v$ for all vectors $v$.

Proof: Note that $v^T A v \ge v^T B v$ holds if and only if $v^T (A - B) v \ge 0$ holds. By (2), this is equivalent to $A - B \succeq 0$, which is the definition of $A \succeq B$.

The set of positive semi-definite matrices is closed under addition and non-negative scaling. (Such a set is called a convex cone.)

Fact 5. Let $A$ and $B$ be positive semi-definite matrices of size $d \times d$. Let $\alpha, \beta$ be non-negative scalars. Then $\alpha A + \beta B \succeq 0$.

Proof: This follows easily from (2).

Caution. The Löwner ordering does not have all of the nice properties that the usual ordering of real numbers has. For example, if $A \succeq B \succeq 0$ then it is not necessarily true that $A^2 \succeq B^2$.

Fact 6. Let $A$ and $B$ be symmetric $d \times d$ matrices and let $C$ be any $d \times d$ matrix. If $A \succeq B$ then $C A C^T \succeq C B C^T$. Moreover, if $C$ is non-singular then the "if" is actually "if and only if".

Proof: Since $A \succeq B$, we have $A - B \succeq 0$, so there exists a matrix $V$ such that $A - B = V V^T$. Then

$$ C (A - B) C^T = C V V^T C^T = (C V)(C V)^T, $$

which shows that $C (A - B) C^T$ is positive semi-definite. Conversely, suppose that $C$ is non-singular and $C A C^T \succeq C B C^T$. Multiply on the left by $C^{-1}$ and on the right by $(C^{-1})^T = (C^T)^{-1}$ to get $A \succeq B$.

1.2 Spectral Norm

Just as there are many norms of vectors, there are many norms of matrices too. We will only consider the spectral norm of $A$. (This is also called the $\ell_2$ operator norm and the Schatten $\infty$-norm.)

Definition 7. Let $A$ be a symmetric $d \times d$ matrix with eigenvalues $\lambda_1, \dots, \lambda_d$. The spectral norm of $A$ is denoted $\|A\|$ and is defined by $\|A\| = \max_i |\lambda_i|$.

Recall that an eigenvalue of $A$ is any real number $z$ such that there exists a non-zero vector $u$ for which $A u = z u$. Then we also have $(A + tI) u = (z + t) u$, which implies that $z + t$ is an eigenvalue of $A + tI$. In fact, the entire set of eigenvalues of $A + tI$ is $\{\lambda_1 + t, \dots, \lambda_d + t\}$.

Fact 8. Let $A$ be a symmetric matrix. Let $\lambda_{\min}$ and $\lambda_{\max}$ respectively denote the smallest and largest eigenvalues of $A$. Then $\lambda_{\min} I \preceq A \preceq \lambda_{\max} I$.

Proof: Let $\lambda_1, \dots, \lambda_d$ be all the eigenvalues of $A$. We will show only the second inequality, which is equivalent to $\lambda_{\max} I - A \succeq 0$. As observed above, the eigenvalues of $\lambda_{\max} I - A$ are $\{\lambda_{\max} - \lambda_1, \dots, \lambda_{\max} - \lambda_d\}$. The minimum of these eigenvalues is

$$ \min_j \big( \lambda_{\max} - \lambda_j \big) = \lambda_{\max} - \max_j \lambda_j = 0. $$

So $\lambda_{\max} I - A \succeq 0$.

In particular, for any symmetric matrix $A$ we have $-\|A\|\, I \preceq A \preceq \|A\|\, I$.
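The Caution and Fact 8 above are easy to probe numerically. The following sketch (assuming NumPy; the $2 \times 2$ matrices are a standard style of counterexample, not taken from the notes) exhibits $A \succeq B \succeq 0$ with $A^2 \not\succeq B^2$, and verifies $\lambda_{\min} I \preceq A \preceq \lambda_{\max} I$.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    # Same small helper as in the previous sketch.
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))

# Caution: A >= B >= 0 does not imply A^2 >= B^2.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.array([[1.0, 0.0],
              [0.0, 0.0]])
print(is_psd(A - B), is_psd(B))     # True True, so A >= B >= 0
print(is_psd(A @ A - B @ B))        # False: A^2 - B^2 has a negative eigenvalue

# Fact 8: lambda_min * I <= A <= lambda_max * I.
eigs = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
lam_min, lam_max = eigs[0], eigs[-1]
I = np.eye(2)
print(is_psd(A - lam_min * I), is_psd(lam_max * I - A))   # True True
```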

1.3 Trace

Definition 9. Let $A$ be an arbitrary $d \times d$ matrix (not necessarily symmetric). The trace of $A$, denoted $\mathrm{tr}(A)$, is the sum of the diagonal entries of $A$.

Fact 10 (Linearity of Trace). Let $A$ and $B$ be arbitrary $d \times d$ matrices and let $\alpha, \beta$ be scalars. Then $\mathrm{tr}(\alpha A + \beta B) = \alpha\, \mathrm{tr}(A) + \beta\, \mathrm{tr}(B)$.

Fact 11 (Cyclic Property of Trace). Let $A$ be an arbitrary $n \times m$ matrix and let $B$ be an arbitrary $m \times n$ matrix. Then $\mathrm{tr}(A B) = \mathrm{tr}(B A)$.

Proof: This easily follows from the definitions. The $i$th diagonal entry of $A B$ is $\sum_{j=1}^{m} A_{i,j} B_{j,i}$, so

$$ \mathrm{tr}(A B) = \sum_{i=1}^{n} \sum_{j=1}^{m} A_{i,j} B_{j,i} = \sum_{j=1}^{m} \sum_{i=1}^{n} B_{j,i} A_{i,j}, $$

by swapping the order of summation. This equals $\mathrm{tr}(B A)$ because the $j$th diagonal entry of $B A$ is $\sum_{i=1}^{n} B_{j,i} A_{i,j}$.

Fact 12. Let $A$ be a symmetric $d \times d$ matrix with eigenvalues $\{\lambda_1, \dots, \lambda_d\}$. Then $\mathrm{tr}(A) = \sum_{i=1}^{d} \lambda_i$.

Proof: Let $A = U D U^T$ as in Fact 2. Then

$$ \mathrm{tr}(A) = \mathrm{tr}(U D U^T) = \mathrm{tr}(U^T U D) = \mathrm{tr}(D) = \sum_{i=1}^{d} \lambda_i, $$

as required.

Claim 13. Let $A$, $B$ and $C$ be symmetric $d \times d$ matrices satisfying $A \succeq 0$ and $B \preceq C$. Then $\mathrm{tr}(A B) \le \mathrm{tr}(A C)$.

Proof: Suppose first that $A$ is rank-one, i.e., $A = v v^T$ for some vector $v$. Then

$$ \mathrm{tr}(A B) = \mathrm{tr}(v v^T B) = \mathrm{tr}(v^T B v) = v^T B v \le v^T C v = \mathrm{tr}(v^T C v) = \mathrm{tr}(v v^T C) = \mathrm{tr}(A C). $$

Now we consider the case where $A$ has arbitrary rank. Let $A = \sum_{i=1}^{d} \lambda_i u_i u_i^T$. Since we assume that $A \succeq 0$ we can set $v_i = \sqrt{\lambda_i}\, u_i$ and write $A = \sum_{i=1}^{d} v_i v_i^T$. Then

$$ \mathrm{tr}(A B) = \mathrm{tr}\Big( \sum_i v_i v_i^T B \Big) = \sum_i \mathrm{tr}(v_i v_i^T B) \le \sum_i \mathrm{tr}(v_i v_i^T C) = \mathrm{tr}\Big( \sum_i v_i v_i^T C \Big) = \mathrm{tr}(A C). $$

Here the second and third equalities are by linearity of trace, and the inequality follows from the rank-one case.

Corollary 14. If $A \succeq 0$ then $\mathrm{tr}(A B) \le \|B\|\, \mathrm{tr}(A)$.

Proof: Apply Claim 13 with $C = \|B\| I$ (note that $B \preceq \|B\| I$ by Fact 8). We get $\mathrm{tr}(A B) \le \mathrm{tr}(A \cdot \|B\| I) = \|B\|\, \mathrm{tr}(A)$, by linearity of trace, since $\|B\|$ is a scalar.
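To make the trace facts concrete, here is a small sketch (assuming NumPy; the random test matrices are purely illustrative) that checks the cyclic property, Fact 12, and the monotonicity of Claim 13.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fact 11: tr(AB) = tr(BA) even for non-square A, B of compatible shapes.
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))            # True

# Fact 12: for symmetric S, tr(S) equals the sum of its eigenvalues.
S = rng.standard_normal((4, 4))
S = S + S.T
print(np.isclose(np.trace(S), np.linalg.eigvalsh(S).sum()))    # True

# Claim 13: if P >= 0 and S <= C then tr(PS) <= tr(PC).
V = rng.standard_normal((4, 4))
P = V @ V.T                      # positive semi-definite
C = S + 2 * np.eye(4)            # C - S = 2I >= 0, so S <= C
print(np.trace(P @ S) <= np.trace(P @ C) + 1e-9)               # True
```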

1.4 Functions applied to matrices

For any function $f : \mathbb{R} \to \mathbb{R}$, we can extend $f$ to a function on symmetric matrices as follows. Let $A = U D U^T$ be a spectral decomposition of $A$ where $\lambda_i$ is the $i$th diagonal entry of $D$. Then we define

$$ f(A) = U \begin{pmatrix} f(\lambda_1) & & \\ & \ddots & \\ & & f(\lambda_d) \end{pmatrix} U^T. $$

In other words, $f(A)$ is defined simply by applying the function $f$ to the eigenvalues of $A$.

Caution. $f(A)$ is not obtained by applying the function $f$ to the entries of $A$!

One important example of applying a function to a matrix is powering. Let $A$ be any positive semi-definite matrix with spectral decomposition $U D U^T$. For any $c \in \mathbb{R}$ we can define $A^c$ as follows. Let $S$ be the diagonal matrix with $S_{i,i} = (D_{i,i})^c$. Then we define $A^c = U S U^T$. For example, consider the case $c = 1/2$. Note that

$$ A^{1/2} A^{1/2} = U S U^T U S U^T = U S S U^T = U D U^T = A. $$

So the square of the square root is the matrix itself, as one would expect.

If $A$ is non-singular, the matrix $A^{-1}$ obtained by taking $c = -1$ is the same as the usual matrix inverse (by uniqueness of inverses, since $A^{-1} A = I$). So we see that the inverse of a non-singular symmetric matrix is obtained by inverting its eigenvalues.

Inequalities on real-valued functions also give us inequalities on matrices.

Claim 15. Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ satisfy $f(x) \le g(x)$ for all $x \in [l, u] \subseteq \mathbb{R}$. Let $A$ be a symmetric matrix for which all eigenvalues lie in $[l, u]$ (i.e., $l I \preceq A \preceq u I$). Then $f(A) \preceq g(A)$.

Proof: Let $A = U D U^T$ be a spectral decomposition of $A$ where $\lambda_i$ is the $i$th diagonal entry of $D$. We wish to show $f(A) \preceq g(A)$, i.e., $g(A) - f(A) \succeq 0$. By (2), this is equivalent to

$$ v^T U \begin{pmatrix} g(\lambda_1) - f(\lambda_1) & & \\ & \ddots & \\ & & g(\lambda_d) - f(\lambda_d) \end{pmatrix} U^T v \;\ge\; 0 \qquad \forall v \in \mathbb{R}^d. $$

Since $U$ is non-singular this is equivalent to

$$ w^T \begin{pmatrix} g(\lambda_1) - f(\lambda_1) & & \\ & \ddots & \\ & & g(\lambda_d) - f(\lambda_d) \end{pmatrix} w \;\ge\; 0 \qquad \forall w \in \mathbb{R}^d, $$

where we have simply substituted $w = U^T v$. The left-hand side is $\sum_i w_i^2 \big( g(\lambda_i) - f(\lambda_i) \big)$, which is non-negative by our assumptions on $g$, $f$ and the $\lambda_i$'s.
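The eigenvalue definition of $f(A)$ translates directly into code. The sketch below assumes NumPy; `matrix_function` is a hypothetical helper written for this example, not a library routine. It forms a matrix square root, recovers the inverse with $c = -1$, and confirms the Caution that $f(A)$ is not the entrywise application of $f$.

```python
import numpy as np

def matrix_function(A, f):
    """f(A) for symmetric A: apply f to the eigenvalues in a spectral decomposition."""
    eigvals, U = np.linalg.eigh(A)
    return U @ np.diag(f(eigvals)) @ U.T

# A positive definite example matrix (eigenvalues 1 and 3).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# c = 1/2: the square of the square root recovers A.
A_half = matrix_function(A, np.sqrt)
print(np.allclose(A_half @ A_half, A))       # True

# c = -1 agrees with the usual inverse when A is non-singular.
A_inv = matrix_function(A, lambda x: 1.0 / x)
print(np.allclose(A_inv @ A, np.eye(2)))     # True

# Caution: f(A) is NOT the entrywise application of f to A.
print(np.allclose(A_half, np.sqrt(A)))       # False
```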

1.5 The pseudoinverse

Sometimes we would like to invert a matrix which is not quite invertible. A useful alternative is to invert the part of the matrix that is invertible (its image) and to leave alone the part of the matrix that is not invertible (its kernel). We can do this by applying the real-valued function

$$ f(x) = \begin{cases} 1/x & (x \neq 0) \\ 0 & (x = 0). \end{cases} $$

The function $f$ inverts all non-zero numbers and maps $0$ to $0$. So if we apply $f$ to a symmetric matrix, all non-zero eigenvalues will be inverted, and the zero eigenvalues will remain unchanged. The resulting matrix is called the pseudoinverse and is denoted $A^+$. If $A$ is indeed invertible then $A^+ = A^{-1}$.

Suppose $A$ is not invertible. We cannot have $A A^+ = I$ since $I$ has full rank and $A$ does not. However, we have the next best thing: $A A^+$ is the identity operator on the image of $A$. More precisely, $A A^+$ is the orthogonal projection onto the image of $A$, which we denote $I_{\mathrm{im}\,A}$. In other words, for every $v$ in the image of $A$ we have $A A^+ v = v$, and for every $v$ in the kernel of $A$ we have $A A^+ v = 0$.

Claim 16. Let $A$ and $B_1, \dots, B_k$ be symmetric, positive semi-definite matrices of the same size. Let $A^{+/2} = (A^+)^{1/2}$ denote the square root of the pseudoinverse of $A$. Let $l$ and $u$ be arbitrary scalars. Suppose that $\mathrm{im}(B_i)$ is a subspace of $\mathrm{im}(A)$ for all $i$. Then

$$ l A \;\preceq\; \sum_{i=1}^{k} B_i \;\preceq\; u A \qquad\Longleftrightarrow\qquad l\, I_{\mathrm{im}\,A} \;\preceq\; \sum_{i=1}^{k} A^{+/2} B_i A^{+/2} \;\preceq\; u\, I_{\mathrm{im}\,A}. $$

Proof: ($\Longrightarrow$): Since $A^{+/2} A A^{+/2} = I_{\mathrm{im}\,A}$, this follows from Fact 6.

($\Longleftarrow$): This also follows from Fact 6, because $A^{1/2} I_{\mathrm{im}\,A} A^{1/2} = A$ and

$$ A^{1/2} A^{+/2} B_i A^{+/2} A^{1/2} \;=\; I_{\mathrm{im}\,A} B_i I_{\mathrm{im}\,A} \;=\; I_{\mathrm{im}\,B_i} B_i I_{\mathrm{im}\,B_i} \;=\; B_i, $$

where the second equality follows from the assumption that $\mathrm{im}(B_i) \subseteq \mathrm{im}(A)$.

1.6 Matrix exponentials

We will mainly be interested in the case $f(x) = e^x$. For any symmetric matrix $A$, note that $e^A$ is positive semi-definite. Whereas in the scalar case $e^{a+b} = e^a e^b$ holds, it is not necessarily true in the matrix case that $e^{A+B} = e^A e^B$. It turns out that there is a useful inequality along these lines, called the Golden-Thompson inequality.

Theorem 17 (Golden-Thompson Inequality). Let $A$ and $B$ be symmetric $d \times d$ matrices. Then

$$ \mathrm{tr}(e^{A+B}) \;\le\; \mathrm{tr}(e^A e^B). $$
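Finally, a sketch (assuming NumPy; `pinv_via_eig` and `expm_sym` are hypothetical helpers for this example) of the pseudoinverse as inversion of the non-zero eigenvalues, the projection property of $A A^+$, and a numerical instance of the Golden-Thompson inequality.

```python
import numpy as np

def pinv_via_eig(A, tol=1e-12):
    """Pseudoinverse of a symmetric matrix: invert eigenvalues above tol, zero the rest."""
    eigvals, U = np.linalg.eigh(A)
    inv_vals = np.array([1.0 / lam if abs(lam) > tol else 0.0 for lam in eigvals])
    return U @ np.diag(inv_vals) @ U.T

def expm_sym(A):
    """Matrix exponential of a symmetric matrix via its spectral decomposition."""
    eigvals, U = np.linalg.eigh(A)
    return U @ np.diag(np.exp(eigvals)) @ U.T

# A rank-one (hence singular) positive semi-definite matrix, chosen for illustration.
v = np.array([[1.0], [2.0]])
A = v @ v.T

A_pinv = pinv_via_eig(A)
print(np.allclose(A_pinv, np.linalg.pinv(A)))        # agrees with NumPy's pseudoinverse

# A A^+ is the orthogonal projection onto im(A): it fixes v and kills vectors
# orthogonal to v (which span the kernel of A).
P = A @ A_pinv
w = np.array([[2.0], [-1.0]])                        # orthogonal to v
print(np.allclose(P @ v, v), np.allclose(P @ w, 0))  # True True

# Golden-Thompson: tr(e^{X+Y}) <= tr(e^X e^Y) for symmetric X, Y.
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3))
X = X + X.T
Y = rng.standard_normal((3, 3))
Y = Y + Y.T
lhs = np.trace(expm_sym(X + Y))
rhs = np.trace(expm_sym(X) @ expm_sym(Y))
print(lhs <= rhs + 1e-9)                             # True
```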