EE263 Autumn 2007-08  Stephen Boyd

Lecture 4
Orthonormal sets of vectors and QR factorization

- orthonormal sets of vectors
- Gram-Schmidt procedure, QR factorization
- orthogonal decomposition induced by a matrix
Orthonormal set of vectors

a set of vectors u_1, ..., u_k ∈ R^n is
- normalized if ||u_i|| = 1, i = 1, ..., k (the u_i are called unit vectors or direction vectors)
- orthogonal if u_i ⊥ u_j for i ≠ j
- orthonormal if both

slang: we say "u_1, ..., u_k are orthonormal vectors," but orthonormality (like independence) is a property of a set of vectors, not of vectors individually

in terms of the matrix U = [u_1 ... u_k], orthonormal means U^T U = I_k
- orthonormal vectors are independent (multiply α_1 u_1 + α_2 u_2 + ... + α_k u_k = 0 on the left by u_i^T to get α_i = 0)
- hence u_1, ..., u_k is an orthonormal basis for span(u_1, ..., u_k) = R(U)
- warning: if k < n then U U^T ≠ I (since its rank is at most k) (more on this matrix later ...)
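A quick numerical check of these facts (a minimal NumPy sketch; the two vectors are made-up example data, not from the lecture):

```python
import numpy as np

# k = 2 orthonormal vectors in R^n with n = 3
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
u2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
U = np.column_stack([u1, u2])

gram = U.T @ U    # equals I_2: the columns are orthonormal
outer = U @ U.T   # is NOT I_3: since k < n, rank(U U^T) is at most k = 2
```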
Geometric properties

suppose the columns of U = [u_1 ... u_k] are orthonormal

if w = Uz, then ||w|| = ||z||
- multiplication by U does not change the norm
- the mapping w = Uz is isometric: it preserves distances

simple derivation using matrices:

||w||^2 = ||Uz||^2 = (Uz)^T (Uz) = z^T U^T U z = z^T z = ||z||^2
inner products are also preserved: ⟨Uz, Uz̃⟩ = ⟨z, z̃⟩

if w = Uz and w̃ = Uz̃, then

⟨w, w̃⟩ = ⟨Uz, Uz̃⟩ = (Uz)^T (Uz̃) = z^T U^T U z̃ = ⟨z, z̃⟩

norms and inner products are preserved, so angles are preserved: ∠(Uz, Uz̃) = ∠(z, z̃)

thus, multiplication by U preserves inner products, angles, and distances
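These preservation properties can be verified numerically (a sketch; the orthonormal-column matrix here is built from a random matrix via QR, an assumption not tied to the lecture's examples):

```python
import numpy as np

rng = np.random.default_rng(0)

# U in R^{5x3} with orthonormal columns, obtained via QR of a random matrix
U, _ = np.linalg.qr(rng.standard_normal((5, 3)))

z = rng.standard_normal(3)
zt = rng.standard_normal(3)

# ||Uz|| = ||z|| and <Uz, U z~> = <z, z~>
norm_preserved = np.isclose(np.linalg.norm(U @ z), np.linalg.norm(z))
inner_preserved = np.isclose((U @ z) @ (U @ zt), z @ zt)
```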
Orthonormal basis for R^n

- suppose u_1, ..., u_n is an orthonormal basis for R^n
- then U = [u_1 ... u_n] is called orthogonal: it is square and satisfies U^T U = I
  (you'd think such matrices would be called orthonormal, not orthogonal)
- it follows that U^{-1} = U^T, and hence also U U^T = I, i.e.,

  sum_{i=1}^n u_i u_i^T = I
Expansion in orthonormal basis

suppose U is orthogonal, so x = U U^T x, i.e.,

x = sum_{i=1}^n (u_i^T x) u_i

- u_i^T x is called the component of x in the direction u_i
- a = U^T x resolves x into the vector of its u_i components
- x = Ua reconstitutes x from its u_i components
- x = Ua = sum_{i=1}^n a_i u_i is called the (u_i-) expansion of x
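The resolve/reconstitute steps can be sketched as follows (the specific basis, the standard basis rotated by 45 degrees, is an illustrative assumption):

```python
import numpy as np

# an orthonormal basis for R^2: the standard basis rotated by 45 degrees
c = 1 / np.sqrt(2)
U = np.array([[c, -c],
              [c,  c]])

x = np.array([3.0, 1.0])

a = U.T @ x     # resolve: a_i = u_i^T x, component of x in direction u_i
x_rec = U @ a   # reconstitute: x = sum_i a_i u_i recovers x exactly
```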
the identity I = U U^T = sum_{i=1}^n u_i u_i^T is sometimes written (in physics) as

I = sum_{i=1}^n |u_i⟩⟨u_i|

since x = sum_{i=1}^n |u_i⟩⟨u_i| x

(but we won't use this notation)
Geometric interpretation

if U is orthogonal, then the transformation w = Uz
- preserves norms of vectors, i.e., ||Uz|| = ||z||
- preserves angles between vectors, i.e., ∠(Uz, Uz̃) = ∠(z, z̃)

examples:
- rotations (about some axis)
- reflections (through some plane)
Example: rotation by θ in R^2 is given by

y = U_θ x,   U_θ = [ cos θ  -sin θ ]
                   [ sin θ   cos θ ]

since e_1 ↦ (cos θ, sin θ), e_2 ↦ (-sin θ, cos θ)

reflection across the line x_2 = x_1 tan(θ/2) is given by

y = R_θ x,   R_θ = [ cos θ   sin θ ]
                   [ sin θ  -cos θ ]

since e_1 ↦ (cos θ, sin θ), e_2 ↦ (sin θ, -cos θ)
(figure: the images of the unit vectors e_1, e_2 under rotation by θ and under reflection, drawn in the (x_1, x_2) plane)

one can check that U_θ and R_θ are orthogonal
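The orthogonality check is a one-liner in NumPy (the angle 0.7 is an arbitrary test value; any θ works):

```python
import numpy as np

theta = 0.7  # arbitrary test angle

U_theta = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])   # rotation by theta
R_theta = np.array([[np.cos(theta),  np.sin(theta)],
                    [np.sin(theta), -np.cos(theta)]])   # reflection

# a matrix M is orthogonal iff M^T M = I
rot_orthogonal = np.allclose(U_theta.T @ U_theta, np.eye(2))
ref_orthogonal = np.allclose(R_theta.T @ R_theta, np.eye(2))
```

A rotation has determinant +1 and a reflection has determinant -1, which is one way to tell them apart numerically.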
Gram-Schmidt procedure

- given independent vectors a_1, ..., a_k ∈ R^n, the G-S procedure finds orthonormal vectors q_1, ..., q_k such that

  span(a_1, ..., a_r) = span(q_1, ..., q_r) for r ≤ k

- thus, q_1, ..., q_r is an orthonormal basis for span(a_1, ..., a_r)
- rough idea of method: first orthogonalize each vector with respect to the previous ones, then normalize the result to have norm one
Gram-Schmidt procedure

step 1a. q̃_1 := a_1
step 1b. q_1 := q̃_1 / ||q̃_1|| (normalize)
step 2a. q̃_2 := a_2 - (q_1^T a_2) q_1 (remove q_1 component from a_2)
step 2b. q_2 := q̃_2 / ||q̃_2|| (normalize)
step 3a. q̃_3 := a_3 - (q_1^T a_3) q_1 - (q_2^T a_3) q_2 (remove q_1, q_2 components)
step 3b. q_3 := q̃_3 / ||q̃_3|| (normalize)
etc.
(figure: a_2 is orthogonalized against q_1, giving q̃_2 = a_2 - (q_1^T a_2) q_1, then normalized to q_2; similarly q̃_1 = a_1 is normalized to q_1)

for i = 1, 2, ..., k we have

a_i = (q_1^T a_i) q_1 + (q_2^T a_i) q_2 + ... + (q_{i-1}^T a_i) q_{i-1} + ||q̃_i|| q_i
    = r_{1i} q_1 + r_{2i} q_2 + ... + r_{ii} q_i

(note that the r_{ij} come right out of the G-S procedure, and r_{ii} ≠ 0)
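The steps above, with the coefficients r_ij collected as they are produced, can be sketched as a short function (the example matrix is made-up data; this is the classical, numerically naive variant of G-S):

```python
import numpy as np

def gram_schmidt(A):
    """Basic Gram-Schmidt on the columns of A (n x k, assumed independent).
    Returns Q (n x k) with Q^T Q = I and R (k x k) upper triangular,
    r_ii > 0, such that A = Q R."""
    n, k = A.shape
    Q = np.zeros((n, k))
    R = np.zeros((k, k))
    for i in range(k):
        qt = A[:, i].copy()                # q~_i := a_i
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]    # r_ji = q_j^T a_i
            qt -= R[j, i] * Q[:, j]        # remove q_j component
        R[i, i] = np.linalg.norm(qt)       # r_ii = ||q~_i||
        Q[:, i] = qt / R[i, i]             # normalize
    return Q, R

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q, R = gram_schmidt(A)
```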
QR decomposition

written in matrix form: A = QR, where A ∈ R^{n×k}, Q ∈ R^{n×k}, R ∈ R^{k×k}:

  [a_1 a_2 ... a_k]  =  [q_1 q_2 ... q_k]  [ r_11 r_12 ... r_1k ]
                                           [  0   r_22 ... r_2k ]
                                           [           ...      ]
                                           [  0    0   ... r_kk ]
         A                     Q                      R

- Q^T Q = I_k, and R is upper triangular & invertible
- called QR decomposition (or factorization) of A
- usually computed using a variation on the Gram-Schmidt procedure which is less sensitive to numerical (rounding) errors
- columns of Q are an orthonormal basis for R(A)
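In practice one calls a library routine rather than hand-rolling G-S; NumPy's `np.linalg.qr` (which uses Householder reflections via LAPACK, one of the numerically robust alternatives alluded to above) returns exactly this reduced factorization:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

Q, R = np.linalg.qr(A)   # "reduced" QR: Q is 3x2, R is 2x2 upper triangular

reconstructs = np.allclose(Q @ R, A)         # A = Q R
orthonormal_cols = np.allclose(Q.T @ Q, np.eye(2))
upper_triangular = np.allclose(R, np.triu(R))
```

Note that library routines may flip signs (some r_ii can come out negative), which still gives a valid QR factorization.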
General Gram-Schmidt procedure

- in basic G-S we assume a_1, ..., a_k ∈ R^n are independent
- if a_1, ..., a_k are dependent, we find q̃_j = 0 for some j, which means a_j is linearly dependent on a_1, ..., a_{j-1}
- modified algorithm: when we encounter q̃_j = 0, skip to the next vector a_{j+1} and continue:

  r = 0;
  for i = 1, ..., k {
      ã = a_i - sum_{j=1}^r q_j q_j^T a_i;
      if ã ≠ 0 { r = r + 1; q_r = ã / ||ã||; }
  }
on exit,
- q_1, ..., q_r is an orthonormal basis for R(A) (hence r = Rank(A))
- each a_i is a linear combination of previously generated q_j's

in matrix notation we have A = QR with Q^T Q = I_r and R ∈ R^{r×k} in upper staircase form:

(figure: upper staircase sparsity pattern of R — possibly nonzero entries on or above the staircase, zero entries below it; the corner entries of the staircase are nonzero)
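The modified algorithm translates directly into code (a sketch; the tolerance for "ã = 0" and the example matrix with a_3 = a_1 + a_2 are illustrative assumptions):

```python
import numpy as np

def general_gram_schmidt(A, tol=1e-10):
    """General G-S: skip columns whose orthogonalized residual is
    (numerically) zero. Returns Q (n x r) with orthonormal columns
    spanning R(A), so r = rank(A)."""
    n, k = A.shape
    qs = []
    for i in range(k):
        at = A[:, i].copy()
        for q in qs:
            at -= (q @ A[:, i]) * q      # remove components along existing q's
        if np.linalg.norm(at) > tol:     # a_i contributes a new direction
            qs.append(at / np.linalg.norm(at))
    return np.column_stack(qs) if qs else np.zeros((n, 0))

# dependent columns: a_3 = a_1 + a_2
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])
Q = general_gram_schmidt(A)
r = Q.shape[1]   # number of q's generated = rank(A)
```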
can permute the columns with corner entries to the front of the matrix:

A = Q [R̃ S] P

where:
- Q^T Q = I_r
- R̃ ∈ R^{r×r} is upper triangular and invertible
- P ∈ R^{k×k} is a permutation matrix (which moves forward the columns of A which generated a new q)
Applications

- directly yields an orthonormal basis for R(A)
- yields the factorization A = BC with B ∈ R^{n×r}, C ∈ R^{r×k}, r = Rank(A)
- to check if b ∈ span(a_1, ..., a_k): apply Gram-Schmidt to [a_1 ... a_k b]
- staircase pattern in R shows which columns of A are dependent on previous ones
- works incrementally: one G-S procedure yields QR factorizations of [a_1 ... a_p] for p = 1, ..., k:

  [a_1 ... a_p] = [q_1 ... q_s] R_p

  where s = Rank([a_1 ... a_p]) and R_p is the leading s × p submatrix of R
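The span-membership test amounts to checking whether appending b increases the rank — i.e., whether general G-S on [a_1 ... a_k b] generates a new q for b. A minimal sketch using NumPy's rank routine in place of G-S (the matrices and vectors are made-up data):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b_in = np.array([2.0, 3.0, 5.0])    # = 2 a_1 + 3 a_2, so b_in is in R(A)
b_out = np.array([1.0, 0.0, 0.0])   # not a combination of a_1, a_2

def in_span(A, b):
    # b is in R(A) iff appending b does not increase the rank
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)
```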
Full QR factorization

with A = Q_1 R_1 the QR factorization as above, write

A = [Q_1 Q_2] [ R_1 ]
              [  0  ]

where [Q_1 Q_2] is orthogonal, i.e., the columns of Q_2 ∈ R^{n×(n-r)} are orthonormal and orthogonal to Q_1

to find Q_2:
- find any matrix Ã such that [A Ã] is full rank (e.g., Ã = I)
- apply general Gram-Schmidt to [A Ã]
- Q_1 are the orthonormal vectors obtained from the columns of A
- Q_2 are the orthonormal vectors obtained from the extra columns (Ã)
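NumPy exposes the full factorization directly via `mode='complete'` (it computes it with Householder reflections rather than the G-S-based recipe above, but the resulting [Q_1 Q_2] has the same properties):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # rank 2, so Q_2 has n - r = 3 - 2 = 1 column

Qfull, Rfull = np.linalg.qr(A, mode='complete')  # Qfull is 3x3 orthogonal
Q1, Q2 = Qfull[:, :2], Qfull[:, 2:]

q_orthogonal = np.allclose(Qfull.T @ Qfull, np.eye(3))  # [Q1 Q2] orthogonal
q2_perp_q1 = np.allclose(Q1.T @ Q2, np.zeros((2, 1)))   # Q2 orthogonal to Q1
```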
- i.e., any set of orthonormal vectors can be extended to an orthonormal basis for R^n
- R(Q_1) and R(Q_2) are called complementary subspaces since
  - they are orthogonal (i.e., every vector in the first subspace is orthogonal to every vector in the second subspace)
  - their sum is R^n (i.e., every vector in R^n can be expressed as a sum of two vectors, one from each subspace)
- this is written R(Q_1) ⊕ R(Q_2) = R^n
- R(Q_2) = R(Q_1)^⊥ (and R(Q_1) = R(Q_2)^⊥) (each subspace is the orthogonal complement of the other)

we know R(Q_1) = R(A); but what is its orthogonal complement R(Q_2)?
Orthogonal decomposition induced by A

from

A^T = [R_1^T 0] [ Q_1^T ]
                [ Q_2^T ]

we see that

A^T z = 0  ⟺  Q_1^T z = 0  ⟺  z ∈ R(Q_2)

so R(Q_2) = N(A^T) (in fact the columns of Q_2 are an orthonormal basis for N(A^T))

we conclude: R(A) and N(A^T) are complementary subspaces:
- R(A) ⊕ N(A^T) = R^n (recall A ∈ R^{n×k})
- R(A)^⊥ = N(A^T) (and N(A^T)^⊥ = R(A))

called the orthogonal decomposition (of R^n) induced by A ∈ R^{n×k}
- every y ∈ R^n can be written uniquely as y = z + w, with z ∈ R(A), w ∈ N(A^T) (we'll soon see what the vector z is ...)
- can now prove most of the assertions from the linear algebra review lecture
- switching A ∈ R^{n×k} to A^T ∈ R^{k×n} gives a decomposition of R^k:

  N(A) ⊕ R(A^T) = R^k
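The unique decomposition y = z + w can be computed from the full QR factorization: z = Q_1 Q_1^T y and w = Q_2 Q_2^T y. A sketch with made-up data (using NumPy's complete-mode QR to obtain Q_1 and Q_2):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # full column rank, r = 2

Qfull, _ = np.linalg.qr(A, mode='complete')
Q1, Q2 = Qfull[:, :2], Qfull[:, 2:]   # R(Q1) = R(A), R(Q2) = N(A^T)

y = np.array([1.0, 2.0, 3.0])
z = Q1 @ (Q1.T @ y)   # component of y in R(A)
w = Q2 @ (Q2.T @ y)   # component of y in N(A^T)
```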