3 Orthogonal Vectors and Matrices

The linear algebra portion of this course focuses on three matrix factorizations: the QR factorization, the singular value decomposition (SVD), and the LU factorization. The first two of these factorizations involve orthogonal matrices, which play a fundamental role in many numerical methods. This lecture first considers orthogonal vectors and then defines orthogonal matrices.

3.1 Orthogonal Vectors

A pair of vectors u, v ∈ R^m is said to be orthogonal if (u, v) = 0. In view of the angle formula of a previous lecture, orthogonal vectors meet at a right angle. The zero vector 0 is orthogonal to all vectors, but we are more interested in nonvanishing orthogonal vectors.

A set of vectors S_n = {v_j}_{j=1}^n in R^m is said to be orthonormal if each pair of distinct vectors in S_n is orthogonal and all vectors in S_n are of unit length, i.e., if

    (v_j, v_k) = { 0,  j ≠ k,
                   1,  j = k.                                   (1)

Here we have used that (v_k, v_k) = ||v_k||^2. It is not difficult to show that orthonormal vectors are linearly independent; see Exercise 3.1 below. It follows that the m vectors of an orthonormal set S_m in R^m form a basis for R^m.

Example 3.1. The set S_3 = {e_j}_{j=1}^3 in R^5 is orthonormal, where the e_j are axis vectors; cf. formula (5) of a previous lecture.

Example 3.2. The set S_2 = {v_1, v_2} in R^2, with v_1 = (1/√2)[1, 1]^T and v_2 = (1/√2)[1, −1]^T, is orthonormal. Moreover, the set S_2 forms a basis for R^2.

An arbitrary vector v ∈ R^m can be decomposed into orthogonal components. Consider the set S_n = {v_j}_{j=1}^n of orthonormal vectors in R^m, and regard the expression

    r = v − Σ_{j=1}^n (v_j, v) v_j.                             (2)

The vector (v_j, v) v_j is referred to as the orthogonal component of v in the direction v_j. Moreover, the vector r is orthogonal to the vectors v_j. This can be seen by computing the inner products (v_k, r) for all k. We obtain

    (v_k, r) = (v_k, v − Σ_{j=1}^n (v_j, v) v_j)
             = (v_k, v) − Σ_{j=1}^n (v_j, v)(v_k, v_j).
Using (1), the sum in the right-hand side simplifies to

    Σ_{j=1}^n (v_j, v)(v_k, v_j) = (v_k, v),

which shows that

    (v_k, r) = 0.

Thus, v can be expressed as a sum of the orthogonal vectors r, v_1, ..., v_n,

    v = r + Σ_{j=1}^n (v_j, v) v_j.

Example 3.3. Let v = [1, 2, 3, 4, 5]^T and let the set S_3 be the same as in Example 3.1. Then r = [0, 0, 0, 4, 5]^T. Clearly, r is orthogonal to the axis vectors e_1, e_2, e_3.

Exercise 3.1. Let S_n = {v_j}_{j=1}^n be a set of orthonormal vectors in R^m. Show that the vectors v_1, v_2, ..., v_n are linearly independent. Hint: Assume this is not the case. For instance, assume that v_1 is a linear combination of the vectors v_2, v_3, ..., v_n, and apply (1).

3.2 Orthogonal Matrices

A square matrix Q = [q_1, q_2, ..., q_m] ∈ R^{m×m} is said to be orthogonal if its columns {q_j}_{j=1}^m form an orthonormal basis of R^m. Since the columns q_1, q_2, ..., q_m are linearly independent, cf. Exercise 3.1, the matrix Q is nonsingular. Thus, Q has an inverse, which we denote by Q^{-1}. It follows from the orthonormality of the columns of Q that

    Q^T Q = [ (q_1, q_1)  (q_1, q_2)  ...  (q_1, q_m) ]
            [ (q_2, q_1)  (q_2, q_2)  ...  (q_2, q_m) ]
            [     ...         ...     ...      ...    ]  =  I,
            [ (q_m, q_1)  (q_m, q_2)  ...  (q_m, q_m) ]

where I denotes the identity matrix. Multiplying the above expression by the inverse Q^{-1} from the right-hand side shows that

    Q^T = Q^{-1}.

Thus, the transpose of an orthogonal matrix is its inverse.

Example 3.4. The identity matrix I is orthogonal.
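The decomposition (2) and Example 3.3 are easy to check numerically. The following sketch is illustrative only (the helper names `dot` and `decompose` are ours, not part of the notes); it computes r for v = [1, 2, 3, 4, 5]^T with the orthonormal set {e_1, e_2, e_3} and verifies that r is orthogonal to every vector of the set.

```python
# Numerical check of the orthogonal decomposition (2), cf. Example 3.3.
# The function names below are our own, chosen for this sketch.

def dot(x, y):
    """Inner product (x, y) on R^m."""
    return sum(a * b for a, b in zip(x, y))

def decompose(v, basis):
    """Return r = v - sum_j (v_j, v) v_j for an orthonormal set `basis`."""
    r = list(v)
    for vj in basis:
        c = dot(vj, v)                       # coefficient of the component along v_j
        r = [ri - c * bj for ri, bj in zip(r, vj)]
    return r

# Example 3.3: v in R^5, orthonormal set {e_1, e_2, e_3}.
v = [1.0, 2.0, 3.0, 4.0, 5.0]
e = [[1.0 if i == j else 0.0 for i in range(5)] for j in range(3)]

r = decompose(v, e)
print(r)                                     # the residual vector r
print([dot(ej, r) for ej in e])              # each (v_k, r) vanishes
```

Since the e_j and v have exactly representable entries, the computed r here is exact: r = [0, 0, 0, 4, 5]^T, in agreement with Example 3.3.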
Example 3.5. The matrix

    Q = [ 1/√2    1/√2 ]
        [ 1/√2   −1/√2 ]

is orthogonal. Its inverse is its transpose,

    Q^{-1} = Q^T = [ 1/√2    1/√2 ]
                   [ 1/√2   −1/√2 ].

Geometrically, multiplying a vector by an orthogonal matrix reflects the vector in some plane and/or rotates it. Such a transformation does not change the length of the vector. Therefore, the norm of a vector u is invariant under multiplication by an orthogonal matrix Q, i.e.,

    ||Qu|| = ||u||.                                             (3)

This can be seen by using properties (8) and (6) of a previous lecture. We have

    ||Qu||^2 = (Qu)^T (Qu) = u^T Q^T (Qu) = u^T (Q^T Q) u = u^T u = ||u||^2.

Taking square roots of the right-hand side and left-hand side, and using that the norm is nonnegative, equation (3) follows.

3.3 Householder Matrices

Matrices of the form

    H = I − ρ u u^T ∈ R^{m×m},   u ≠ 0,   ρ = 2 / (u^T u),      (4)

are known as Householder matrices. They are used in numerical methods for least-squares approximation and eigenvalue computations. We will discuss the former application in the next lecture. Householder matrices are symmetric, i.e., H = H^T, and orthogonal. The latter property follows from

    H^T H = H^2 = (I − ρ u u^T)(I − ρ u u^T)
          = I − 2 ρ u u^T + (ρ u u^T)(ρ u u^T)
          = I − 2 ρ u u^T + ρ u (ρ u^T u) u^T
          = I − 2 ρ u u^T + 2 ρ u u^T = I,

where we have used that ρ u^T u = 2; cf. (4).

Our interest in Householder matrices stems from the facts that they are orthogonal and that the vector u in their definition can be chosen so that an arbitrary (but fixed) vector w ∈ R^m \ {0} is mapped by H onto a multiple of the axis vector e_1. We will now show how this can be done. Let w ≠ 0 be given. We would like

    Hw = σ e_1                                                  (5)

for some scalar σ. It follows from (3) that

    ||w|| = ||Hw|| = ||σ e_1|| = |σ| ||e_1|| = |σ|.

Therefore,

    σ = ± ||w||.                                                (6)
Moreover, using the definition (4) of H, we obtain

    σ e_1 = Hw = (I − ρ u u^T) w = w − τ u,   τ = ρ u^T w,

from which it follows that

    τ u = w − σ e_1.

The matrix H is independent of the scaling factor τ in the sense that the entries of H do not change if we replace τu by u. We may therefore choose

    u = w − σ e_1.                                              (7)

This choice of u and either one of the choices (6) of σ gives a Householder matrix that satisfies (5). Nevertheless, in finite precision arithmetic, the choice of sign in (6) may be important. To see this, let w = [w_1, w_2, ..., w_m]^T and write the vector (7) in the form

    u = [w_1 − σ, w_2, w_3, ..., w_m]^T.

If the components w_j for j ≥ 2 are of small magnitude compared to |w_1|, then ||w|| ≈ |w_1| and, therefore, the first component of u satisfies

    u_1 = w_1 − σ = w_1 ± ||w|| ≈ w_1 ± |w_1|.                  (8)

We would like to avoid the situation that |w_1| is large and |u_1| is small, because then u_1 is determined with low relative accuracy; see Exercise 3.3 below. We therefore let

    σ = −sign(w_1) ||w||,                                       (9)

which yields

    u_1 = w_1 + sign(w_1) ||w||.                                (10)

Thus, u_1 is computed by adding numbers of the same sign.

Example 3.6. Let w = [1, 1, 1, 1]^T. We are interested in determining the Householder matrix H that maps w onto a multiple of e_1. The parameter σ in (5) is chosen such that |σ| = ||w|| = 2, i.e., σ is 2 or −2. The vector u in (7) is given by

    u = w − σ e_1 = [1 − σ, 1, 1, 1]^T.

The choice σ = −2 yields u = [3, 1, 1, 1]^T. We choose σ negative, because the first entry of w is positive; cf. (9). Finally, we obtain ρ = 2/||u||^2 = 2/12 = 1/6 and

    H = I − ρ u u^T = [ −1/2  −1/2  −1/2  −1/2 ]
                      [ −1/2   5/6  −1/6  −1/6 ]
                      [ −1/2  −1/6   5/6  −1/6 ]
                      [ −1/2  −1/6  −1/6   5/6 ].
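As a sanity check on Example 3.6, one can form H explicitly and apply it to w. The short sketch below is illustrative code of our own (not part of the notes); it builds H = I − ρuu^T entrywise in plain Python and applies it to w.

```python
# Check of Example 3.6: H = I - rho*u*u^T with u = [3,1,1,1]^T and rho = 1/6
# should map w = [1,1,1,1]^T onto sigma*e_1 with sigma = -2.

u = [3.0, 1.0, 1.0, 1.0]
rho = 2.0 / sum(x * x for x in u)            # rho = 2/(u^T u) = 2/12 = 1/6
n = len(u)

# Build H = I - rho * u u^T entrywise.
H = [[(1.0 if i == j else 0.0) - rho * u[i] * u[j] for j in range(n)]
     for i in range(n)]

w = [1.0, 1.0, 1.0, 1.0]
Hw = [sum(H[i][j] * w[j] for j in range(n)) for i in range(n)]
print(Hw)                                    # close to [-2, 0, 0, 0] up to rounding

# H is symmetric and orthogonal, so H^T H = H^2 should reproduce I.
HH = [[sum(H[i][k] * H[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
print(HH)                                    # close to the identity matrix
```

Up to rounding on the order of machine precision, Hw equals [−2, 0, 0, 0]^T and H^2 equals I, confirming both (5) and the orthogonality computation above.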
It is easy to verify that H is orthogonal and maps w to −2e_1.

In most applications it is not necessary to explicitly form the Householder matrix H. Matrix-vector products with a vector v are computed by using the definition (4) of H, i.e.,

    Hv = (I − ρ u u^T) v = v − (ρ u^T v) u.

The right-hand side is computed by first evaluating the scalar τ = ρ u^T v and then computing the vector scaling and addition v − τu. This way of evaluating Hv requires fewer arithmetic floating-point operations than straightforward computation of the matrix-vector product; see Exercise 3.6. Moreover, the entries of H do not have to be stored, only the vector u and the scalar ρ. The savings in arithmetic operations and storage are important for large problems.

Exercise 3.2. Let w = [1, 2, 3]^T. Determine the Householder matrix that maps w to a multiple of e_1. Only the vector u in (4) has to be computed.

Exercise 3.3. This exercise illustrates the importance of the choice of the sign of σ in (6). Let w = [1, 0.5·10^{-8}]^T and let u be the vector in the definition (4) of Householder matrices, chosen such that Hw = σe_1. MATLAB yields

    >> w = [1; 5e-9]
    w =
        1.0000
        0.0000
    >> sigma = norm(w)
    sigma =
        1
    >> u1 = w(1) + sigma
    u1 =
        2
    >> u1 = w(1) - sigma
    u1 =
        0
where the u1 denote the computed approximations of the first component, u_1, of the vector u. How large are the absolute and relative errors in the computed approximations u1 of the component u_1 of the vector u?

Exercise 3.4. Show that the product U_1 U_2 of two orthogonal matrices is an orthogonal matrix. Is the product of k > 2 orthogonal matrices an orthogonal matrix?

Exercise 3.5. Let Q be an orthogonal matrix, i.e., Q^T Q = I. Show that QQ^T = I.

Exercise 3.6. What is the count of arithmetic floating-point operations for evaluating a matrix-vector product with an n×n Householder matrix H when the representation (4) of H is used? Only u and ρ are stored, not the entries of H. What is the count of arithmetic floating-point operations for evaluating a matrix-vector product with H when the entries of H (but not u and ρ) are available? The correct order of magnitude of the arithmetic work when n is large suffices.

Exercise 3.7. Let the nonvanishing m-vectors w_1 and w_2 be given. Determine an orthogonal matrix H such that Hw_1 is a multiple of w_2. Hint: Use Householder matrices.
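The cancellation that motivates the sign choice (9) can also be reproduced outside MATLAB. The sketch below is our own Python rendition of the computation in Exercise 3.3; variable names are ours, and the error analysis asked for in the exercise is left to the reader.

```python
import math

# Sign choice (9)-(10) for w = [1, 5e-9]: in IEEE double precision,
# ||w||^2 = 1 + 2.5e-17 rounds to 1, so norm(w) evaluates to exactly 1.0.
# Then w_1 - sigma with sigma = +||w|| cancels to 0 (the bad choice),
# while w_1 + sign(w_1)*||w||, cf. (10), is computed accurately.

w = [1.0, 5e-9]
norm_w = math.sqrt(sum(x * x for x in w))       # rounds to exactly 1.0

u1_bad = w[0] - norm_w                          # catastrophic cancellation
u1_good = w[0] + math.copysign(norm_w, w[0])    # adds numbers of the same sign

print(norm_w, u1_bad, u1_good)
```

The exact value of u_1 for the bad sign is about −1.25·10^{-17}, yet the computed result is 0, so all relative accuracy is lost; the good sign returns 2 with an error on the order of machine precision.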