Orthogonal Projection

Given any nonzero vector $v$, it is possible to decompose an arbitrary vector $u$ into a component that points in the direction of $v$ and one that points in a direction orthogonal to $v$ (see Fig. 2, p. 386; the plane of this diagram is the plane determined by the two vectors $u$ and $v$). The component of $u$ in the direction of $v$, also called the projection of $u$ onto $v$, is denoted $\hat{u}$; it equals $\alpha v$ for an appropriate choice of scalar $\alpha$. The component of $u$ orthogonal to $v$, a vector we label $w$, must therefore satisfy $u = \hat{u} + w$. Thus, since $w$ is orthogonal to $v$, we have
$$0 = w \cdot v = (u - \hat{u}) \cdot v = (u - \alpha v) \cdot v = u \cdot v - \alpha (v \cdot v),$$
that is, $\alpha = \dfrac{u \cdot v}{v \cdot v}$. In other words,
$$\hat{u} = \alpha v = \frac{u \cdot v}{v \cdot v}\, v.$$
Notice that the projection of $u$ onto any nonzero multiple $cv$ of $v$ is the same vector:
$$\hat{u} = \frac{u \cdot (cv)}{(cv) \cdot (cv)}\, (cv) = \frac{u \cdot v}{v \cdot v}\, v.$$
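For concreteness, the computation of $\hat{u}$ can be checked numerically. The following is a small sketch in Python with NumPy (the particular vectors $u$ and $v$ are chosen only for illustration); it also confirms that scaling $v$ leaves the projection unchanged.

```python
import numpy as np

def proj(u, v):
    """Projection of u onto the line spanned by a nonzero vector v:
    proj_v(u) = (u.v)/(v.v) v."""
    return (u @ v) / (v @ v) * v

u = np.array([3.0, 1.0])
v = np.array([2.0, 2.0])

u_hat = proj(u, v)      # component of u along v
w = u - u_hat           # component of u orthogonal to v

print(u_hat)            # [2. 2.]
print(w @ v)            # 0.0 -- w is orthogonal to v
print(proj(u, 5 * v))   # [2. 2.] -- same projection onto any multiple of v
```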
That is, $u$ has the same projection onto any nonzero vector in the linear subspace $L$ spanned by $v$. For this reason, we often denote $\hat{u}$ by $\mathrm{proj}_L u$, recognizing that it has the same value for any vector $v$ chosen from $L$:
$$\hat{u} = \mathrm{proj}_L u = \frac{u \cdot v}{v \cdot v}\, v.$$
If the vectors $u$ and $v$ lie in $\mathbf{R}^2$ (see Fig. 3, p. 387), then the point determined by the vector $\hat{u}$ is the point on the line $L$ through $v$ that lies closest to the point determined by $u$. It follows that the distance between the point determined by $u$ and the line $L$ is the distance between $u$ and $\hat{u}$: $\|u - \hat{u}\| = \|w\|$.

We can apply the formula for $\mathrm{proj}_L u$ in a much more general setting. Suppose that $S = \{v_1, v_2, \ldots, v_k\}$ is an orthogonal set of vectors in $\mathbf{R}^n$. Then the following theorem comes into play:

Theorem If $S = \{v_1, v_2, \ldots, v_k\}$ is an orthogonal set of nonzero vectors in $\mathbf{R}^n$, then it is a basis for the subspace it spans.

Proof Suppose that $0 = c_1 v_1 + \cdots + c_k v_k$ for suitable scalars $c_1, \ldots, c_k$. Then, because the $v$'s are
orthogonal to each other, taking the dot product with any $v_i$ gives
$$0 = 0 \cdot v_i = (c_1 v_1 + \cdots + c_k v_k) \cdot v_i = c_1 (v_1 \cdot v_i) + \cdots + c_k (v_k \cdot v_i) = c_i (v_i \cdot v_i).$$
But since $v_i \cdot v_i$ is never zero, it follows that each of the $c$'s equals 0. So $S$ is a linearly independent set, and is therefore a basis for the space it spans. //

Thus, any orthogonal set of nonzero vectors is automatically an orthogonal basis for the space it spans.

Theorem Let $V = \mathrm{Span}\{v_1, v_2, \ldots, v_k\}$ be the subspace of $\mathbf{R}^n$ spanned by an orthogonal set $S = \{v_1, v_2, \ldots, v_k\}$ of nonzero vectors. Then any vector $u$ in $V$ can be represented in terms of the basis $S$ as
$$u = \frac{u \cdot v_1}{v_1 \cdot v_1}\, v_1 + \cdots + \frac{u \cdot v_k}{v_k \cdot v_k}\, v_k.$$

Proof The component of $u$ in the direction of the basis vector $v_i$ is its projection onto $v_i$, namely
$$\mathrm{proj}_{v_i} u = \frac{u \cdot v_i}{v_i \cdot v_i}\, v_i.$$
The result follows. //

The representation given in the last theorem is simplified considerably when, in addition to being orthogonal, the basis $\{v_1, v_2, \ldots, v_k\}$ is orthonormal, i.e., each of the basis vectors has unit length. For then we have $v_i \cdot v_i = \|v_i\|^2 = 1$ and the denominators of the fractions disappear. In terms of an orthonormal basis, vectors in such a space have the simple form
$$u = (u \cdot v_1) v_1 + \cdots + (u \cdot v_k) v_k.$$
Orthonormal sets of vectors can be used to build matrices that are important in many applications of linear algebra.

Theorem An $m \times n$ matrix $U$ has orthonormal columns if and only if $U^T U = I$. (Since $m$ and $n$ need not be equal, this is not equivalent to saying that $U$ is invertible!)
Proof Suppose that $u_1, u_2, \ldots, u_n \in \mathbf{R}^m$ are the columns of $U$, so that $U = [\,u_1 \; u_2 \; \cdots \; u_n\,]$. Then
$$U^T U = \begin{bmatrix} u_1^T \\ u_2^T \\ \vdots \\ u_n^T \end{bmatrix} [\,u_1 \; u_2 \; \cdots \; u_n\,]
= \begin{bmatrix} u_1^T u_1 & u_1^T u_2 & \cdots & u_1^T u_n \\ u_2^T u_1 & u_2^T u_2 & \cdots & u_2^T u_n \\ \vdots & & \ddots & \vdots \\ u_n^T u_1 & u_n^T u_2 & \cdots & u_n^T u_n \end{bmatrix}
= \begin{bmatrix} u_1 \cdot u_1 & u_1 \cdot u_2 & \cdots & u_1 \cdot u_n \\ u_2 \cdot u_1 & u_2 \cdot u_2 & \cdots & u_2 \cdot u_n \\ \vdots & & \ddots & \vdots \\ u_n \cdot u_1 & u_n \cdot u_2 & \cdots & u_n \cdot u_n \end{bmatrix},$$
which equals the identity matrix if and only if $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal set of vectors. //
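To illustrate the theorem, here is a quick numerical check (a sketch in Python with NumPy; the particular $3 \times 2$ matrix is an assumed example, not taken from the text). Its two columns are orthonormal, $U^T U$ is the $2 \times 2$ identity, and yet $U$ is not square, hence not invertible. The last two lines check that lengths and dot products are preserved, anticipating the next theorem.

```python
import numpy as np

# A 3x2 matrix with orthonormal columns (columns chosen by hand for illustration).
U = np.array([[1/np.sqrt(2),  1/np.sqrt(3)],
              [1/np.sqrt(2), -1/np.sqrt(3)],
              [0.0,           1/np.sqrt(3)]])

print(np.round(U.T @ U, 12))     # the 2x2 identity matrix

x = np.array([3.0, -4.0])
y = np.array([1.0,  2.0])
print(np.linalg.norm(U @ x), np.linalg.norm(x))   # equal: lengths are preserved
print((U @ x) @ (U @ y), x @ y)                   # equal: dot products are preserved
```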
Theorem Let $U$ be an $m \times n$ matrix with orthonormal columns. Then for any $x, y \in \mathbf{R}^n$,
(1) $\|Ux\| = \|x\|$;
(2) $(Ux) \cdot (Uy) = x \cdot y$; and
(3) $(Ux) \cdot (Uy) = 0$ if and only if $x \cdot y = 0$.
That is, the transformation $T : \mathbf{R}^n \to \mathbf{R}^m$ with matrix representation $T(x) = Ux$ preserves lengths of vectors and the angle between vectors.

Proof If $U = [\,u_1 \; u_2 \; \cdots \; u_n\,]$, $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$, then
$$\begin{aligned}
(Ux) \cdot (Uy) &= (x_1 u_1 + \cdots + x_n u_n) \cdot (y_1 u_1 + \cdots + y_n u_n) \\
&= (x_1 u_1 + \cdots + x_n u_n) \cdot (y_1 u_1) + \cdots + (x_1 u_1 + \cdots + x_n u_n) \cdot (y_n u_n) \\
&= [x_1 y_1 (u_1 \cdot u_1) + \cdots + x_n y_1 (u_n \cdot u_1)] + \cdots + [x_1 y_n (u_1 \cdot u_n) + \cdots + x_n y_n (u_n \cdot u_n)] \\
&= x_1 y_1 (u_1 \cdot u_1) + \cdots + x_n y_n (u_n \cdot u_n) \\
&= x_1 y_1 + \cdots + x_n y_n = x \cdot y,
\end{aligned}$$
which proves (2). (1) follows, for if we set $y = x$,
$$\|Ux\|^2 = (Ux) \cdot (Ux) = x \cdot x = \|x\|^2 \quad\Rightarrow\quad \|Ux\| = \|x\|.$$
Finally, (3) is an even more immediate consequence of (2). //

Returning to the idea of decomposing a vector with respect to an orthogonal basis, we have the following important generalization:

The Orthogonal Decomposition Theorem Let $V$ be a subspace of $\mathbf{R}^n$. Then every vector $u$ in $\mathbf{R}^n$ has a unique decomposition of the form $u = \hat{u} + w$, where $\hat{u}$ lies in $V$ and $w$ lies in $V^\perp$. If $V$ has an orthogonal basis $\{v_1, v_2, \ldots, v_k\}$, then
$$\hat{u} = \frac{u \cdot v_1}{v_1 \cdot v_1}\, v_1 + \cdots + \frac{u \cdot v_k}{v_k \cdot v_k}\, v_k$$
and so $w = u - \hat{u}$.

Proof The vector
$$\hat{u} = \frac{u \cdot v_1}{v_1 \cdot v_1}\, v_1 + \cdots + \frac{u \cdot v_k}{v_k \cdot v_k}\, v_k$$
certainly lies in $V$. Also, the vector $w = u - \hat{u}$ lies in $V^\perp$ because for every $v_i$,
$$w \cdot v_i = u \cdot v_i - \hat{u} \cdot v_i = u \cdot v_i - \frac{u \cdot v_1}{v_1 \cdot v_1}(v_1 \cdot v_i) - \cdots - \frac{u \cdot v_k}{v_k \cdot v_k}(v_k \cdot v_i) = u \cdot v_i - \frac{u \cdot v_i}{v_i \cdot v_i}(v_i \cdot v_i) = 0,$$
whereby the decomposition $u = \hat{u} + w$ does represent $u$ as a sum of a vector in $V$ and a vector in $V^\perp$. This decomposition is unique, for if there are vectors $\hat{u}' \in V$ and $w' \in V^\perp$ for which $u = \hat{u}' + w'$, then
$$\hat{u} + w = \hat{u}' + w' \quad\Rightarrow\quad \hat{u} - \hat{u}' = w' - w.$$
But the vector on the left side of this last equation lies in $V$ while the vector on the right side lies in $V^\perp$. Thus, it is orthogonal to itself. But since $v \cdot v = 0 \Rightarrow v = 0$, we must have $\hat{u} - \hat{u}' = 0$ and $w' - w = 0$. That is, $\hat{u}' = \hat{u}$ and $w' = w$. //

Notice that the proof of the uniqueness of the decomposition $u = \hat{u} + w$ is independent of the choice of the basis for the space $V$. Thus, despite the formula given in the theorem, the vector $\hat{u}$ does not depend on the choice of basis, only on the space $V$. It makes sense then to use the notation $\mathrm{proj}_V u$ for $\hat{u}$.
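As a concrete illustration of the decomposition (a Python/NumPy sketch; the subspace $V$ and the vector $u$ below are made-up examples), take $V$ to be the plane in $\mathbf{R}^3$ spanned by an orthogonal pair $v_1, v_2$ and compute $\hat{u} = \mathrm{proj}_V u$ and $w = u - \hat{u}$:

```python
import numpy as np

# An orthogonal (not normalized) basis of a plane V in R^3 -- an assumed example.
v1 = np.array([1.0,  1.0, 0.0])
v2 = np.array([1.0, -1.0, 1.0])
print(v1 @ v2)            # 0.0, so {v1, v2} is an orthogonal basis of V

u = np.array([2.0, 3.0, 5.0])

# proj_V u = (u.v1)/(v1.v1) v1 + (u.v2)/(v2.v2) v2
u_hat = (u @ v1) / (v1 @ v1) * v1 + (u @ v2) / (v2 @ v2) * v2
w = u - u_hat

print(u_hat + w)          # recovers u
print(w @ v1, w @ v2)     # both 0: w lies in the orthogonal complement of V
```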
Corollary When the basis $\{u_1, u_2, \ldots, u_k\}$ of $V$ is orthonormal, the projection of $u \in \mathbf{R}^n$ onto $V$ is
$$\mathrm{proj}_V u = (u \cdot u_1) u_1 + \cdots + (u \cdot u_k) u_k.$$
If $U = [\,u_1 \; u_2 \; \cdots \; u_k\,]$, then $\mathrm{proj}_V u = U U^T u$.

Proof The first formula for $\mathrm{proj}_V u$ follows directly from the theorem. To get the second formula, observe that the weights $u \cdot u_1, \ldots, u \cdot u_k$ in the first formula can be written in the form $u_1^T u, \ldots, u_k^T u$. That is, they are the entries of the vector $U^T u$. Consequently,
$$\mathrm{proj}_V u = (u \cdot u_1) u_1 + \cdots + (u \cdot u_k) u_k = [\,u_1 \; u_2 \; \cdots \; u_k\,]\, U^T u = U U^T u,$$
and the second formula follows. //

We mentioned earlier that in $\mathbf{R}^2$, the point determined by the projection of $u$ onto the line $L$ through $v$ is the point on the line closest to that determined by $u$ itself. This has a generalization to higher-dimensional Euclidean space:
The Best Approximation Theorem Let $V$ be a subspace of $\mathbf{R}^n$. Then given any $u$ in $\mathbf{R}^n$, the vector $\hat{u} = \mathrm{proj}_V u$ is the closest point in $V$ to $u$; that is,
$$\|u - \hat{u}\| < \|u - v\|$$
for any $v \in V$ different from $\hat{u}$.

Proof Let $v \in V$ be any vector other than $\hat{u}$. Then $\hat{u} - v$ is a nonzero vector in $V$. But $w = u - \hat{u}$ lies in $V^\perp$, so it is orthogonal to $\hat{u} - v$. Now the sum of these orthogonal vectors is $(u - \hat{u}) + (\hat{u} - v) = u - v$, so by the Pythagorean Theorem,
$$\|u - \hat{u}\|^2 + \|\hat{u} - v\|^2 = \|u - v\|^2.$$
Since $\hat{u} - v$ is nonzero, $\|\hat{u} - v\|^2 > 0$, so
$$\|u - \hat{u}\|^2 < \|u - v\|^2,$$
from which the result follows. //
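The corollary and the Best Approximation Theorem can both be checked numerically. In the sketch below (Python/NumPy; the orthonormal basis is a normalized version of an arbitrary orthogonal pair, chosen only for illustration), $\hat{u} = UU^T u$ is computed and then compared against a few other points of $V$; none of them is closer to $u$.

```python
import numpy as np

# Orthonormal basis of a plane V in R^3 (an illustrative choice).
u1 = np.array([1.0,  1.0, 0.0]) / np.sqrt(2)
u2 = np.array([1.0, -1.0, 1.0]) / np.sqrt(3)
U = np.column_stack([u1, u2])

u = np.array([2.0, 3.0, 5.0])
u_hat = U @ U.T @ u                      # proj_V u = U U^T u

# Best approximation: no other point of V is closer to u than u_hat.
rng = np.random.default_rng(0)
for _ in range(5):
    v = U @ rng.normal(size=2)           # a random point of V
    assert np.linalg.norm(u - u_hat) <= np.linalg.norm(u - v)
print(u_hat)                             # approximately [3.83, 1.17, 1.33]
```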