Chapter 5
Inner products and orthogonality

Inner product spaces, norms, orthogonality, Gram-Schmidt process

Reading

The list below gives examples of relevant reading. (For full publication details, see Chapter 1.)

Leon, S.J., Linear Algebra with Applications. Chapter 5, Sections 5.1, 5.3, 5.5.
Ostaszewski, A., Advanced Mathematical Methods. Sections 2.3, 2.4, 2.7 and 2.8.
Simon, C.P. and Blume, L., Mathematics for Economists. Chapter 10, Section 10.4.

Introduction

In this short chapter we examine more generally the concept of orthogonality, which has already been encountered in our work on orthogonal diagonalisation.

The inner product of two real n-vectors

For $x, y \in \mathbb{R}^n$, the inner product (sometimes called the dot product or scalar product) is defined to be the number $\langle x, y \rangle$ given by
$$\langle x, y \rangle = x^T y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$

Example: If $x = (1, 2, 3)^T$ and $y = (2, -1, 1)^T$, then $\langle x, y \rangle = 1(2) + 2(-1) + 3(1) = 3$.
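Concretely, the standard inner product is the familiar dot product computed by any numerical library. As a quick illustration (assuming Python with NumPy is available), the example above can be checked as follows.

```python
import numpy as np

# The example vectors from the text.
x = np.array([1, 2, 3])
y = np.array([2, -1, 1])

# The standard inner product <x, y> = x^T y = sum of x_i * y_i.
inner = x @ y          # equivalently np.dot(x, y)

print(inner)           # prints 3, as computed in the example
```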

It is important to realise that the inner product is just a number, not another vector or a matrix.

Inner products more generally

Suppose that V is a vector space (over the real numbers). An inner product on V is a mapping from (or operation on) pairs of vectors x, y to the real numbers, the result of which is a real number denoted $\langle x, y \rangle$, which satisfies the following properties:

(i) $\langle x, x \rangle \ge 0$ for all $x \in V$, and $\langle x, x \rangle = 0$ if and only if $x = 0$, the zero vector of the vector space
(ii) $\langle x, y \rangle = \langle y, x \rangle$ for all $x, y \in V$
(iii) $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ for all $x, y, z \in V$ and all $\alpha, \beta \in \mathbb{R}$.

Some other basic facts follow immediately from this definition: for example, $\langle z, \alpha x + \beta y \rangle = \alpha \langle z, x \rangle + \beta \langle z, y \rangle$.

Activity 5.1 Prove this.

It is a simple matter to check that the inner product defined above for real vectors is indeed an inner product according to this more abstract definition, and we shall call it the standard inner product on $\mathbb{R}^n$. The abstract definition, though, applies to more than just the vector space $\mathbb{R}^n$, and there is some advantage in developing results in terms of the general notion of inner product. If a vector space has an inner product defined on it, we refer to it as an inner product space.

Example: Suppose that V is the vector space consisting of all real polynomial functions of degree at most n; that is, V consists of all functions of the form
$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n,$$
where $a_0, a_1, \ldots, a_n \in \mathbb{R}$. The addition and scalar multiplication are, as usual, defined pointwise. Let $x_1, x_2, \ldots, x_{n+1}$ be $n + 1$ fixed, different, real numbers, and define, for $p, q \in V$,
$$\langle p, q \rangle = \sum_{i=1}^{n+1} p(x_i) q(x_i).$$
Then this is an inner product. To see this, we check the properties in the definition of an inner product. Property (ii) is clear. For (i), we have
$$\langle p, p \rangle = \sum_{i=1}^{n+1} p(x_i)^2 \ge 0.$$
Clearly, if p is the zero vector of the vector space (which is the identically-zero function), then $\langle p, p \rangle = 0$. To finish verifying (i) we need to check that if $\langle p, p \rangle = 0$ then p must be the zero function. Now, $\langle p, p \rangle = 0$ must mean that $p(x_i) = 0$ for $i = 1, 2, \ldots, n+1$. So p has $n + 1$ different roots. But p has degree no more than n, so p must be the identically-zero function. (A non-zero polynomial of degree at most n has no more than n distinct roots.) Part (iii) is left to you, in Activity 5.2 below.
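To make this construction less abstract, here is a small numerical sketch of the same inner product (assuming Python with NumPy; the sample points and the two polynomials below are an arbitrary illustrative choice, not ones fixed by the text).

```python
import numpy as np

# Inner product on polynomials of degree at most n, using n + 1 fixed
# distinct sample points x_1, ..., x_{n+1} (chosen arbitrarily here).
n = 2
sample_points = np.array([-1.0, 0.0, 1.0])   # n + 1 = 3 distinct reals

def inner(p, q):
    """<p, q> = sum_i p(x_i) q(x_i), with p and q given as coefficient
    arrays (a_0, a_1, ..., a_n) for a_0 + a_1 x + ... + a_n x^n."""
    p_vals = np.polyval(p[::-1], sample_points)   # np.polyval wants highest power first
    q_vals = np.polyval(q[::-1], sample_points)
    return np.sum(p_vals * q_vals)

p = np.array([1.0, 2.0, 0.0])   # the polynomial 1 + 2x
q = np.array([0.0, 0.0, 1.0])   # the polynomial x^2

# Symmetry (property (ii)) and non-negativity (part of property (i)):
print(inner(p, q), inner(q, p))   # equal values
print(inner(p, p) >= 0)           # True
```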

Activity 5.2 Prove that, for any $\alpha, \beta \in \mathbb{R}$ and any $p, q, r \in V$,
$$\langle \alpha p + \beta q, r \rangle = \alpha \langle p, r \rangle + \beta \langle q, r \rangle.$$

Norms in a vector space

For any x in an inner product space V, the inner product $\langle x, x \rangle$ is non-negative (by definition). Because $\langle x, x \rangle \ge 0$, we may take its square root (obtaining a real number). We define the norm or length $\|x\|$ of a vector x to be
$$\|x\| = \sqrt{\langle x, x \rangle}.$$
For example, for the standard inner product on $\mathbb{R}^n$,
$$\langle x, x \rangle = x_1^2 + x_2^2 + \cdots + x_n^2$$
(which is clearly non-negative since it is a sum of squares), and we obtain the standard Euclidean length of a vector:
$$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$

Orthogonality

Orthogonal vectors

We have already said (in the discussion of orthogonal diagonalisation) what it means for two vectors x, y in $\mathbb{R}^n$ to be orthogonal: it means that $x^T y = 0$. In other words, x, y are orthogonal if $\langle x, y \rangle = 0$. We take this as the general definition of orthogonality in an inner product space:

Definition 5.1 Suppose that V is an inner product space. Then $x, y \in V$ are said to be orthogonal if $\langle x, y \rangle = 0$. We write $x \perp y$ to mean that x, y are orthogonal.

Example: With the usual inner product on $\mathbb{R}^3$, the vectors $x = (1, 1, 0)^T$ and $y = (2, -2, 3)^T$ are orthogonal.

Activity 5.3 Check this!

Geometrical interpretation

A geometrical interpretation can be given to the notion of orthogonality in $\mathbb{R}^n$. Consider a very simple example with n = 2. Suppose that $x = (1, 1)^T$ and $y = (-1, 1)^T$. Then x, y are orthogonal, as is easily seen. We can represent x, y geometrically on the standard two-dimensional (x, y)-plane: x is represented as an arrow from the origin (0, 0) to the point (1, 1), and y is represented as an arrow from the origin to the point (-1, 1). This is shown in the figure. It is clear that these arrows, the geometrical interpretations of x and y, are at right angles to each other: they are perpendicular.

[Figure: the vectors x = (1, 1)^T and y = (-1, 1)^T drawn as arrows from the origin (0, 0), at right angles to each other.]
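These definitions translate directly into computation. The following sketch (assuming Python with NumPy) computes a norm from the inner product and checks the orthogonality example above.

```python
import numpy as np

# The orthogonality example: x = (1, 1, 0)^T and y = (2, -2, 3)^T.
x = np.array([1.0, 1.0, 0.0])
y = np.array([2.0, -2.0, 3.0])

# Norm from the inner product: ||x|| = sqrt(<x, x>).
norm_x = np.sqrt(x @ x)
print(norm_x, np.linalg.norm(x))   # both give sqrt(2)

# x and y are orthogonal precisely when <x, y> = 0.
print(x @ y)                       # 0.0
```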

In fact, this geometrical interpretation is valid in $\mathbb{R}^n$, for any n. This is because it turns out that if $x, y \in \mathbb{R}^n$ then the inner product $\langle x, y \rangle$ equals $\|x\| \, \|y\| \cos\theta$, where $\theta$ is the angle between the geometrical representations of the two vectors. If neither x nor y is the zero vector, then the inner product is 0 if and only if $\cos\theta = 0$, which means that $\theta$ is $\pi/2$ or $3\pi/2$ radians, in which case the angle between the vectors is a right angle.

Orthogonality and linear independence

If a set of (non-zero) vectors are pairwise orthogonal (that is, any two are orthogonal) then it turns out that the vectors are linearly independent:

Theorem 5.1 Suppose that V is an inner product space and that the vectors $v_1, v_2, \ldots, v_k \in V$ are pairwise orthogonal ($v_i \perp v_j$ for $i \ne j$), and none is the zero vector. Then $\{v_1, v_2, \ldots, v_k\}$ is a linearly independent set of vectors.

Proof We need to show that if
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0$$
(the zero vector), then $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$. Let i be any integer between 1 and k. Then
$$\langle v_i, \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k \rangle = \langle v_i, 0 \rangle = 0.$$
But, since $\langle v_i, v_j \rangle = 0$ for $j \ne i$,
$$\langle v_i, \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k \rangle = \alpha_1 \langle v_i, v_1 \rangle + \alpha_2 \langle v_i, v_2 \rangle + \cdots + \alpha_k \langle v_i, v_k \rangle = \alpha_i \langle v_i, v_i \rangle = \alpha_i \|v_i\|^2.$$
So we have $\alpha_i \|v_i\|^2 = 0$. Since $v_i \ne 0$, $\|v_i\|^2 \ne 0$ and hence $\alpha_i = 0$. But i was any integer in the range 1 to k, so we deduce that
$$\alpha_1 = \alpha_2 = \cdots = \alpha_k = 0,$$
as required.
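As a numerical illustration of the angle formula and of Theorem 5.1 (assuming Python with NumPy), the sketch below computes $\cos\theta$ for the $\mathbb{R}^2$ example above and checks that a matrix whose columns are pairwise orthogonal non-zero vectors has full column rank.

```python
import numpy as np

# Angle between the vectors from the R^2 example: x = (1, 1)^T, y = (-1, 1)^T.
x = np.array([1.0, 1.0])
y = np.array([-1.0, 1.0])

cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_theta)                      # 0.0, so the angle is pi/2

# Theorem 5.1 in action: pairwise orthogonal non-zero vectors are linearly
# independent, so a matrix with them as columns has full column rank.
V = np.array([[1.0, -1.0, 0.0],
              [1.0,  1.0, 0.0],
              [0.0,  0.0, 2.0]])      # columns are pairwise orthogonal
print(np.linalg.matrix_rank(V))       # 3
```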

Orthogonal matrices and orthonormal sets

We have already met the word orthogonal in a different context: we spoke of orthogonal matrices when considering orthogonal diagonalisation. Recall that a matrix P is orthogonal if $P^T = P^{-1}$. Now, this means that $P^T P = I$, the identity matrix. Suppose that the columns of P are $x_1, x_2, \ldots, x_n$. Then the fact that $P^T P = I$ means that $x_i^T x_j = 0$ if $i \ne j$ and $x_i^T x_i = 1$. To help see this, consider the case n = 3. Then $P = (x_1 \; x_2 \; x_3)$ and, since $P^T P = I$, we have
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I = P^T P = \begin{pmatrix} x_1^T \\ x_2^T \\ x_3^T \end{pmatrix} (x_1 \; x_2 \; x_3) = \begin{pmatrix} x_1^T x_1 & x_1^T x_2 & x_1^T x_3 \\ x_2^T x_1 & x_2^T x_2 & x_2^T x_3 \\ x_3^T x_1 & x_3^T x_2 & x_3^T x_3 \end{pmatrix}.$$
But, if $i \ne j$, $x_i^T x_j = 0$ means precisely that the columns $x_i, x_j$ are orthogonal. The second statement is that $\|x_i\|^2 = 1$, which means (since $x_i \ne 0$) that $\|x_i\| = 1$; that is, $x_i$ is of length 1. This indicates the following characterisation: a matrix P is orthogonal if and only if, as vectors, its columns are pairwise orthogonal, and each has length 1.

When a set of vectors $\{x_1, x_2, \ldots, x_k\}$ is such that any two are orthogonal and, furthermore, each has length 1, we say that the vectors form an orthonormal set (ONS) of vectors. So we can restate our previous observation as follows.

Theorem 5.2 A matrix P is orthogonal if and only if the columns of P form an orthonormal set of vectors.

The Cauchy-Schwarz inequality

This important inequality is as follows.

Theorem 5.3 (Cauchy-Schwarz inequality) Suppose that V is an inner product space. Then
$$|\langle x, y \rangle| \le \|x\| \, \|y\|$$
for all $x, y \in V$.

Proof Let x, y be any two vectors of V. For any real number $\alpha$, we consider the vector $\alpha x + y$. Certainly, $\|\alpha x + y\|^2 \ge 0$ for all $\alpha$. But
$$\|\alpha x + y\|^2 = \langle \alpha x + y, \alpha x + y \rangle = \alpha^2 \langle x, x \rangle + \alpha \langle x, y \rangle + \alpha \langle y, x \rangle + \langle y, y \rangle = \alpha^2 \|x\|^2 + 2\alpha \langle x, y \rangle + \|y\|^2.$$
Now, this quadratic expression in $\alpha$ is non-negative for all $\alpha$. Generally, we know that if a quadratic expression $az^2 + bz + c$ is non-negative for all z then $b^2 - 4ac \le 0$. Applying this observation, we see that
$$(2\langle x, y \rangle)^2 - 4\|x\|^2 \|y\|^2 \le 0, \quad \text{or} \quad (\langle x, y \rangle)^2 \le \|x\|^2 \|y\|^2.$$

Taking the square root of each side we obtain
$$|\langle x, y \rangle| \le \|x\| \, \|y\|,$$
which is what we need. (Recall that $|\langle x, y \rangle|$ denotes the absolute value of the inner product.)

For example, if we take V to be $\mathbb{R}^n$ and consider the standard inner product on $\mathbb{R}^n$, then for all $x, y \in \mathbb{R}^n$, the Cauchy-Schwarz inequality tells us that
$$\left| \sum_{i=1}^{n} x_i y_i \right| \le \sqrt{\sum_{i=1}^{n} x_i^2} \; \sqrt{\sum_{i=1}^{n} y_i^2}.$$

Generalised Pythagoras theorem

A version of Pythagoras' theorem will no doubt be familiar to almost all of you: namely, that if a is the length of the longest side of a right-angled triangle, and b and c the lengths of the other two sides, then $a^2 = b^2 + c^2$. The generalised Pythagoras theorem is:

Theorem 5.4 (Generalised Pythagoras theorem) In an inner product space V, if $x, y \in V$ are orthogonal, then
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$

Proof This is easy to prove. We know that for any z, $\|z\|^2 = \langle z, z \rangle$, simply from the definition of the norm. So
$$\|x + y\|^2 = \langle x + y, x + y \rangle = \langle x, x + y \rangle + \langle y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle = \|x\|^2 + 2\langle x, y \rangle + \|y\|^2 = \|x\|^2 + \|y\|^2,$$
where the last equality follows from the fact that, x and y being orthogonal, $\langle x, y \rangle = 0$.

We also have the triangle inequality for norms.

Theorem 5.5 (Triangle inequality for norms) In an inner product space V, if $x, y \in V$, then
$$\|x + y\| \le \|x\| + \|y\|.$$

Proof We have
$$\|x + y\|^2 = \langle x + y, x + y \rangle = \langle x, x + y \rangle + \langle y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle = \|x\|^2 + 2\langle x, y \rangle + \|y\|^2 \le \|x\|^2 + \|y\|^2 + 2|\langle x, y \rangle| \le \|x\|^2 + \|y\|^2 + 2\|x\| \, \|y\| = (\|x\| + \|y\|)^2,$$
where the last inequality used is the Cauchy-Schwarz inequality. Thus $\|x + y\| \le \|x\| + \|y\|$, as required.
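Inequalities such as these are easy to sanity-check numerically. The sketch below (assuming Python with NumPy; the random vectors are purely illustrative) tests the Cauchy-Schwarz and triangle inequalities for the standard inner product on $\mathbb{R}^5$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Check Cauchy-Schwarz and the triangle inequality on random vectors in R^5.
for _ in range(1000):
    x = rng.normal(size=5)
    y = rng.normal(size=5)
    assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12
    assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y) + 1e-12

print("Cauchy-Schwarz and the triangle inequality hold on all samples")
```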

Gram-Schmidt orthonormalisation process

The orthonormalisation procedure

Given a set of linearly independent vectors $\{v_1, v_2, \ldots, v_k\}$, the Gram-Schmidt orthonormalisation process is a way of producing k vectors that span the same space as is spanned by $\{v_1, v_2, \ldots, v_k\}$, and that form an orthonormal set. That is, the process produces a set $\{e_1, e_2, \ldots, e_k\}$ such that:

$\mathrm{Lin}\{e_1, e_2, \ldots, e_k\} = \mathrm{Lin}\{v_1, v_2, \ldots, v_k\}$
$\{e_1, e_2, \ldots, e_k\}$ is an orthonormal set.

It works as follows. First, we set
$$e_1 = \frac{v_1}{\|v_1\|}.$$
Then we define
$$u_2 = v_2 - \langle v_2, e_1 \rangle e_1$$
and set
$$e_2 = \frac{u_2}{\|u_2\|}.$$
Next, we define
$$u_3 = v_3 - \langle v_3, e_1 \rangle e_1 - \langle v_3, e_2 \rangle e_2$$
and set
$$e_3 = \frac{u_3}{\|u_3\|}.$$
Generally, when we have $e_1, e_2, \ldots, e_i$, we let
$$u_{i+1} = v_{i+1} - \sum_{j=1}^{i} \langle v_{i+1}, e_j \rangle e_j, \qquad e_{i+1} = \frac{u_{i+1}}{\|u_{i+1}\|}.$$
It turns out that the resulting set $\{e_1, e_2, \ldots, e_k\}$ has the required properties.

Example: In $\mathbb{R}^4$, let us find an orthonormal basis for the linear span of the three vectors
$$v_1 = (1, 1, 1, 1)^T, \quad v_2 = (-1, 4, 4, -1)^T, \quad v_3 = (4, -2, 2, 0)^T.$$

First, we have
$$e_1 = \frac{v_1}{\|v_1\|} = \frac{v_1}{\sqrt{1^2 + 1^2 + 1^2 + 1^2}} = \frac{1}{2} v_1 = (1/2, 1/2, 1/2, 1/2)^T.$$
Next, we have
$$u_2 = v_2 - \langle v_2, e_1 \rangle e_1 = (-1, 4, 4, -1)^T - (3)(1/2, 1/2, 1/2, 1/2)^T = (-5/2, 5/2, 5/2, -5/2)^T,$$
and we set
$$e_2 = \frac{u_2}{\|u_2\|} = (-1/2, 1/2, 1/2, -1/2)^T.$$
(Note: to do this last step, I merely noted that a normalised vector in the same direction as $u_2$ is also a normalised vector in the same direction as $(-1, 1, 1, -1)^T$, and this second vector is easier to work with.)

Continuing, we have
$$u_3 = v_3 - \langle v_3, e_1 \rangle e_1 - \langle v_3, e_2 \rangle e_2 = (4, -2, 2, 0)^T - (2)(1/2, 1/2, 1/2, 1/2)^T - (-2)(-1/2, 1/2, 1/2, -1/2)^T = (2, -2, 2, -2)^T.$$
Then
$$e_3 = \frac{u_3}{\|u_3\|} = (1/2, -1/2, 1/2, -1/2)^T.$$
So
$$\{e_1, e_2, e_3\} = \left\{ \begin{pmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{pmatrix}, \begin{pmatrix} -1/2 \\ 1/2 \\ 1/2 \\ -1/2 \end{pmatrix}, \begin{pmatrix} 1/2 \\ -1/2 \\ 1/2 \\ -1/2 \end{pmatrix} \right\}.$$

Activity 5.4 Verify that the set $\{e_1, e_2, e_3\}$ of this example is an orthonormal set.

Orthogonal diagonalisation when eigenvalues are not distinct

We have seen in an earlier chapter that if a symmetric matrix has distinct eigenvalues, then (since eigenvectors corresponding to different eigenvalues are orthogonal) it is orthogonally diagonalisable. But, in fact, n × n symmetric matrices are always orthogonally diagonalisable, even if they do not have n distinct eigenvalues. What we need for orthogonal diagonalisation is an orthonormal set of n eigenvectors. If it so happens that there are n different eigenvalues, then any set of n corresponding eigenvectors forms a pairwise orthogonal set of vectors, and all we need do to transform the set into an orthonormal set is normalise each vector. However, if we have repeated eigenvalues, more care is required.

Suppose that $\lambda_0$ is a repeated eigenvalue of A, by which we mean that, for some $k \ge 2$, $(\lambda - \lambda_0)^k$ is a factor of the characteristic polynomial of A. The multiplicity of $\lambda_0$ is the largest k for which this is the case. The eigenspace corresponding to $\lambda_0$ is
$$E(\lambda_0) = \{x : (A - \lambda_0 I)x = 0\},$$
the subspace consisting of all eigenvectors corresponding to $\lambda_0$, together with the zero vector 0. An important fact, which we shall not prove here, is that, if A is symmetric, the dimension of $E(\lambda_0)$ is exactly the multiplicity k of $\lambda_0$. This means that there is some basis $\{x_1, x_2, \ldots, x_k\}$ of size k of the eigenspace $E(\lambda_0)$. We can use the Gram-Schmidt orthonormalisation process to produce an orthonormal basis of $E(\lambda_0)$. Eigenvectors from different eigenspaces are orthogonal (and hence linearly independent). So if we compose a set of n vectors by taking orthonormal bases for each of the eigenspaces, the resulting set is orthonormal, and we can orthogonally diagonalise the matrix A by means of the matrix P with these vectors as its columns.
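As a computational illustration (assuming Python with NumPy), routines such as numpy.linalg.eigh return an orthonormal set of eigenvectors for a symmetric matrix even when eigenvalues are repeated; the matrix A below is an arbitrary symmetric example with a repeated eigenvalue, not one taken from the text.

```python
import numpy as np

# A symmetric matrix with a repeated eigenvalue: its eigenvalues are 4, 1, 1.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

# eigh returns the eigenvalues and an orthonormal set of eigenvectors,
# placed as the columns of P.
eigenvalues, P = np.linalg.eigh(A)
print(eigenvalues)                          # approximately [1. 1. 4.]

# P is orthogonal: P^T P = I ...
print(np.allclose(P.T @ P, np.eye(3)))      # True

# ... and P^T A P is the diagonal matrix of eigenvalues.
print(np.allclose(P.T @ A @ P, np.diag(eigenvalues)))   # True
```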

Learning outcomes

At the end of this chapter and the relevant reading, you should be able to:

explain what is meant by an inner product on a vector space
verify that a given inner product is indeed an inner product
compute norms in inner product spaces
explain why orthogonality of a set of vectors implies linear independence
explain what is meant by an orthonormal set of vectors
explain why an n × n matrix can be orthogonally diagonalised if and only if it possesses an orthonormal set of n eigenvectors
know and apply the Cauchy-Schwarz inequality, the generalised Pythagoras theorem, and the triangle inequality for norms
use the Gram-Schmidt orthonormalisation process

Sample examination questions

The following are typical exam questions, or parts of questions.

Question 5.1 Let V be the vector space of all m × n real matrices (with matrix addition and scalar multiplication). Define, for $A = (a_{ij})$ and $B = (b_{ij})$ in V,
$$\langle A, B \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij}.$$
Prove that this is an inner product on V.

Question 5.2 Prove that in any inner product space V,
$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2,$$
for all $x, y \in V$.

Question 5.3 Suppose that $v \in \mathbb{R}^n$. Prove that the set of vectors orthogonal to v,
$$W = \{x \in \mathbb{R}^n : x \perp v\},$$
is a subspace of $\mathbb{R}^n$. How would you describe this subspace geometrically? More generally, suppose that S is any (not necessarily finite) set of vectors in $\mathbb{R}^n$ and let $S^\perp$ denote the set
$$S^\perp = \{x \in \mathbb{R}^n : x \perp v \text{ for all } v \in S\}.$$
Prove that $S^\perp$ is a subspace of $\mathbb{R}^n$.

Question 5.4 Use the Gram-Schmidt process to find an orthonormal basis for the subspace of $\mathbb{R}^4$ spanned by the vectors
$$v_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ 2 \\ 1 \\ -1 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 0 \\ 1 \\ 2 \\ 0 \end{pmatrix}.$$

Sketch answers or comments on selected questions

Question 5.1 Property (i) of the definition of inner product is easy to check:
$$\langle A, A \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \ge 0,$$
and this equals zero if and only if, for every i and every j, $a_{ij} = 0$, which means that A is the zero matrix, which in this vector space is the zero vector. Property (ii) is easy to verify, as also is (iii).

Question 5.2 We have:
$$\|x + y\|^2 + \|x - y\|^2 = \langle x + y, x + y \rangle + \langle x - y, x - y \rangle = \langle x, x \rangle + 2\langle x, y \rangle + \langle y, y \rangle + \langle x, x \rangle - 2\langle x, y \rangle + \langle y, y \rangle = 2\langle x, x \rangle + 2\langle y, y \rangle = 2\|x\|^2 + 2\|y\|^2.$$

Question 5.3 Suppose $x, y \in W$ and $\alpha, \beta \in \mathbb{R}$. Because $x \perp v$ and $y \perp v$, we have (by definition) $\langle x, v \rangle = \langle y, v \rangle = 0$. Therefore,
$$\langle \alpha x + \beta y, v \rangle = \alpha \langle x, v \rangle + \beta \langle y, v \rangle = \alpha(0) + \beta(0) = 0,$$
and hence $\alpha x + \beta y \perp v$; that is, $\alpha x + \beta y \in W$. Therefore W is a subspace. In fact, W is the set $\{x : \langle x, v \rangle = 0\}$, which is the hyperplane through the origin with normal vector v. (Hyperplanes are discussed again in a later chapter.) We omit the proof that $S^\perp$ is a subspace. This is a standard result, which can be found in the texts: $S^\perp$ is known as the orthogonal complement of S.

Question 5.4 To start with, $e_1 = v_1/\|v_1\| = (1/\sqrt{2})(1, 0, 1, 0)^T$. Then we let
$$u_2 = v_2 - \langle v_2, e_1 \rangle e_1 = \begin{pmatrix} 1 \\ 2 \\ 1 \\ -1 \end{pmatrix} - \sqrt{2}\, e_1 = \begin{pmatrix} 0 \\ 2 \\ 0 \\ -1 \end{pmatrix}.$$
Then
$$e_2 = \frac{u_2}{\|u_2\|} = \frac{1}{\sqrt{5}} \begin{pmatrix} 0 \\ 2 \\ 0 \\ -1 \end{pmatrix}.$$

Next,
$$u_3 = v_3 - \langle v_3, e_2 \rangle e_2 - \langle v_3, e_1 \rangle e_1 = (-1, 1/5, 1, 2/5)^T.$$
Normalising $u_3$ we obtain
$$e_3 = \frac{1}{\sqrt{55}} (-5, 1, 5, 2)^T.$$
The required basis is $\{e_1, e_2, e_3\}$.
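Answers to Gram-Schmidt computations like this one are easy to verify numerically. The sketch below (assuming Python with NumPy) implements the procedure described in this chapter and reproduces the worked example from the section on the orthonormalisation procedure; Question 5.4 can be checked in the same way.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent vectors.

    Implements the procedure from the chapter: subtract from each v_{i+1}
    its components along the e_j found so far, then normalise.
    """
    es = []
    for v in vectors:
        u = v.astype(float)
        for e in es:
            u -= (v @ e) * e          # remove the component along e
        es.append(u / np.linalg.norm(u))
    return es

# The worked example from this chapter, in R^4.
v1 = np.array([1, 1, 1, 1])
v2 = np.array([-1, 4, 4, -1])
v3 = np.array([4, -2, 2, 0])

e1, e2, e3 = gram_schmidt([v1, v2, v3])
print(e1)   # [ 0.5  0.5  0.5  0.5]
print(e2)   # [-0.5  0.5  0.5 -0.5]
print(e3)   # [ 0.5 -0.5  0.5 -0.5]

# Check orthonormality: E^T E should be the identity matrix.
E = np.column_stack([e1, e2, e3])
print(np.allclose(E.T @ E, np.eye(3)))   # True
```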