Regression With Gaussian Measures

Regression With Gaussian Measures

Michael J. Meyer

Copyright © April 11, 2004

PREFACE

We treat the basics of Gaussian processes, Gaussian measures, reproducing kernel Hilbert spaces and related topics. All mathematical details are included and every effort is made to keep this as self-contained as possible. Only elementary Hilbert space theory and integration theory as well as basic results from probability theory are assumed.

This is a work in progress and has been written up in haste. Undoubtedly there are mistakes. Please email me at spyqqqdia@yahoo.com if you find mistakes or have suggestions.

Michael J. Meyer
April 11, 2004

Contents

1 Introduction

2 Operators on Hilbert Space
    2.1 Hilbert space basics
    2.2 Adjoint operator
    2.3 Selfadjoint and positive operators
    2.4 Compact operators between Banach spaces
    2.5 Compact selfadjoint operators
    2.6 Compact operators between Hilbert spaces
    2.7 Hilbert-Schmidt and trace class operators
    2.8 Inverse problems and regularization
        2.8.1 Regularization
    Kernels and integral operators
        Symmetric kernels
        L²-bounded kernels

3 Reproducing Kernel Hilbert Spaces
    Positive semidefinite kernels
    Translation invariant kernels
    Reproducing kernel Hilbert spaces
    Bilinear kernel expansion
    Characterization of functions in H_K
    Kernel domination
    Approximation in reproducing kernel Hilbert spaces
    Orthonormal bases
    Second description of H

4 Gaussian Measures
    4.1 Probability measures in Hilbert space
    4.2 Gaussian measures on Hilbert space
    Cameron-Martin space
    Regression with Gaussian measures
    Model choices

5 Square Integrable Processes
    Integrable processes
    Processes with sample paths in an RKHS

6 Gaussian random fields
    Definition and construction
    Construction of Gaussian random fields

A Vector Valued Integration
B Conditioning of multinormal Random Vectors
C Orthogonal polynomials
    C.0.2 Legendre polynomials

Chapter 1

Introduction

We will freely use terminology which will be defined later. Let F be a nonempty set and f : F → R a real valued function on F. Consider the following problem: we have observed the values of f at some points x_1, ..., x_n ∈ F as

    y_j = f(x_j),  j = 1, ..., n,    (1.1)

and from this we want to estimate f itself.

We will follow a Bayesian approach. It is assumed that the function f belongs to a real vector space H of functions on F. A prior probability P is placed on H, and the regressor f̂ (the estimate of f in light of the data) is computed as the mean of P conditioned on the data (1.1). The probability P is defined on the σ-field E generated by the continuous linear functionals on H. If I : (H, E, P) → H denotes the H-valued random variable defined as I(f) = f (the identity on H), then the mean of the distribution P on H is the expectation E_P[I] of I under P, that is, the H-valued integral

    E_P[I] = ∫_H I dP = ∫_H f P(df).    (1.2)

Do not worry if this sounds needlessly abstract, since it is not how things are handled in practice. It merely serves to motivate the procedures below. The vector valued integral (1.2) commutes with all continuous linear functionals Λ on H, that is,

    Λ(E_P[I]) = E_P(Λ ∘ I) = ∫_H Λ(f) P(df),

and the same holds true if the ordinary expectation is replaced with a conditional expectation.

The regressor f̂ is the conditional expectation, and so we have

    f̂ = E_P[I | data],    (1.3)
    Λ(f̂) = E_P[Λ | data],    (1.4)

for each continuous linear functional Λ on H (note that Λ ∘ I = Λ). Thus rather than computing the regressor f̂ globally as in (1.3), we compute Λ(f̂) for enough continuous linear functionals Λ on H to obtain a good view of f̂.

For each x ∈ F let E_x : f ∈ H ↦ f(x) ∈ R denote the evaluation functional at the point x. If Λ = E_x, then Λ(f̂) = f̂(x) is our prediction for the value of f at the point x in light of the data (1.1). Note that the data themselves can be written in terms of the evaluation functionals as

    E_j(f) = y_j,  1 ≤ j ≤ n,    (1.5)

where E_j = E_{x_j} is the evaluation functional at the point x_j. With this the regressor f̂ becomes the conditional expectation

    f̂ = E_P[I | E_j = y_j, j ≤ n],
    Λ(f̂) = E_P[Λ | E_j = y_j, j ≤ n],    (1.6)

for each continuous linear functional Λ on H. To make this feasible we have to assume that the evaluation functionals E_x, x ∈ F, are continuous on H.

The computation of (1.6) involves only the finite dimensional distribution of the random vector W = (E_1, ..., E_n, Λ) on R^{n+1} under the probability P. Note that each continuous linear functional on H is a random variable on the probability space (H, E, P). The measure P is called a Gaussian measure on H if every continuous linear functional Λ on H is a normal random variable under P. In this case the distribution of the vector W is automatically Gaussian (multinormal) on R^{n+1} and the computation of the conditional expectation (1.6) involves merely routine computations with the multinormal density.
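A minimal numerical sketch of this routine computation (not from the text; the Gaussian kernel and the data below are illustrative assumptions). Under a zero-mean Gaussian measure the vector W = (E_1, ..., E_n, Λ) is multinormal, so for Λ = E_x the prediction f̂(x) is the conditional mean of one multinormal coordinate given the others:

    import numpy as np

    # Conditioning a zero-mean multinormal vector as in (1.6): with
    # Cov(E_i, E_j) = K(x_i, x_j) and Lambda = E_x the conditional mean is
    # E_P[Lambda | E = y] = c^t Sigma^{-1} y.  The kernel K below is an
    # assumed example choice; any positive semidefinite kernel works.

    def K(s, t, scale=1.0):
        return np.exp(-0.5 * ((s - t) / scale) ** 2)

    def conditional_mean(x_data, y_data, x_new, jitter=1e-10):
        Sigma = K(x_data[:, None], x_data[None, :])   # Cov(E_i, E_j)
        c = K(x_data, x_new)                          # Cov(E_j, Lambda)
        alpha = np.linalg.solve(Sigma + jitter * np.eye(len(x_data)), y_data)
        return c @ alpha

    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.sin(x)
    print(conditional_mean(x, y, 1.5))   # prediction for f(1.5)

The jitter term only stabilizes the linear solve; the conditioning itself is the routine multinormal computation mentioned above (see also Appendix B).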

We have chosen the particular form (1.1) for the data because this is the standard in regression problems. Note however that our approach applies to all forms of data and predictions which can be articulated in terms of events involving finitely many continuous linear functionals on H.

Regression with Gaussian processes assumes that f is the trajectory of a Gaussian process Z = Z(x) on F. The mean of the process is assumed to be zero, and thus the process Z is completely determined by its covariance function K(x, y), which is a symmetric positive semidefinite kernel on F. The kernel K : F × F → R is a parameter of the regression procedure. The space H is the product space H = R^F of all functions f : F → R, and the probability P is the distribution of Z on H. Kolmogorov's existence theorem for product measures guarantees the existence of the probability P on H for every symmetric, positive semidefinite kernel K on F.

The space H = R^F is a topological vector space with only one redeeming quality: the evaluation functionals are the coordinate functionals and hence continuous in the product topology on H. Unfortunately there are essentially no other continuous linear functionals on H: every continuous linear functional on H is a finite linear combination of coordinate functionals. Consequently this setup limits us to data presented in the form (1.1) and consequent predictions of values f(x) at other points x ∈ F in a point by point fashion. There are other disadvantages. For example, it requires a substantial effort to extract properties of the admissible functions f, that is, the trajectories of the Gaussian process Z, from properties of the covariance kernel K, and the resulting properties are often weaker than desired.

Consequently we take a slightly different approach. We assume instead that f is an element of a separable Hilbert space H of functions on F. P is a Gaussian measure on H defined in terms of an orthonormal basis {ψ_j} of H and a sequence (σ_j) of positive numbers (which diagonalize the covariance operator Q of P below). We can then proceed as above provided that the evaluation functionals are continuous on H. But we also have other options. The data and predictions can be articulated in any fashion which uses only finitely many continuous linear functionals Λ on H. Point estimates are one possibility. Another possibility is the coefficients Λ_k(f) = (f, ψ_k) of f in the expansion f = Σ_j (f, ψ_j)ψ_j in the basis {ψ_j} of H.
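A hedged illustration of this setup (the basis, the weights σ_j and the truncation level are assumptions chosen for the example, not prescriptions of the text): a random element f of H with distribution P can be simulated by drawing the coefficients in the basis {ψ_j} independently.

    import numpy as np

    # Draw f = sum_j sigma_j Z_j psi_j with Z_j independent standard normal.
    # Example choices (assumptions): H = L^2[0, 2 pi] with the orthonormal
    # sine basis psi_j(t) = sin(j t)/sqrt(pi) and weights sigma_j = 1/j,
    # truncated at J terms.  Square summable weights make the covariance
    # operator Q trace class (see Chapter 4).

    rng = np.random.default_rng(0)
    t = np.linspace(0, 2 * np.pi, 400)
    J = 50

    f = np.zeros_like(t)
    for j in range(1, J + 1):
        psi_j = np.sin(j * t) / np.sqrt(np.pi)   # orthonormal in L^2[0, 2 pi]
        f += (1.0 / j) * rng.standard_normal() * psi_j

    # f now holds one sample path of the measure P evaluated on the grid t.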

Here we had to assume that the evaluation functionals are continuous on H. A Hilbert space of functions on F with this property is called a reproducing kernel Hilbert space on F. Such a Hilbert space H defines a unique symmetric, positive semidefinite kernel K : F × F → R. Conversely, every symmetric, positive semidefinite kernel K : F × F → R determines a unique reproducing kernel Hilbert space. There is an interesting interplay between orthonormal bases of H and the kernel K.

A basic question is how to find an orthonormal basis for H. If F ⊆ R^d is compact and K is continuous, then we have additional structure in the form of the Euclidean topology and Lebesgue measure on F. Associated with the kernel K we have the integral operator T : L²(F) → L²(F) defined by

    (Tf)(x) = ∫_F K(x, y)f(y) dy,  f ∈ L²(F), x ∈ F,

where dy denotes Lebesgue measure on F. It turns out that T is a Hilbert-Schmidt operator. Consequently the orthogonal complement of the null space of T has an orthonormal basis {φ_j} consisting of eigenvectors of T. Let λ_j denote the corresponding eigenvalues. Then the functions ψ_j = √λ_j φ_j are an orthonormal basis for the reproducing kernel Hilbert space H with kernel K. This establishes the connection to the spectral theory of compact, selfadjoint operators on a Hilbert space.

There is another connection. For f ∈ H let Λ_f be the bounded linear functional Λ_f(h) = (h, f) on H. The Gaussian measure P on H defines a unique bounded linear operator Q : H → H such that the covariances of the random variables Λ_f, Λ_g are given as

    Cov_P(Λ_f, Λ_g) = (Qf, g)_H,  f, g ∈ H.    (1.7)

The operator Q is a positive trace class operator. Conversely, for every positive trace class operator Q : H → H there exists a unique Gaussian measure P on H such that (1.7) holds. Thus the material presents an interesting interaction of functional analysis and probability theory.

If you are only interested in the regression problem you need only read Chapter 2, Chapter 3, sections 1-4, 7, 8, and Chapter 4, sections 1, 2, 4.
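The basis recipe above is easy to carry out numerically. A sketch (the kernel min(s, t), the grid, and the Nyström discretization are illustrative assumptions, not part of the text):

    import numpy as np

    # Nystrom approximation of (Tf)(x) = integral of K(x,y) f(y) dy on
    # F = [0,1]: eigenvectors of the scaled kernel matrix approximate the
    # eigenfunctions phi_j, and psi_j = sqrt(lambda_j) phi_j approximate an
    # orthonormal basis of the reproducing kernel Hilbert space H.

    n = 200
    x = (np.arange(n) + 0.5) / n                 # grid on F = [0, 1]
    h = 1.0 / n
    Kmat = np.minimum(x[:, None], x[None, :])    # example kernel K(s,t) = min(s,t)

    lam, U = np.linalg.eigh(h * Kmat)            # discretized operator T
    lam, U = lam[::-1], U[:, ::-1]               # eigenvalues in decreasing order
    phi = U / np.sqrt(h)                         # L^2-normalized eigenfunctions
    psi = np.sqrt(np.clip(lam, 0, None)) * phi   # psi_j = sqrt(lambda_j) phi_j

    print(lam[:4])   # for min(s,t): lambda_j is approx ((j - 1/2) pi)^(-2)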

Chapter 2

Operators on Hilbert Space

In this chapter we develop the spectral theory of compact operators between Hilbert spaces. Our scalars are the reals, that is, we consider only real Hilbert spaces.

2.1 Hilbert space basics

We review the basics of Hilbert space theory. Let H be a (real) Hilbert space with inner product (·, ·). Let

    H_1 = { x ∈ H : ‖x‖ ≤ 1 }

denote the closed unit ball in H and

    S_1(H) = { x ∈ H : ‖x‖ = 1 }

the unit sphere in H. For vectors x, y ∈ H we write x ⊥ y (orthogonal) if (x, y) = 0. For subsets A, B of H we write A ⊥ B if a ⊥ b for all a ∈ A and b ∈ B. We let

    A^⊥ := { x ∈ H : x ⊥ a, for all a ∈ A }.

Then A^⊥ is a closed subspace of H. If V is a closed subspace of H, then H = V + V^⊥; in particular every closed subspace of H is complemented in H. This is the first fundamental fact about Hilbert spaces. Each element x ∈ H has a unique decomposition x = v + v^⊥ with v ∈ V and v^⊥ ∈ V^⊥. We have

    ‖x‖² = ‖v‖² + ‖v^⊥‖²

(the Law of Pythagoras). The map x ↦ v is called the perpendicular projection onto the subspace V and is denoted π_V. If (φ_j) is an ON-basis of V, then

    π_V(x) = Σ_j (x, φ_j)φ_j,  x ∈ H.    (2.1)

The second fundamental property of a Hilbert space H is the fact that the continuous linear functionals on H can be identified with the elements of H: if a ∈ H, then Λ_a : x ∈ H ↦ (x, a) ∈ R defines a continuous linear functional on H. The converse is also true: every continuous linear functional on H has this form (Riesz Representation Theorem).

Bilinear forms. Let X and Y be Hilbert spaces. A function ψ = ψ(x, y) : X × Y → R is called a bilinear form if it is linear in both variables x and y. The bilinear form ψ is called continuous if

    ‖ψ‖ = sup{ |ψ(x, y)| : ‖x‖_X ≤ 1, ‖y‖_Y ≤ 1 } < ∞.    (2.2)

In this case |ψ(x, y)| ≤ ‖ψ‖ ‖x‖ ‖y‖, for all x ∈ X and y ∈ Y. Note that the closed unit balls X_1, Y_1 can be replaced with the unit spheres S_1(X), S_1(Y) with no effect on the definition of the norm of ψ. If A : X → Y is a bounded linear operator, then ψ(x, y) = (Ax, y) defines a continuous bilinear form on X × Y with ‖ψ‖ = ‖A‖. Conversely

Theorem (Lax-Milgram). Let ψ = ψ(x, y) be a continuous bilinear form on X × Y. Then there exists a bounded linear operator A : X → Y such that ψ(x, y) = (Ax, y)_Y, for all x ∈ X and y ∈ Y.

Proof. Fix x ∈ X. Then Λ_x(y) = ψ(x, y) is a continuous linear functional on Y. By the Riesz Representation Theorem there exists an element a ∈ Y with Λ_x(y) = (a, y)_Y, for all y ∈ Y. Clearly a is uniquely determined by x. Write a = Ax. This defines a map A : X → Y which satisfies ψ(x, y) = (Ax, y). The uniqueness of a and the linearity of ψ in the first argument imply that the map A is linear. The continuity of ψ implies that A is continuous.

If X = Y = H, then a bilinear form ψ = ψ(x, y) on X × Y is called a bilinear form on H. Such a bilinear form is called symmetric if it satisfies ψ(x, y) = ψ(y, x), for all x, y ∈ H. In this case

Proposition. Let ψ = ψ(x, y) be a symmetric bilinear form on H. Then

    ‖ψ‖ = sup{ |ψ(x, x)| : ‖x‖ ≤ 1 }.    (2.3)

Proof. Let C denote the right hand side of (2.3). Obviously C ≤ ‖ψ‖ and we have to show only the reverse inequality. Write φ(x) = ψ(x, x). By homogeneity |φ(u)| ≤ C‖u‖², for all u ∈ H. Using the symmetry of ψ we can write

    ψ(x, y) = φ((x + y)/2) − φ((x − y)/2).

Recall that H_1 denotes the closed unit ball in H. If x, y ∈ H_1, then the parallelogram law yields ‖(x + y)/2‖² + ‖(x − y)/2‖² = (‖x‖² + ‖y‖²)/2 ≤ 1, and it follows that

    |ψ(x, y)| ≤ C‖(x + y)/2‖² + C‖(x − y)/2‖² ≤ C.

Taking the sup over all x, y ∈ H_1 now yields ‖ψ‖ ≤ C.

2.2 Adjoint operator

The Lax-Milgram theorem can be used to show the existence of the adjoint operator. Let X, Y be Hilbert spaces and T : X → Y a bounded linear operator. Then ψ(y, x) = (y, Tx)_Y is a continuous bilinear form on Y × X. Consequently there exists a bounded linear operator T* : Y → X such that ψ(y, x) = (T*y, x)_X, for all y ∈ Y and x ∈ X. It is easy to see that the operator T* is uniquely determined by its defining property

    (Tx, y) = (x, T*y),  x ∈ X, y ∈ Y.

Obviously T** = T. We note the following

Proposition 2.2.1 We have
(i) N(T*T) = N(T).
(ii) N(T*) = R(T)^⊥.
(iii) N(T) = R(T*)^⊥.

Proof. (i) If Tx = 0, then T*Tx = 0. Conversely, if T*Tx = 0, then ‖Tx‖² = (Tx, Tx) = (T*Tx, x) = 0, thus x ∈ N(T).
(ii) Let w ∈ N(T*) and y = Tx for some x ∈ X. Then (y, w) = (x, T*w) = 0. Thus w ∈ R(T)^⊥. Conversely, if w ∈ R(T)^⊥, then (T*w, x) = (w, Tx) = 0, for all x ∈ X. This implies T*w = 0 (let x = T*w) and so w ∈ N(T*).
Now (iii) follows from this. Replace T with T* and note that T** = T.

Remark. By taking orthogonal complements in (ii) and (iii) we obtain R(T) ⊆ N(T*)^⊥ and R(T*) ⊆ N(T)^⊥, but we will not have equality in general since R(T) and R(T*) need not be closed.
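In finite dimensions the adjoint is the transpose, and Proposition 2.2.1 can be checked numerically. A small sketch (the random rank-deficient matrix is an arbitrary choice):

    import numpy as np

    # Check N(T*) = R(T)^perp for a matrix T, where T* = T^t: the left
    # singular vectors with nonzero singular value span R(T), and the
    # remaining ones span N(T^t).

    rng = np.random.default_rng(1)
    T = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 7))   # rank 3

    U, s, Vt = np.linalg.svd(T)
    r = int((s > 1e-10).sum())       # numerical rank
    null_Tt = U[:, r:]               # ON-basis of N(T^t), dimension 5 - r

    print(np.allclose(T.T @ null_Tt, 0.0))   # columns orthogonal to R(T)
    print(r, null_Tt.shape[1])               # rank 3, codimension 2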

For any subset A ⊆ X we have A^⊥ = cl(A)^⊥, where cl(A) denotes the closure of A. Thus (ii) can be written as N(T*) = cl(R(T))^⊥. Note that this implies that T* is one to one on the closure cl(R(T)) of the range of T.

2.3 Selfadjoint and positive operators

A bounded linear operator T on H is called selfadjoint if it satisfies

    (Tx, y) = (x, Ty),    (2.4)

for all x, y ∈ H. In this case the nullspace N(T) = { x ∈ H : Tx = 0 } satisfies N(T) = R(T)^⊥. The converse R(T) = N(T)^⊥ is not true in general, simply because the range R(T) will not in general be closed.

The number λ is called an eigenvalue of T if there is a nonzero vector x ∈ H with Tx = λx, that is, x ∈ N(T − λI), where I is the identity operator on H. We let

    E_λ(T) := N(T − λI) = { x ∈ H : Tx = λx }

denote the eigenspace associated with the eigenvalue λ. Obviously this space is defined whether or not λ is an eigenvalue of T. It is an eigenvalue if and only if E_λ(T) ≠ {0}. The nonzero elements of E_λ(T) are called the eigenvectors associated with the eigenvalue λ.

Proposition 2.3.1 Let T be a selfadjoint operator on H. Then λ ≠ µ implies E_λ(T) ⊥ E_µ(T); in other words, eigenvectors with respect to different eigenvalues are perpendicular to each other.

Proof. Assume that Tx = λx and Ty = µy. Then λ(x, y) = (Tx, y) = (x, Ty) = µ(x, y). Since λ ≠ µ this implies that (x, y) = 0.

If λ = 0 then the eigenspace E_λ(T) is simply the nullspace N(T), and λ = 0 is an eigenvalue of T if and only if T has a nontrivial nullspace. If T is selfadjoint this eigenspace is perpendicular to the range R(T), and so no eigenvector associated with the eigenvalue zero is in the range of T. By contrast, if λ ≠ 0, then E_λ(T) ⊆ R(T), since every eigenvector associated with λ satisfies x = λ⁻¹Tx.

A subspace V ⊆ H is called T-invariant if it satisfies T(V) ⊆ V. In this case the restriction of T to V is a linear operator on V.

Proposition 2.3.2 Let T be a selfadjoint operator on H and V ⊆ H a T-invariant subspace. Then the orthogonal complement V^⊥ is also T-invariant.

Proof. Let x ∈ V^⊥. Then for all y ∈ V we have (Tx, y) = (x, Ty) = 0, since Ty ∈ V. Thus Tx ∈ V^⊥.

Assume that V is a closed T-invariant subspace, write H = V + V^⊥ and let T_1, T_2 denote the restrictions of T to V respectively V^⊥. Then T = T_1 π_V + T_2 π_{V^⊥}, where π_V, π_{V^⊥} are the orthogonal projections onto the subspaces V, V^⊥. Thus the restrictions T_1, T_2 completely determine the operator T.

Every eigenspace E_λ(T) of T, and in particular the null space N(T), is T-invariant. Write H = N(T) + W, where W = N(T)^⊥. Then the restriction of T to W is a linear operator on W and obviously this restriction completely determines the operator T (since the restriction of T to its null space is simply zero). Thus we will often be able to disregard the eigenvectors associated with the eigenvalue zero, that is, the eigenvectors in the nullspace of T.

Proposition 2.3.3 If the operator T on H is selfadjoint, then

    ‖T‖ = sup{ |(Tx, x)| : ‖x‖ = 1 }.    (2.5)

Proof. Clearly it will suffice to show (2.5) with ‖x‖ = 1 replaced with ‖x‖ ≤ 1. Set ψ(x, y) = (x, Ty). Then ψ is a bilinear form with ‖ψ‖ = ‖T‖. Since T is selfadjoint, ψ is symmetric. Now apply (2.3).

Positive operators. A bounded linear operator A on H is called positive if it satisfies (Ax, x) ≥ 0, for all x ∈ H. If strict inequality holds for all nonzero x, then A is called strictly positive. For example, if X and Y are Hilbert spaces and T : X → Y a bounded linear operator, then the operator A = T*T on X is positive:

    (Ax, x) = (T*Tx, x) = (Tx, Tx) = ‖Tx‖² ≥ 0.

Proposition 2.3.4 If the operator A on H is positive, then every eigenvalue λ of A satisfies λ ≥ 0.

Proof. Let x be an eigenvector with eigenvalue λ. Then λ‖x‖² = λ(x, x) = (Ax, x) ≥ 0.

Proposition 2.3.5 If the operator A on H is positive, then the operator αI + A has a bounded inverse on all of H, for each α > 0.

Proof. Let α > 0 and set T = αI + A. Then, for each x ∈ H we have

    ‖Tx‖² = α²‖x‖² + 2α(Ax, x) + ‖Ax‖² ≥ α²‖x‖².

It follows that T is one to one and has closed range. Moreover T is selfadjoint. Thus R(T)^⊥ = N(T) = {0}, and so T has dense range. It follows that R(T) = H and T has an inverse T⁻¹ : H → H as a linear map. The inverse is bounded since ‖Tx‖ ≥ α‖x‖ implies that ‖T⁻¹y‖ ≤ α⁻¹‖y‖.

We will also need the following result:

Proposition 2.3.6 If the operator A on H is positive, then there exists a unique positive operator S on H such that A = S². The operator S is called the (positive) square root of A and is denoted S = √A.

The existence of S is a special case of the so called continuous functional calculus, which is a consequence of the representation theory of commutative C*-algebras. This theory is quite easy and provides the most natural proof. The reader is referred to the literature; a numerical sketch follows below.

2.4 Compact operators between Banach spaces

Let us recall without proof some facts about compact sets in a complete normed space X. A subset A ⊆ X is called relatively compact if the closure of A is compact. The set A is called totally bounded if for each ε > 0 there are finitely many balls B(x_i, ε), x_i ∈ X, of radius ε which cover A. With this

Theorem 2.4.1 For a subset A ⊆ X the following are equivalent:
(i) A is relatively compact.
(ii) A is totally bounded.
(iii) Each sequence (a_n) ⊆ A has a subsequence which converges in X.

The proof is given in every class on metric spaces. The limit of the subsequence in (iii) will be in the closure of A but need not be in A itself.
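As announced above, a sketch of Proposition 2.3.6 in finite dimensions (the test matrix is an arbitrary choice): the square root of a positive matrix comes from a spectral decomposition.

    import numpy as np

    # For a symmetric positive semidefinite A = U diag(lam) U^t the square
    # root is S = U diag(sqrt(lam)) U^t: S is positive and S^2 = A.

    rng = np.random.default_rng(2)
    B = rng.standard_normal((4, 4))
    A = B.T @ B                                   # A = B^t B is positive

    lam, U = np.linalg.eigh(A)
    S = U @ np.diag(np.sqrt(np.clip(lam, 0, None))) @ U.T

    print(np.allclose(S @ S, A))                      # True: S^2 = A
    print(np.all(np.linalg.eigvalsh(S) >= -1e-12))    # S is positive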

Let X, Y be complete normed spaces. A linear operator T : X → Y is called compact if the image T(B) ⊆ Y of the unit ball B ⊆ X is relatively compact in Y. T is called a finite rank operator if the range R(T) := T(X) ⊆ Y is finite dimensional. In this case T has the form

    T(x) = Σ_{j<n} Λ_j(x)φ_j,  x ∈ X,    (2.6)

where n = dim(R(T)), φ_j ∈ Y and the Λ_j are continuous linear functionals on X. Simply let {φ_0, ..., φ_{n−1}} be a basis for R(T) and Λ_j = ψ_j ∘ T, where ψ_j is the coordinate functional associated with the basis vector φ_j, that is, y = Σ_{j<n} ψ_j(y)φ_j, y ∈ R(T). Now set y = Tx. Conversely, every operator of this form is a finite rank operator with R(T) ⊆ span({φ_j}).

Since a bounded set in a finite dimensional space is relatively compact (Bolzano-Weierstrass Theorem), every finite rank operator is compact.

Theorem 2.4.2 Let X, Y be complete normed spaces and T : X → Y a linear operator.
(i) If T is a finite rank operator, then T is compact.
(ii) If T is the limit in operator norm of compact operators, then T is compact.

Proof. Assume that T_n : X → Y is compact, for each n ≥ 1, and T_n → T in operator norm. Let B ⊆ X be the unit ball and ε > 0. Choose n such that ‖T_n − T‖ < ε/2. There exist finitely many balls B(y_i, ε/2) ⊆ Y which cover T_n(B). Then the corresponding balls B(y_i, ε) cover T(B). This shows that T(B) is totally bounded.

Let us introduce the following notation: with B(X, Y) we denote the space of all bounded linear operators T : X → Y. Likewise F(X, Y) and K(X, Y) denote the set of finite rank respectively compact operators in B(X, Y). If X = Y, we write B(X), F(X) and K(X) for B(X, X), F(X, X) and K(X, X). It is easily verified that F(X, Y) and K(X, Y) are in fact subspaces of B(X, Y). Then from (ii)

    cl(F(X, Y)) ⊆ K(X, Y) ⊆ B(X, Y).

The converse of (ii) is not true in general, but it is true if X and Y are Hilbert spaces, as we shall see below. In other words, cl(F(X, Y)) ⊆ K(X, Y) in general, but we have equality in the case of Hilbert spaces X and Y.
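A matrix sketch of this equality (using the singular value decomposition developed in general in section 2.6): truncations of the SVD are finite rank operators converging to T in operator norm.

    import numpy as np

    # Rank-n truncations T_n of the SVD satisfy ||T - T_n|| = sigma_n (the
    # first omitted singular value), hence T_n -> T in operator norm.

    rng = np.random.default_rng(3)
    T = rng.standard_normal((8, 6))
    U, s, Vt = np.linalg.svd(T)

    for n in range(1, 5):
        T_n = (U[:, :n] * s[:n]) @ Vt[:n, :]    # finite rank approximation
        err = np.linalg.norm(T - T_n, ord=2)    # operator norm of the error
        print(n, np.isclose(err, s[n]))         # error equals sigma_n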

For an operator T ∈ F(X, Y) we set rank(T) = dim(R(T)). If T has the form (2.6), then rank(T) = n if φ_0, ..., φ_{n−1} are linearly independent.

Let T ∈ K(X, Y). Then the image T(D) ⊆ Y of each bounded subset D ⊆ X is relatively compact. Using Theorem 2.4.1 we see

Proposition 2.4.3 Let T ∈ B(X, Y). Then T is compact if and only if the sequence (Tx_n) ⊆ Y has a convergent subsequence for each bounded sequence (x_n) in X.

Let A be any set and τ, σ topologies on A with τ ⊆ σ. If τ is a Hausdorff topology and A is compact in the topology σ, then τ = σ. It will suffice to show that each σ-closed set F ⊆ A is τ-closed. Indeed, F is σ-compact and hence τ-compact (every cover with τ-open sets is a cover with σ-open sets). Since τ is Hausdorff it follows that F is τ-closed.

Let X be a normed space and X* the space of all continuous linear functionals on X. Recall that the weak topology on X is the weakest topology in which all functionals F ∈ X* are continuous. Clearly this topology is weaker than the norm topology on X. It is a Hausdorff topology (the continuous linear functionals on a normed space X separate points of X). The observation above shows that the weak topology agrees with the norm topology on every norm compact subset of X. Recall that a sequence (x_n) ⊆ X satisfies x_n → x weakly (in the weak topology) if and only if F(x_n) → F(x), for each continuous linear functional F ∈ X*.

Proposition 2.4.4 Let T ∈ B(X, Y) be compact and (x_n) ⊆ X bounded. If x_n → x ∈ X weakly, then Tx_n → Tx in norm.

Proof. Since T is bounded the weak convergence x_n → x implies the weak convergence Tx_n → Tx. Choose a bounded subset B ⊆ X with (x_n) ⊆ B and x ∈ B. Then K = cl(T(B)) ⊆ Y is compact. Consequently the weak topology agrees with the norm topology on K. Since Tx_n, Tx ∈ K and Tx_n → Tx weakly, it follows that Tx_n → Tx in norm.

Remark. A weakly convergent sequence (x_n) is automatically bounded, that is, the assumption of boundedness above is superfluous, but we do not need this result. If (x_n) is weakly convergent then it is weakly bounded, i.e. sup_n |F(x_n)| < ∞, for each continuous linear functional F ∈ X*. The Uniform Boundedness Principle now implies that the sequence (x_n) is bounded in norm.

Exercise. Let X, Y, Z be complete normed spaces and T : X → Y, S : Y → Z bounded linear operators. If one of S, T is compact then so is the product ST. Hint: regardless of compactness, T maps bounded sets to bounded sets and S maps relatively compact sets to relatively compact sets.

We conclude this section with a characterization of compact operators on Hilbert space:

Theorem 2.4.5 Let X and Y be Hilbert spaces and T ∈ B(X, Y) a bounded linear operator. Then T is compact if and only if ‖Te_n‖ → 0, for each orthonormal sequence (e_n) ⊆ X.

Proof. (⇒) Assume that T is compact and let (e_n) ⊆ X be an orthonormal sequence. Then Σ_n (x, e_n)² ≤ ‖x‖² < ∞ and so (x, e_n) → 0, as n → ∞, for each x ∈ X. By the Riesz representation theorem this means F(e_n) → 0, for each continuous linear functional F ∈ X*, that is, e_n → 0 weakly in X. By Proposition 2.4.4 the compactness of T now implies Te_n → 0 in norm.

(⇐) Recall that N_1 denotes the closed unit ball of a normed space N. Assume that T is not compact and hence T(X_1) ⊆ Y not totally bounded. Let ε > 0 be such that the closure cl(T(X_1)) cannot be covered with finitely many balls of radius 2ε. We construct an orthonormal sequence (e_n) ⊆ X such that ‖Te_n‖ ≥ ε, for all n ≥ 1.

(A) We claim that for every finite dimensional subspace N ⊆ X there exists e ⊥ N with ‖e‖ = 1 and ‖Te‖ ≥ ε. If this were not true, let N ⊆ X be a finite dimensional subspace such that ‖Te‖ < ε, for all e ∈ V := N^⊥ with ‖e‖ ≤ 1, that is T(V_1) ⊆ εY_1. Note that cl(T(N_1)) ⊆ Y is compact and hence can be covered by finitely many balls B(y_j, ε) of radius ε. Since X_1 ⊆ N_1 + V_1 we have T(X_1) ⊆ T(N_1) + T(V_1). It follows that T(X_1) is covered by the balls B(y_j, 2ε), in contradiction to the choice of ε. This shows (A).

(B) Now we can construct the sequence (e_n) by induction. Using (A) with N = {0} find e_0 with ‖Te_0‖ ≥ ε. Given that orthonormal e_0, ..., e_n with ‖Te_j‖ ≥ ε have already been constructed, set N = span({e_0, ..., e_n}) and choose e_{n+1} ⊥ N with ‖e_{n+1}‖ = 1 such that ‖Te_{n+1}‖ ≥ ε. Then the sequence {e_0, ..., e_{n+1}} is orthonormal and the construction continues.
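A numerical sketch of this criterion (the kernel min(s, t), the sine sequence and the grid discretization are illustrative assumptions): for a discretized integral operator on L²[0, 1] the norms ‖Te_n‖ decay along an orthonormal sequence, while the identity operator keeps them at 1 and is indeed not compact.

    import numpy as np

    # Test the criterion on the orthonormal sequence e_n(s) = sqrt(2) sin(n pi s)
    # in L^2[0,1]: ||T e_n|| -> 0 for the integral operator with kernel
    # K(s,t) = min(s,t), while ||I e_n|| = 1 for all n.

    m = 2000
    s = (np.arange(m) + 0.5) / m
    h = 1.0 / m
    T = np.minimum(s[:, None], s[None, :]) * h   # discretized integral operator

    for n in [1, 2, 5, 10, 50]:
        e_n = np.sqrt(2) * np.sin(n * np.pi * s)
        print(n,
              np.sqrt(h) * np.linalg.norm(T @ e_n),   # ||T e_n||, tends to 0
              np.sqrt(h) * np.linalg.norm(e_n))       # ||I e_n||, stays near 1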

2.5 Compact selfadjoint operators

Let T be a compact, selfadjoint operator on a Hilbert space H. Then T can be diagonalized in the sense that there is an orthonormal basis for H consisting of eigenvectors of T. This result makes it very easy to work with such operators. For the proof we need the following

Lemma 2.5.1 Let T be a compact, selfadjoint operator on H. Then at least one of λ = ‖T‖ or λ = −‖T‖ is an eigenvalue of T.

Proof. We may assume that T ≠ 0. From (2.5) we get a sequence of vectors x_n ∈ H with ‖x_n‖ = 1 and a number λ with |λ| = ‖T‖ such that (Tx_n, x_n) → λ, as n → ∞. Then, for each n ≥ 0 we have

    0 ≤ ‖Tx_n − λx_n‖² = ‖Tx_n‖² − 2λ(Tx_n, x_n) + λ²‖x_n‖²    (2.7)
      ≤ ‖T‖² − 2λ(Tx_n, x_n) + λ².    (2.8)

As n → ∞, the rightmost quantity converges to 2λ² − 2λ² = 0. Thus we also have Tx_n − λx_n → 0. Set y_n = Tx_n. By compactness of T the sequence y_n has a convergent subsequence. Passing to this subsequence we may assume that the sequence y_n is itself convergent. But then the sequence x_n = λ⁻¹(y_n − (y_n − λx_n)) converges also. Since Tx_n − λx_n → 0 the limit x = lim_n x_n must satisfy Tx = λx. Since ‖x_n‖ = 1, for all n, we have ‖x‖ = 1.

With this we can now prove the main result about compact selfadjoint operators:

Theorem 2.5.2 Let T be a compact, selfadjoint operator on H. Then there exists an orthonormal basis for H consisting of eigenvectors of T. More precisely, N(T)^⊥ has a countable orthonormal basis (φ_j) consisting of eigenvectors of T, and if λ_j are the associated eigenvalues, then

    Tx = Σ_j λ_j(x, φ_j)φ_j,  x ∈ H,

where the series converges in the norm of H. If the sequence (φ_j) is infinite, then λ_j → 0, as j → ∞.

Proof. By induction we construct a (possibly finite) sequence of numbers λ_j ≠ 0 and orthonormal vectors φ_j such that (i) Tφ_j = λ_jφ_j, (ii) the restriction T_j of T to {φ_0, ..., φ_{j−1}}^⊥ satisfies ‖T_j‖ = |λ_j|, and (iii) T = 0 on {φ_0, φ_1, ...}^⊥.

Since the λ_j are nonzero, each φ_j is in N(T)^⊥, and from (iii) it follows that the φ_j span all of N(T)^⊥ (recall that (A^⊥)^⊥ is the closed linear span of A).

The quantities λ_0 and φ_0 exist by Lemma 2.5.1. Assume that λ_0, ..., λ_j and φ_0, ..., φ_j have already been constructed. Set X_j = {φ_0, ..., φ_j}^⊥. If T = 0 on X_j, then we are finished. Otherwise note that X_j is a closed T-invariant subspace (since span({φ_0, ..., φ_j}) is T-invariant). The restriction T_j of T to X_j is a compact selfadjoint operator on X_j. Applying Lemma 2.5.1 to T_j we see that there is a unit vector φ_{j+1} ∈ X_j and a number λ_{j+1} such that (a) |λ_{j+1}| = ‖T_j‖ and (b) Tφ_{j+1} = T_jφ_{j+1} = λ_{j+1}φ_{j+1}. Obviously φ_{j+1} ⊥ φ_0, ..., φ_j and so the resulting sequence (φ_j) is orthonormal. If T_j = 0 at any time, then (iii) is already satisfied and we are finished.

Assume now that T_j ≠ 0, for all j ≥ 0, set X = {φ_0, φ_1, ...}^⊥ and let S be the restriction of T to X. We must show that S = 0. From (ii) it follows that |λ_0| ≥ |λ_1| ≥ ... ≥ |λ_j| ≥ ‖S‖, for all j ≥ 0, and so it will suffice to show that λ_j → 0 as j → ∞. If λ_j does not converge to 0, we have |λ_j| ≥ ρ for some number ρ > 0. Then the sequence (φ_j/λ_j) ⊆ H is bounded, and by compactness of T the sequence y_j = T(φ_j/λ_j) = φ_j has a convergent subsequence. However this contradicts the fact that the sequence φ_j is orthonormal and hence ‖φ_j − φ_k‖ = √2, for all j ≠ k. Consequently we must have λ_j → 0.

Remark (Spectrum). We claim that the sequence (λ_j) contains all the nonzero eigenvalues of T. If λ ≠ λ_j, λ ≠ 0 were another eigenvalue, the associated eigenspace would be contained in N(T)^⊥ and perpendicular to all the φ_j, which contradicts the fact that the φ_j span N(T)^⊥. It follows that the λ_j contain all the nonzero eigenvalues of T. Note also that the convergence λ_j → 0 implies that the eigenspaces corresponding to nonzero eigenvalues are all finite dimensional.

The sequence (λ_j) contains all nonzero eigenvalues of T, but what about the spectrum of T, that is, the set

    σ(T) = { λ ∈ R : T − λI is not invertible on H }?

Let us assume that H is not finite dimensional. Then the unit ball H_1 is not compact. It follows that T is not invertible, that is, 0 ∈ σ(T) (regardless of whether 0 is an eigenvalue or not). However, if λ ≠ 0 and λ ≠ λ_j, for all j ≥ 0, then it can be shown that the operator T − λI is invertible on H. To compute (T − λI)⁻¹ we must solve

    (T − λI)x = y    (2.9)

for x in terms of y. Write V = N(T) and x = π_V(x) + π_{V^⊥}(x) as well as y = π_V(y) + π_{V^⊥}(y). With this (2.9) becomes

    −λπ_V(x) + (T − λI)π_{V^⊥}(x) = π_V(y) + π_{V^⊥}(y),

and since both V and V^⊥ are T-invariant and hence (T − λI)-invariant, this is equivalent with

    −λπ_V(x) = π_V(y)  and  (T − λI)π_{V^⊥}(x) = π_{V^⊥}(y).    (2.10)

Since the φ_j are an ON-basis for V^⊥ we have π_{V^⊥}(y) = Σ_j (y, φ_j)φ_j and π_{V^⊥}(x) = Σ_j α_jφ_j with α_j to be determined. Note that (T − λI)φ_j = (λ_j − λ)φ_j. With this (2.10) becomes Σ_j α_j(λ_j − λ)φ_j = Σ_j (y, φ_j)φ_j, which solves for α_j = (y, φ_j)/(λ_j − λ), resulting in

    x = π_V(x) + π_{V^⊥}(x) = −(1/λ)π_V(y) + Σ_j (y, φ_j)/(λ_j − λ) φ_j.

The solution x exists for each y and is a continuous linear function of y; in other words

    (T − λI)⁻¹y = −(1/λ)π_V(y) + Σ_j (y, φ_j)/(λ_j − λ) φ_j

exists as a continuous linear operator on H. Consequently the point λ is not in the spectrum of T and we have shown that σ(T) = {λ_j} ∪ {0}.

Remark (Range). The series expansion in Theorem 2.5.2 also allows us to determine the range R(T) quite easily. Let y ∈ H and consider the equation

    Tx = y.    (2.11)

If this equation has a solution x, then y ∈ N(T)^⊥. Assume now that y ∈ N(T)^⊥. Then we have an expansion y = Σ_j (y, φ_j)φ_j. Clearly, to find x ∈ H

with Tx = y we can restrict ourselves to x ∈ N(T)^⊥. Such x will then have an expansion

    x = Σ_j α_jφ_j    (2.12)

with α_j to be determined. In terms of these series expansions (2.11) becomes Σ_j α_jλ_jφ_j = Tx = y = Σ_j (y, φ_j)φ_j, which implies that we must have α_j = λ_j⁻¹(y, φ_j). However, for these α_j the series (2.12) converges exactly if Σ_j λ_j⁻²(y, φ_j)² < ∞. It follows that

    R(T) = { y ∈ N(T)^⊥ : Σ_j λ_j⁻²(y, φ_j)² < ∞ }.

2.6 Compact operators between Hilbert spaces

The case of a general compact operator T : X → Y between Hilbert spaces X and Y can be reduced to the selfadjoint case by observing that the product T*T is a compact, selfadjoint operator on X. The results of the last section then carry over with minimal changes.

Let X and Y be Hilbert spaces, T ∈ B(X, Y). A singular system for T is a sequence (µ_j, φ_j, ξ_j)_j where
(i) µ_0 ≥ µ_1 ≥ ... ≥ µ_n ≥ ... > 0,
(ii) {φ_j} is an ON-basis for N(T)^⊥,
(iii) {ξ_j} is an ON-basis for N(T*)^⊥, and
(iv) Tφ_j = µ_jξ_j and T*ξ_j = µ_jφ_j, for all j ≥ 0.

Assume that (µ_j, φ_j, ξ_j)_j is such a system, set V = N(T)^⊥ and let x ∈ X. Then the orthogonal projection π_V(x) of x on V has an expansion π_V(x) = Σ_j (x, φ_j)φ_j, and applying T to this expansion it follows that

    Tx = Tπ_V(x) = Σ_j µ_j(x, φ_j)ξ_j    (2.13)

with convergence pointwise on X. For φ ∈ X and ξ ∈ Y define the rank one operator S = φ ⊗ ξ as

    Sx = (x, φ)ξ ∈ Y,  x ∈ X.

Then the above expansion for T can be rewritten as

    T = Σ_j µ_j(φ_j ⊗ ξ_j),    (2.14)

where the series converges pointwise on X. Set

    T_n = Σ_{j<n} µ_j(φ_j ⊗ ξ_j)    (2.15)

and let x ∈ X. Using (i) and the orthonormality of the ξ_j we have

    ‖(T − T_n)x‖² = ‖Σ_{j≥n} µ_j(x, φ_j)ξ_j‖² = Σ_{j≥n} µ_j²(x, φ_j)² ≤ µ_n² Σ_{j≥n} (x, φ_j)² ≤ µ_n²‖x‖².

This shows that

    ‖T − T_n‖ ≤ µ_n    (2.16)

in operator norm. Letting x = φ_n above we see that we actually have equality. Consequently, if µ_n → 0, then the series (2.14) converges in operator norm and hence T is compact.

Not every operator T ∈ B(X, Y) has a singular system. However, if X = Y and T ∈ B(X) is compact, selfadjoint and positive, let {φ_j} be the eigenvectors associated with the nonzero eigenvalues λ_j of T arranged in decreasing order. Then (µ_j, φ_j, ξ_j)_j with µ_j = λ_j and ξ_j = φ_j is a singular system for T. This is exactly the content of Theorem 2.5.2.

Now we generalize this fact to all compact operators T ∈ K(X, Y):

Theorem 2.6.1 Let T : X → Y be a compact operator, set A = T*T, note that A is compact and selfadjoint on X, and let {φ_j} be the eigenvectors associated with the nonzero eigenvalues λ_j of A arranged in decreasing order. Then µ_j = √λ_j and ξ_j = µ_j⁻¹Tφ_j defines a singular system (µ_j, φ_j, ξ_j)_j for T. We have µ_n → 0 and hence the series (2.14) converges in operator norm. In particular T is the limit of finite rank operators.

Proof. Note first that N(A) = N(T) according to Proposition 2.2.1. Thus the φ_j are an ON-basis for N(T)^⊥. By definition of (µ_j, φ_j, ξ_j) we have Tφ_j = µ_jξ_j and T*Tφ_j = µ_j²φ_j, and this implies that T*ξ_j = µ_jφ_j. We claim that {ξ_j} is an ON-basis for N(T*)^⊥. Indeed, for j, k ≥ 0 we have

    (ξ_j, ξ_k) = (µ_j⁻¹Tφ_j, µ_k⁻¹Tφ_k) = (µ_jµ_k)⁻¹(T*Tφ_j, φ_k) = δ_jk.    (2.17)

Thus {ξ_j} ⊆ R(T) ⊆ N(T*)^⊥ =: W is an orthonormal system. We claim that this system spans all of W. Let w ∈ W and assume that w ⊥ ξ_j, for all j ≥ 0. Then T*w ∈ R(T*) ⊆ N(T)^⊥ and

    (T*w, φ_j) = (w, Tφ_j) = µ_j(w, ξ_j) = 0,

for all j ≥ 0. Since the {φ_j} are an ON-basis for N(T)^⊥ it follows that T*w = 0, that is, w ∈ N(T*) = W^⊥. Thus w ∈ W ∩ W^⊥ = {0}. This shows that the orthonormal system {ξ_j} in N(T*)^⊥ is complete.

Remark. If T : X → Y is any bounded linear operator and (φ_j) an ON-basis for V = N(T)^⊥, then the expansion (2.1) is valid, and applying T to this expansion yields Tx = Σ_j (x, φ_j)Tφ_j. What makes the expansion (2.13) interesting is the additional information contained in the singular system for T.

Remark (Adjoint). Recall that T** = T. If (µ_j, φ_j, ξ_j)_j is a singular system for T, then (µ_j, ξ_j, φ_j)_j is a singular system for T*, and so we have the expansion T* = Σ_j µ_j(ξ_j ⊗ φ_j). Thus if T is compact then so is the adjoint T*.

Remark (Range). The expansion (2.13) allows us to work with ON-bases just as in the case of a compact selfadjoint operator. As an example we determine the range R(T), that is, we study the equation

    Tx = y.    (2.18)

Fix y ∈ Y. If a solution exists, then y ∈ R(T) ⊆ N(T*)^⊥. Now assume that y ∈ N(T*)^⊥. Then we have an expansion y = Σ_j (y, ξ_j)ξ_j. If there exists any solution x of (2.18) in X, then there exists a solution in V = N(T)^⊥ (in fact π_V(x) is one). Thus we may assume that x ∈ V and have an expansion

    x = Σ_j α_jφ_j.    (2.19)

Applying T to this yields

    Σ_j α_jµ_jξ_j = Tx = y = Σ_j (y, ξ_j)ξ_j.

It follows that we must have α_j = µ_j⁻¹(y, ξ_j). With this the series for x converges exactly if Σ_j µ_j⁻²(y, ξ_j)² < ∞. Consequently

    R(T) = { y ∈ N(T*)^⊥ : Σ_j µ_j⁻²(y, ξ_j)² < ∞ },    (2.20)

exactly as in the selfadjoint case.

2.7 Hilbert-Schmidt and trace class operators

Let X, Y be Hilbert spaces, T ∈ K(X, Y) compact and (µ_j, φ_j, ξ_j)_j a singular system for T. We know from Theorem 2.6.1 that T is the limit in operator norm of finite rank operators. Now we quantify the speed of convergence.

Approximation numbers. Set

    T_n = Σ_{j<n} µ_j(φ_j ⊗ ξ_j).

We have seen that then

    ‖T − T_n‖ ≤ µ_n.    (2.21)

On the other hand we show now that

    ‖T − S‖ ≥ µ_n    (2.22)

for each finite rank operator S ∈ F(X, Y) with rank(S) ≤ n. Set X_n = span({φ_0, ..., φ_n}) and note that

    ‖Tx‖ ≥ µ_n‖x‖, for all x ∈ X_n.    (2.23)

Indeed, let x ∈ X_n. Then x = Σ_{j≤n} (x, φ_j)φ_j and so Tx = Σ_{j≤n} µ_j(x, φ_j)ξ_j. It follows that

    ‖Tx‖² = Σ_{j≤n} µ_j²(x, φ_j)² ≥ µ_n² Σ_{j≤n} (x, φ_j)² = µ_n²‖x‖².

Now let S ∈ F(X, Y) with dim(R(S)) ≤ n. Then S is not one to one on X_n and so there exists a unit vector u ∈ X_n with Su = 0. Using (2.23) we have

    ‖(T − S)u‖ = ‖Tu‖ ≥ µ_n.

Thus ‖T − S‖ ≥ µ_n. The quantities

    a_n(T) := inf{ ‖T − S‖ : S ∈ F(X, Y), rank(S) ≤ n },  n ≥ 0,    (2.24)

are called the approximation numbers of T. Here a_0(T) = ‖T‖. The estimates (2.21) and (2.22) show that

    µ_n = a_n(T)    (2.25)

and that the operator S = T_n provides the best approximation of T in the operator norm among all operators of rank at most n. In particular this shows that the numbers µ_n in a singular system for T are uniquely determined by T and do not depend on the singular system. The µ_n are called the singular values of T. Obviously the vectors φ_n and ξ_n in a singular system for T are not uniquely determined. Consider the selfadjoint case and note that there are many ways to extract an orthonormal basis from each eigenspace of T.

The approximation numbers a_n(T) are defined for each bounded linear operator T ∈ B(X, Y). T is compact if and only if a_n(T) → 0, as n → ∞, and this is the only case of interest. In this case we have a_n(T) = µ_n, where the µ_j are the singular values of T (the square roots of the eigenvalues of T*T).

For each bounded linear operator T ∈ B(X, Y) let

    ‖T‖_p = (Σ_n a_n(T)^p)^{1/p}

and let

    S_p(X, Y) = { T ∈ B(X, Y) : ‖T‖_p < ∞ }.

Clearly each T ∈ S_p(X, Y) is compact. One can show that S_p(X, Y) is a subspace of B(X, Y) which is complete in the norm ‖·‖_p, but we won't need this result. We are only interested in the cases p = 1, 2. We now assume that T ∈ K(X, Y) is compact and (µ_j, φ_j, ξ_j)_j is a singular system for T.

Hilbert-Schmidt operators. The operator T is called a Hilbert-Schmidt operator if T ∈ S_2(X, Y), that is,

    ‖T‖_2² := Σ_n a_n(T)² = Σ_n µ_n² < ∞.

Proposition 2.7.1 If T ∈ K(X, Y) is compact and {e_α} is any ON-basis for X, then ‖T‖_2² = Σ_α ‖Te_α‖².

Remark. It follows that T is a Hilbert-Schmidt operator if and only if Σ_α ‖Te_α‖² < ∞, for some ON-basis {e_α} of X, and in this case the sum is independent of the choice of the basis {e_α}.
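A quick numerical check of Proposition 2.7.1 in the matrix case (the random matrix and the random orthonormal basis are arbitrary choices):

    import numpy as np

    # The Hilbert-Schmidt norm squared equals both sum_n mu_n^2 over the
    # singular values and sum_alpha ||T e_alpha||^2 for any ON-basis.

    rng = np.random.default_rng(4)
    T = rng.standard_normal((6, 5))

    mu = np.linalg.svd(T, compute_uv=False)
    Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random ON-basis of R^5

    hs_from_mu = np.sum(mu ** 2)
    hs_from_basis = sum(np.linalg.norm(T @ Q[:, k]) ** 2 for k in range(5))

    print(np.isclose(hs_from_mu, hs_from_basis))   # True: basis independent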

We do not assume that X is separable, that is, that the basis {e_α} is countable. However, since T and hence T* are compact, the entire action is essentially separable: both N(T)^⊥ and cl(R(T)) = N(T*)^⊥ have countable ON-bases.

Proof. Let {e_α} be any ON-basis for X. Since {ξ_k} is an ON-basis for cl(R(T)) = N(T*)^⊥, we have

    ‖Te_α‖² = Σ_k (Te_α, ξ_k)² = Σ_k (e_α, T*ξ_k)² = Σ_k µ_k²(e_α, φ_k)²,

for each α. It follows that

    Σ_α ‖Te_α‖² = Σ_k µ_k² (Σ_α (e_α, φ_k)²) = Σ_k µ_k²‖φ_k‖² = Σ_k µ_k² = ‖T‖_2².

Hilbert-Schmidt operators on the space X = L²(ν) of square integrable functions with respect to a finite measure ν will be characterized in terms of integration kernels below.

Trace class operators. We now assume that X and Y have the same orthogonal dimension, that is, ON-bases {e_α} of X and {f_α} of Y can be indexed with the same indices α. Because of the compactness of T we could even assume both spaces to be separable. The operator T is called a trace class operator if T ∈ S_1(X, Y), that is,

    ‖T‖_1 = Σ_n a_n(T) < ∞.

Recall that (µ_j, φ_j, ξ_j)_j denotes a singular system for T. It follows that ‖T‖_1 = Σ_n µ_n.

Proposition 2.7.2 Let T ∈ K(X, Y). Then

    ‖T‖_1 = max Σ_α |(Te_α, f_α)|,    (2.26)

where the maximum is taken over all ON-bases {e_α} of X and {f_α} of Y.

Proof. Let {e_α} and {f_α} be ON-bases of X and Y and write Te_α = Σ_j µ_j(e_α, φ_j)ξ_j. It follows that

    Σ_α |(Te_α, f_α)| ≤ Σ_α Σ_j µ_j |(e_α, φ_j)| |(ξ_j, f_α)|
        = Σ_j µ_j Σ_α |(e_α, φ_j)| |(ξ_j, f_α)|
        ≤ Σ_j µ_j (Σ_α (e_α, φ_j)²)^{1/2} (Σ_α (ξ_j, f_α)²)^{1/2}
        ≤ Σ_j µ_j ‖φ_j‖ ‖ξ_j‖ = Σ_k µ_k = ‖T‖_1.

On the other hand, if we enlarge the bases {φ_j} of N(T)^⊥ ⊆ X and {ξ_j} of cl(R(T)) ⊆ Y to ON-bases {e_α} of X and {f_α} of Y, then T vanishes on all e_α ∉ {φ_j} and the above sum becomes

    Σ_α |(Te_α, f_α)| = Σ_j (Tφ_j, ξ_j) = Σ_j µ_j = ‖T‖_1.

Thus T is a trace class operator if and only if the sum (2.26) is finite for all ON-bases {e_α} of X and {f_α} of Y.

Proposition 2.7.3 Let T ∈ K(X, Y). Then

    ‖T‖_1 = min Σ_n ‖x_n‖ ‖y_n‖,    (2.27)

where the minimum is taken over all sequences (x_n) ⊆ X and (y_n) ⊆ Y such that T = Σ_n x_n ⊗ y_n.

Proof. Assume that T = Σ_n x_n ⊗ y_n. Let {e_α} and {f_α} be ON-bases of X and Y and write Te_α = Σ_n (e_α, x_n)y_n. With this

    Σ_α |(Te_α, f_α)| ≤ Σ_n Σ_α |(e_α, x_n)| |(y_n, f_α)|
        ≤ Σ_n (Σ_α (e_α, x_n)²)^{1/2} (Σ_α (y_n, f_α)²)^{1/2}
        ≤ Σ_n ‖x_n‖ ‖y_n‖.

Taking the sup over all such bases {e_α} and {f_α} yields ‖T‖_1 ≤ Σ_n ‖x_n‖ ‖y_n‖. Conversely, if we set x_n = µ_nφ_n and y_n = ξ_n, then T = Σ_n x_n ⊗ y_n and Σ_n ‖x_n‖ ‖y_n‖ = Σ_n µ_n = ‖T‖_1.

Thus T is a trace class operator if and only if T has the form T = Σ_n x_n ⊗ y_n with Σ_n ‖x_n‖ ‖y_n‖ < ∞.

Trace. Assume now that X = Y and T ∈ K(X) is a trace class operator. Then we define the trace of T as

    tr(T) = Σ_α (Te_α, e_α),    (2.28)

where {e_α} is an ON-basis for X. The series converges absolutely, but we have to show that the sum does not depend on the choice of the basis {e_α}. Fix a representation of T as

    T = Σ_n x_n ⊗ y_n  with  Σ_n ‖x_n‖ ‖y_n‖ < ∞    (2.29)

and let {e_α} be an ON-basis for X. For n ≥ 0 write x_n = Σ_α (x_n, e_α)e_α and y_n = Σ_β (y_n, e_β)e_β. Entering this into the inner product (x_n, y_n) we obtain

    (x_n, y_n) = Σ_{α,β} (x_n, e_α)(y_n, e_β)(e_α, e_β) = Σ_α (x_n, e_α)(y_n, e_α).

Now, for each α, write Te_α = Σ_n (e_α, x_n)y_n. With this we have

    Σ_α (Te_α, e_α) = Σ_α Σ_n (e_α, x_n)(y_n, e_α).

Using Cauchy-Schwarz on the sums along α we see that the double series on the right is absolutely convergent. We can thus rearrange it to obtain

    Σ_α (Te_α, e_α) = Σ_n Σ_α (e_α, x_n)(y_n, e_α) = Σ_n (x_n, y_n).

This shows that the value of the sum Σ_α (Te_α, e_α) does not depend on the choice of basis {e_α}. Thus the trace tr(T) is well defined. But note that the representation (2.29) of T was also arbitrary. Thus

Proposition 2.7.4 Let T = Σ_n x_n ⊗ y_n with Σ_n ‖x_n‖ ‖y_n‖ < ∞. Then tr(T) = Σ_n (x_n, y_n).

We will see concrete examples of trace class operators and compute their traces in the treatment of reproducing kernel Hilbert spaces.
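A small sketch of Proposition 2.7.4 in matrix form (the rank one factors are arbitrary test data): since (x ⊗ y)v = (v, x)y corresponds to the matrix yxᵗ, the trace of a finite rank one sum is Σ_n (x_n, y_n).

    import numpy as np

    # T = sum_n x_n (x) y_n acts as v -> sum_n (v, x_n) y_n, i.e. it is the
    # matrix sum_n outer(y_n, x_n); its trace must equal sum_n (x_n, y_n).

    rng = np.random.default_rng(5)
    xs = [rng.standard_normal(4) for _ in range(3)]
    ys = [rng.standard_normal(4) for _ in range(3)]

    T = sum(np.outer(y, x) for x, y in zip(xs, ys))

    print(np.isclose(np.trace(T), sum(x @ y for x, y in zip(xs, ys))))  # True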

2.8 Inverse problems and regularization

Let X, Y be Hilbert spaces, T ∈ B(X, Y) a bounded linear operator, and w ∈ R(T). We will study the equation

    Tv = w    (2.30)

to be solved for v ∈ X (an inverse problem). Imagine that (2.30) arises from the theoretical study of some physical system and all ingredients are known with perfect precision. We call this problem the clean inverse problem. By assumption this problem has a solution.

Unfortunately we do not know the true data w. We have only a polluted version y of w and must instead solve the practical problem

    Tx = y,    (2.31)

where ‖y − w‖ is small. We do not know whether y ∈ R(T), that is, the equation (2.31) may not have a solution; nonetheless it is all that we have to work with. We are interested in the solution v of (2.30) and not in any solution x of (2.31). We reason as follows: each solution v of (2.30) is an approximate solution of (2.31), that is, ‖Tv − y‖ = ‖w − y‖ is small. Therefore let us seek x ∈ X such that ‖Tx − y‖ is small. Then

    ‖Tx − Tv‖ = ‖Tx − w‖ ≤ ‖Tx − y‖ + ‖y − w‖

will be small, and from this we hope to be able to conclude that ‖x − v‖ is small, that is, x is a good approximation of the solution v of the clean problem (2.30). We are thus led to replace (2.31) with the minimization problem

    min_{x∈X} ‖Tx − y‖.    (2.32)

Obviously each solution of (2.31) will be a minimizer of (2.32), but such solutions need not exist. Recall that our approach is based on the following reasoning: if ‖Tx − Tv‖ is small, then ‖x − v‖ is small. Since "small" is somewhat vague, we might want to require more strongly that ‖x − v‖ → 0 whenever ‖Tx − Tv‖ → 0. This implies that N(T) = {0} (thus T is invertible on R(T)) and that the inverse S = T⁻¹ : R(T) → X is continuous. In this case the problem (2.30) is called well posed, otherwise it is called ill posed. In short, the inverse problem (2.30) is ill posed if

(i) the solution v is not unique, or
(ii) it is unique but is not a continuous function of w ∈ R(T).

The usual definition of well posedness requires that the solution v of (2.30) exist for all w ∈ Y, be uniquely determined and be a continuous function of w. The continuity requirement is then superfluous: it follows from the first two (N(T) = {0} and R(T) = Y) by the Open Mapping Theorem. This fact is usually ignored and creates an awkward situation, as the continuous dependence of the solution on the right hand side is the very essence of well posedness. We are not following this lead here, since it is unreasonable to require the existence of a solution of (2.30) for each right hand side w ∈ Y. In practice we are only dealing with one particular right hand side w ∈ Y, and the natural assumption is that (2.30) have a solution for the given right hand side w, that is, w ∈ R(T). It does not make much sense to seek the solution of a problem which does not have a solution by virtue of the theory underlying the problem.

Condition (ii) is the crucial one. The uniqueness of the solution can always be enforced if we replace X with N(T)^⊥. Unfortunately, in many cases of practical interest the operator T is compact and the space Y infinite dimensional. In this case the continuous dependence in (ii) is guaranteed to fail.

We now devise a strategy to cope with the ill posedness of (2.30) rephrased as the minimization problem

    min_{x∈X} ‖Tx − y‖.    (2.33)

Note that we do not assume that T is compact; it is a general bounded linear operator T ∈ B(X, Y). The minimization problem (2.33) does not always have a solution. This merely means that no minimum is attained. We can still get arbitrarily close to the infimum and thus hope to find x such that ‖Tx − y‖ is small. However, in this case, it is not clear how to find such x. On the other hand, if minimizers do exist we can hope that they have special properties which make them easy to find. Indeed, we will see below that they can be obtained as solutions of the so called normal equation.

Let W = cl(R(T)) = N(T*)^⊥. Then the orthogonal projection b = π_W(y) is the unique element u ∈ W which minimizes the distance ‖y − u‖. Note that T*b = T*y. Since b can be approximated arbitrarily closely with elements in the range of T, an element x ∈ X minimizes (2.33) if and only if Tx = b, and

since the operator T* is one to one on W (Proposition 2.2.1), this is equivalent with

    T*Tx = T*b = T*y.    (2.34)

This equation is called the normal equation associated with the minimization problem (2.33). It has a solution exactly if b ∈ R(T). This condition is automatically satisfied if T has closed range. In this case the minimizers of (2.33) are exactly the solutions of (2.34), plus arbitrary vectors in N(T*T) = N(T). The unique solution in N(T)^⊥ is the solution with minimal norm.

Example (Polynomial least squares interpolation). If X and Y are finite dimensional, then R(T) = cl(R(T)), and thus b ∈ R(T) is automatically satisfied, that is, the normal equations do have a solution. Consider the following example: let n ≥ 1 and assume we are given n pairs of points (x_1, y_1), ..., (x_n, y_n) ∈ R². We want to find a polynomial Q of fixed degree k which minimizes the squared error

    Σ_{j=1}^n |Q(x_j) − y_j|² = ‖Q_x − y‖²,

where Q_x = (Q(x_1), ..., Q(x_n)) and y = (y_1, ..., y_n) are vectors in Rⁿ. The error is computed in the Euclidean norm of Rⁿ, and with this norm Rⁿ is a Hilbert space. Write Q(x) = a_0 + a_1x + ... + a_kx^k and identify the polynomial Q with the vector a = (a_0, ..., a_k) ∈ R^{k+1} of its coefficients. With this identification Q_x = Ta, where T : R^{k+1} → Rⁿ is the linear operator given by the matrix

    T = ( 1  x_1  x_1²  ...  x_1^k )
        ( 1  x_2  x_2²  ...  x_2^k )
        ( ...                      )
        ( 1  x_n  x_n²  ...  x_n^k )

and the normal equations T*Ta = T*y can be solved for the coefficients a of Q. In the special case of linear least squares interpolation (k = 1) we have

    T*T = ( n      Σx_k  )
          ( Σx_k   Σx_k² )

and the normal equations assume the form

    a_0 n + a_1 Σx_k = Σy_k,
    a_0 Σx_k + a_1 Σx_k² = Σx_ky_k.

Divide by n and write Ex = n⁻¹Σx_k, Ey = n⁻¹Σy_k, Exx = n⁻¹Σx_k² and Exy = n⁻¹Σx_ky_k ("E" for expected value) to obtain

    a_0 + a_1 Ex = Ey,
    a_0 Ex + a_1 Exx = Exy,

with solution

    a_0 = (Ex·Exy − Ey·Exx) / ((Ex)² − Exx)  and  a_1 = (Ex·Ey − Exy) / ((Ex)² − Exx).

2.8.1 Regularization

In general the normal equation (2.34) suffers from the same drawbacks as the inverse problem (2.30). If the operator T is compact, then so is T*T, and hence the normal equation (2.34) is an ill posed inverse problem. Now let α > 0 be a small positive number. Then the operator αI + T*T is invertible on X with bounded inverse (Proposition 2.3.5). If we replace the normal equation (2.34) with

    (αI + T*T)x = T*y    (2.35)

we have a well posed problem with solution x = (αI + T*T)⁻¹T*y. The following proposition shows that x is the solution of a modified minimization problem:

Proposition. The solution x = (αI + T*T)⁻¹T*y of the regularized normal equation (2.35) is the unique minimizer of

    min_{x∈X} ( ‖Tx − y‖² + α‖x‖² ).    (2.36)

Proof. Let F := X × Y be the product space endowed with the inner product

    (a ⊕ b, u ⊕ v)_F := α(a, u)_X + (b, v)_Y,  a, u ∈ X, b, v ∈ Y,

with associated norm ‖x ⊕ y‖² := ‖y‖² + α‖x‖², x ∈ X, y ∈ Y.
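The chapter's regularization recipe is easy to try numerically. A hedged sketch (data, degree and α are illustrative assumptions) fits the polynomial example by the normal equation (2.34) and by its regularized version (2.35):

    import numpy as np

    # Fit a degree-k polynomial to polluted data by the normal equation and
    # by Tikhonov regularization; T is the Vandermonde matrix of the example.

    rng = np.random.default_rng(6)
    k, n, alpha = 5, 20, 1e-3
    x = np.linspace(-1, 1, n)
    y = np.sin(3 * x) + 0.05 * rng.standard_normal(n)   # polluted right hand side

    T = np.vander(x, k + 1, increasing=True)            # rows (1, x_j, ..., x_j^k)

    a_ls = np.linalg.solve(T.T @ T, T.T @ y)            # normal equation (2.34)
    a_reg = np.linalg.solve(alpha * np.eye(k + 1) + T.T @ T, T.T @ y)   # (2.35)

    # By (2.36), a_reg minimizes ||T a - y||^2 + alpha ||a||^2: the penalty
    # bounds the coefficient norm at the cost of a slightly larger residual.
    print(np.linalg.norm(a_ls), np.linalg.norm(a_reg))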


Chapter 17. Orthogonal Matrices and Symmetries of Space

Chapter 17. Orthogonal Matrices and Symmetries of Space Chapter 17. Orthogonal Matrices and Symmetries of Space Take a random matrix, say 1 3 A = 4 5 6, 7 8 9 and compare the lengths of e 1 and Ae 1. The vector e 1 has length 1, while Ae 1 = (1, 4, 7) has length

More information

Math 4310 Handout - Quotient Vector Spaces

Math 4310 Handout - Quotient Vector Spaces Math 4310 Handout - Quotient Vector Spaces Dan Collins The textbook defines a subspace of a vector space in Chapter 4, but it avoids ever discussing the notion of a quotient space. This is understandable

More information

Matrix Representations of Linear Transformations and Changes of Coordinates

Matrix Representations of Linear Transformations and Changes of Coordinates Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under

More information

Section 4.4 Inner Product Spaces

Section 4.4 Inner Product Spaces Section 4.4 Inner Product Spaces In our discussion of vector spaces the specific nature of F as a field, other than the fact that it is a field, has played virtually no role. In this section we no longer

More information

1 VECTOR SPACES AND SUBSPACES

1 VECTOR SPACES AND SUBSPACES 1 VECTOR SPACES AND SUBSPACES What is a vector? Many are familiar with the concept of a vector as: Something which has magnitude and direction. an ordered pair or triple. a description for quantities such

More information

CHAPTER IV - BROWNIAN MOTION

CHAPTER IV - BROWNIAN MOTION CHAPTER IV - BROWNIAN MOTION JOSEPH G. CONLON 1. Construction of Brownian Motion There are two ways in which the idea of a Markov chain on a discrete state space can be generalized: (1) The discrete time

More information

FUNCTIONAL ANALYSIS LECTURE NOTES CHAPTER 2. OPERATORS ON HILBERT SPACES

FUNCTIONAL ANALYSIS LECTURE NOTES CHAPTER 2. OPERATORS ON HILBERT SPACES FUNCTIONAL ANALYSIS LECTURE NOTES CHAPTER 2. OPERATORS ON HILBERT SPACES CHRISTOPHER HEIL 1. Elementary Properties and Examples First recall the basic definitions regarding operators. Definition 1.1 (Continuous

More information

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2015 Timo Koski Matematisk statistik 24.09.2015 1 / 1 Learning outcomes Random vectors, mean vector, covariance matrix,

More information

1 Norms and Vector Spaces

1 Norms and Vector Spaces 008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)

More information

Recall that two vectors in are perpendicular or orthogonal provided that their dot

Recall that two vectors in are perpendicular or orthogonal provided that their dot Orthogonal Complements and Projections Recall that two vectors in are perpendicular or orthogonal provided that their dot product vanishes That is, if and only if Example 1 The vectors in are orthogonal

More information

Applied Linear Algebra I Review page 1

Applied Linear Algebra I Review page 1 Applied Linear Algebra Review 1 I. Determinants A. Definition of a determinant 1. Using sum a. Permutations i. Sign of a permutation ii. Cycle 2. Uniqueness of the determinant function in terms of properties

More information

Mathematical Methods of Engineering Analysis

Mathematical Methods of Engineering Analysis Mathematical Methods of Engineering Analysis Erhan Çinlar Robert J. Vanderbei February 2, 2000 Contents Sets and Functions 1 1 Sets................................... 1 Subsets.............................

More information

Chapter 6. Orthogonality

Chapter 6. Orthogonality 6.3 Orthogonal Matrices 1 Chapter 6. Orthogonality 6.3 Orthogonal Matrices Definition 6.4. An n n matrix A is orthogonal if A T A = I. Note. We will see that the columns of an orthogonal matrix must be

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix

October 3rd, 2012. Linear Algebra & Properties of the Covariance Matrix Linear Algebra & Properties of the Covariance Matrix October 3rd, 2012 Estimation of r and C Let rn 1, rn, t..., rn T be the historical return rates on the n th asset. rn 1 rṇ 2 r n =. r T n n = 1, 2,...,

More information

Let H and J be as in the above lemma. The result of the lemma shows that the integral

Let H and J be as in the above lemma. The result of the lemma shows that the integral Let and be as in the above lemma. The result of the lemma shows that the integral ( f(x, y)dy) dx is well defined; we denote it by f(x, y)dydx. By symmetry, also the integral ( f(x, y)dx) dy is well defined;

More information

Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics

Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics Undergraduate Notes in Mathematics Arkansas Tech University Department of Mathematics An Introductory Single Variable Real Analysis: A Learning Approach through Problem Solving Marcel B. Finan c All Rights

More information

1 Sets and Set Notation.

1 Sets and Set Notation. LINEAR ALGEBRA MATH 27.6 SPRING 23 (COHEN) LECTURE NOTES Sets and Set Notation. Definition (Naive Definition of a Set). A set is any collection of objects, called the elements of that set. We will most

More information

Lecture Notes on Measure Theory and Functional Analysis

Lecture Notes on Measure Theory and Functional Analysis Lecture Notes on Measure Theory and Functional Analysis P. Cannarsa & T. D Aprile Dipartimento di Matematica Università di Roma Tor Vergata cannarsa@mat.uniroma2.it daprile@mat.uniroma2.it aa 2006/07 Contents

More information

1. Prove that the empty set is a subset of every set.

1. Prove that the empty set is a subset of every set. 1. Prove that the empty set is a subset of every set. Basic Topology Written by Men-Gen Tsai email: b89902089@ntu.edu.tw Proof: For any element x of the empty set, x is also an element of every set since

More information

T ( a i x i ) = a i T (x i ).

T ( a i x i ) = a i T (x i ). Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

More information

and s n (x) f(x) for all x and s.t. s n is measurable if f is. REAL ANALYSIS Measures. A (positive) measure on a measurable space

and s n (x) f(x) for all x and s.t. s n is measurable if f is. REAL ANALYSIS Measures. A (positive) measure on a measurable space RAL ANALYSIS A survey of MA 641-643, UAB 1999-2000 M. Griesemer Throughout these notes m denotes Lebesgue measure. 1. Abstract Integration σ-algebras. A σ-algebra in X is a non-empty collection of subsets

More information

Chapter 5. Banach Spaces

Chapter 5. Banach Spaces 9 Chapter 5 Banach Spaces Many linear equations may be formulated in terms of a suitable linear operator acting on a Banach space. In this chapter, we study Banach spaces and linear operators acting on

More information

Inner product. Definition of inner product

Inner product. Definition of inner product Math 20F Linear Algebra Lecture 25 1 Inner product Review: Definition of inner product. Slide 1 Norm and distance. Orthogonal vectors. Orthogonal complement. Orthogonal basis. Definition of inner product

More information

MATH 551 - APPLIED MATRIX THEORY

MATH 551 - APPLIED MATRIX THEORY MATH 55 - APPLIED MATRIX THEORY FINAL TEST: SAMPLE with SOLUTIONS (25 points NAME: PROBLEM (3 points A web of 5 pages is described by a directed graph whose matrix is given by A Do the following ( points

More information

Math 550 Notes. Chapter 7. Jesse Crawford. Department of Mathematics Tarleton State University. Fall 2010

Math 550 Notes. Chapter 7. Jesse Crawford. Department of Mathematics Tarleton State University. Fall 2010 Math 550 Notes Chapter 7 Jesse Crawford Department of Mathematics Tarleton State University Fall 2010 (Tarleton State University) Math 550 Chapter 7 Fall 2010 1 / 34 Outline 1 Self-Adjoint and Normal Operators

More information

MEASURE AND INTEGRATION. Dietmar A. Salamon ETH Zürich

MEASURE AND INTEGRATION. Dietmar A. Salamon ETH Zürich MEASURE AND INTEGRATION Dietmar A. Salamon ETH Zürich 12 May 2016 ii Preface This book is based on notes for the lecture course Measure and Integration held at ETH Zürich in the spring semester 2014. Prerequisites

More information

Methods for Finding Bases

Methods for Finding Bases Methods for Finding Bases Bases for the subspaces of a matrix Row-reduction methods can be used to find bases. Let us now look at an example illustrating how to obtain bases for the row space, null space,

More information

2.3 Convex Constrained Optimization Problems

2.3 Convex Constrained Optimization Problems 42 CHAPTER 2. FUNDAMENTAL CONCEPTS IN CONVEX OPTIMIZATION Theorem 15 Let f : R n R and h : R R. Consider g(x) = h(f(x)) for all x R n. The function g is convex if either of the following two conditions

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets.

MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets. MATH 304 Linear Algebra Lecture 20: Inner product spaces. Orthogonal sets. Norm The notion of norm generalizes the notion of length of a vector in R n. Definition. Let V be a vector space. A function α

More information

Some stability results of parameter identification in a jump diffusion model

Some stability results of parameter identification in a jump diffusion model Some stability results of parameter identification in a jump diffusion model D. Düvelmeyer Technische Universität Chemnitz, Fakultät für Mathematik, 09107 Chemnitz, Germany Abstract In this paper we discuss

More information

1. Let X and Y be normed spaces and let T B(X, Y ).

1. Let X and Y be normed spaces and let T B(X, Y ). Uppsala Universitet Matematiska Institutionen Andreas Strömbergsson Prov i matematik Funktionalanalys Kurs: NVP, Frist. 2005-03-14 Skrivtid: 9 11.30 Tillåtna hjälpmedel: Manuella skrivdon, Kreyszigs bok

More information

Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013

Notes on Orthogonal and Symmetric Matrices MENU, Winter 2013 Notes on Orthogonal and Symmetric Matrices MENU, Winter 201 These notes summarize the main properties and uses of orthogonal and symmetric matrices. We covered quite a bit of material regarding these topics,

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

More information

Numerical Methods I Eigenvalue Problems

Numerical Methods I Eigenvalue Problems Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001, Fall 2010 September 30th, 2010 A. Donev (Courant Institute)

More information

Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

More information

4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION

4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION 4: EIGENVALUES, EIGENVECTORS, DIAGONALIZATION STEVEN HEILMAN Contents 1. Review 1 2. Diagonal Matrices 1 3. Eigenvectors and Eigenvalues 2 4. Characteristic Polynomial 4 5. Diagonalizability 6 6. Appendix:

More information

Introduction: Overview of Kernel Methods

Introduction: Overview of Kernel Methods Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University

More information

Moving Least Squares Approximation

Moving Least Squares Approximation Chapter 7 Moving Least Squares Approimation An alternative to radial basis function interpolation and approimation is the so-called moving least squares method. As we will see below, in this method the

More information

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution

SF2940: Probability theory Lecture 8: Multivariate Normal Distribution SF2940: Probability theory Lecture 8: Multivariate Normal Distribution Timo Koski 24.09.2014 Timo Koski () Mathematisk statistik 24.09.2014 1 / 75 Learning outcomes Random vectors, mean vector, covariance

More information

Differential Operators and their Adjoint Operators

Differential Operators and their Adjoint Operators Differential Operators and their Adjoint Operators Differential Operators inear functions from E n to E m may be described, once bases have been selected in both spaces ordinarily one uses the standard

More information

Finite Dimensional Hilbert Spaces and Linear Inverse Problems

Finite Dimensional Hilbert Spaces and Linear Inverse Problems Finite Dimensional Hilbert Spaces and Linear Inverse Problems ECE 174 Lecture Supplement Spring 2009 Ken Kreutz-Delgado Electrical and Computer Engineering Jacobs School of Engineering University of California,

More information

Lecture 1: Schur s Unitary Triangularization Theorem

Lecture 1: Schur s Unitary Triangularization Theorem Lecture 1: Schur s Unitary Triangularization Theorem This lecture introduces the notion of unitary equivalence and presents Schur s theorem and some of its consequences It roughly corresponds to Sections

More information

University of Lille I PC first year list of exercises n 7. Review

University of Lille I PC first year list of exercises n 7. Review University of Lille I PC first year list of exercises n 7 Review Exercise Solve the following systems in 4 different ways (by substitution, by the Gauss method, by inverting the matrix of coefficients

More information

Quotient Rings and Field Extensions

Quotient Rings and Field Extensions Chapter 5 Quotient Rings and Field Extensions In this chapter we describe a method for producing field extension of a given field. If F is a field, then a field extension is a field K that contains F.

More information

Notes on Symmetric Matrices

Notes on Symmetric Matrices CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.

More information

Continuity of the Perron Root

Continuity of the Perron Root Linear and Multilinear Algebra http://dx.doi.org/10.1080/03081087.2014.934233 ArXiv: 1407.7564 (http://arxiv.org/abs/1407.7564) Continuity of the Perron Root Carl D. Meyer Department of Mathematics, North

More information

Metric Spaces Joseph Muscat 2003 (Last revised May 2009)

Metric Spaces Joseph Muscat 2003 (Last revised May 2009) 1 Distance J Muscat 1 Metric Spaces Joseph Muscat 2003 (Last revised May 2009) (A revised and expanded version of these notes are now published by Springer.) 1 Distance A metric space can be thought of

More information

3. INNER PRODUCT SPACES

3. INNER PRODUCT SPACES . INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete

More information

LINEAR ALGEBRA. September 23, 2010

LINEAR ALGEBRA. September 23, 2010 LINEAR ALGEBRA September 3, 00 Contents 0. LU-decomposition.................................... 0. Inverses and Transposes................................. 0.3 Column Spaces and NullSpaces.............................

More information

Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

More information

F. ABTAHI and M. ZARRIN. (Communicated by J. Goldstein)

F. ABTAHI and M. ZARRIN. (Communicated by J. Goldstein) Journal of Algerian Mathematical Society Vol. 1, pp. 1 6 1 CONCERNING THE l p -CONJECTURE FOR DISCRETE SEMIGROUPS F. ABTAHI and M. ZARRIN (Communicated by J. Goldstein) Abstract. For 2 < p

More information

0 <β 1 let u(x) u(y) kuk u := sup u(x) and [u] β := sup

0 <β 1 let u(x) u(y) kuk u := sup u(x) and [u] β := sup 456 BRUCE K. DRIVER 24. Hölder Spaces Notation 24.1. Let Ω be an open subset of R d,bc(ω) and BC( Ω) be the bounded continuous functions on Ω and Ω respectively. By identifying f BC( Ω) with f Ω BC(Ω),

More information

INVARIANT METRICS WITH NONNEGATIVE CURVATURE ON COMPACT LIE GROUPS

INVARIANT METRICS WITH NONNEGATIVE CURVATURE ON COMPACT LIE GROUPS INVARIANT METRICS WITH NONNEGATIVE CURVATURE ON COMPACT LIE GROUPS NATHAN BROWN, RACHEL FINCK, MATTHEW SPENCER, KRISTOPHER TAPP, AND ZHONGTAO WU Abstract. We classify the left-invariant metrics with nonnegative

More information

Introduction to Topology

Introduction to Topology Introduction to Topology Tomoo Matsumura November 30, 2010 Contents 1 Topological spaces 3 1.1 Basis of a Topology......................................... 3 1.2 Comparing Topologies.......................................

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Linear Maps. Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007)

Linear Maps. Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007) MAT067 University of California, Davis Winter 2007 Linear Maps Isaiah Lankham, Bruno Nachtergaele, Anne Schilling (February 5, 2007) As we have discussed in the lecture on What is Linear Algebra? one of

More information

Linear Algebra. A vector space (over R) is an ordered quadruple. such that V is a set; 0 V ; and the following eight axioms hold:

Linear Algebra. A vector space (over R) is an ordered quadruple. such that V is a set; 0 V ; and the following eight axioms hold: Linear Algebra A vector space (over R) is an ordered quadruple (V, 0, α, µ) such that V is a set; 0 V ; and the following eight axioms hold: α : V V V and µ : R V V ; (i) α(α(u, v), w) = α(u, α(v, w)),

More information

Lecture L3 - Vectors, Matrices and Coordinate Transformations

Lecture L3 - Vectors, Matrices and Coordinate Transformations S. Widnall 16.07 Dynamics Fall 2009 Lecture notes based on J. Peraire Version 2.0 Lecture L3 - Vectors, Matrices and Coordinate Transformations By using vectors and defining appropriate operations between

More information

1 if 1 x 0 1 if 0 x 1

1 if 1 x 0 1 if 0 x 1 Chapter 3 Continuity In this chapter we begin by defining the fundamental notion of continuity for real valued functions of a single real variable. When trying to decide whether a given function is or

More information

Lecture 18 - Clifford Algebras and Spin groups

Lecture 18 - Clifford Algebras and Spin groups Lecture 18 - Clifford Algebras and Spin groups April 5, 2013 Reference: Lawson and Michelsohn, Spin Geometry. 1 Universal Property If V is a vector space over R or C, let q be any quadratic form, meaning

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number

More information

n k=1 k=0 1/k! = e. Example 6.4. The series 1/k 2 converges in R. Indeed, if s n = n then k=1 1/k, then s 2n s n = 1 n + 1 +...

n k=1 k=0 1/k! = e. Example 6.4. The series 1/k 2 converges in R. Indeed, if s n = n then k=1 1/k, then s 2n s n = 1 n + 1 +... 6 Series We call a normed space (X, ) a Banach space provided that every Cauchy sequence (x n ) in X converges. For example, R with the norm = is an example of Banach space. Now let (x n ) be a sequence

More information

[1] Diagonal factorization

[1] Diagonal factorization 8.03 LA.6: Diagonalization and Orthogonal Matrices [ Diagonal factorization [2 Solving systems of first order differential equations [3 Symmetric and Orthonormal Matrices [ Diagonal factorization Recall:

More information