Master MVA: Apprentissage par renforcement — Lecture 5
Sample complexity in reinforcement learning
Professor: Rémi Munos — munos/master-mva/
Bibliographic references: [LGM10b, MS08, MMLG10, ASM08]

Outline:
1. Azuma's inequality
2. Sample complexity of LSTD
3. Other results

1 Azuma's inequality

Azuma's inequality extends the Chernoff-Hoeffding inequality to random variables that may be dependent but form a martingale.

Proposition 1. Let $X_i \in [a_i, b_i]$ be random variables such that $\mathbb{E}[X_i \mid X_1, \dots, X_{i-1}] = 0$. Then
$$\mathbb{P}\Big( \Big|\sum_{i=1}^n X_i\Big| \ge \epsilon \Big) \le 2\, e^{-2\epsilon^2 / \sum_{i=1}^n (b_i - a_i)^2}. \qquad (1)$$
In other words, for any $\delta \in (0, 1]$, with probability at least $1 - \delta$,
$$\Big| \frac{1}{n} \sum_{i=1}^n X_i \Big| \le \sqrt{\frac{\log(2/\delta)}{2}}\; \frac{\sqrt{\frac{1}{n}\sum_{i=1}^n (b_i - a_i)^2}}{\sqrt{n}}.$$

Proof. We have:
$$\mathbb{P}\Big(\sum_{i=1}^n X_i \ge \epsilon\Big) = \mathbb{P}\big(e^{s \sum_{i=1}^n X_i} \ge e^{s\epsilon}\big) \le e^{-s\epsilon}\, \mathbb{E}\big[e^{s \sum_{i=1}^n X_i}\big], \quad \text{by Markov's inequality,}$$
$$= e^{-s\epsilon}\, \mathbb{E}_{X_1,\dots,X_{n-1}}\Big[ \mathbb{E}_{X_n}\big[ e^{s \sum_{i=1}^n X_i} \mid X_1, \dots, X_{n-1} \big] \Big]$$
$$= e^{-s\epsilon}\, \mathbb{E}_{X_1,\dots,X_{n-1}}\Big[ e^{s \sum_{i=1}^{n-1} X_i}\, \mathbb{E}_{X_n}\big[ e^{s X_n} \mid X_1, \dots, X_{n-1} \big] \Big]$$
$$\le e^{-s\epsilon + s^2 (b_n - a_n)^2/8}\, \mathbb{E}_{X_1,\dots,X_{n-1}}\big[ e^{s \sum_{i=1}^{n-1} X_i} \big], \quad \text{by Hoeffding's lemma,}$$
$$\le \dots \le e^{-s\epsilon + s^2 \sum_{i=1}^n (b_i - a_i)^2/8}.$$
Choosing $s = 4\epsilon / \sum_{i=1}^n (b_i - a_i)^2$, we deduce $\mathbb{P}\big(\sum_{i=1}^n X_i \ge \epsilon\big) \le e^{-2\epsilon^2/\sum_{i=1}^n (b_i - a_i)^2}$. Repeating the same computation for $\mathbb{P}\big(\sum_{i=1}^n X_i \le -\epsilon\big)$ yields (1).
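As a sanity check on Proposition 1, here is a small numerical sketch (not part of the original notes; all variable names and the particular martingale below are illustrative choices). It simulates a bounded martingale difference sequence whose increments depend on the past, estimates the tail probability $\mathbb{P}(|\sum_i X_i| \ge \epsilon)$ by Monte Carlo, and compares it with the bound $2 e^{-2\epsilon^2/\sum_i (b_i - a_i)^2}$ of Equation (1).

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_runs, eps = 200, 100_000, 30.0

# Martingale differences X_i = sign_i * w_i: the Rademacher sign is independent of
# the past, while the weight w_i in (0, 1] depends on the past partial sum, so
# E[X_i | X_1, ..., X_{i-1}] = 0 and X_i lies in [a_i, b_i] = [-1, 1].
S = np.zeros(n_runs)                                   # one partial sum per run
for i in range(1, n + 1):
    w = 1.0 / (1.0 + np.abs(S) / i)
    S += rng.choice([-1.0, 1.0], size=n_runs) * w

empirical = np.mean(np.abs(S) >= eps)
azuma = 2 * np.exp(-2 * eps**2 / (4 * n))              # sum_i (b_i - a_i)^2 = 4n here
print(f"P(|S_n| >= {eps:.0f}) ~= {empirical:.5f} <= Azuma bound {azuma:.3f}")
```

The empirical tail comes out far below the bound, as expected: Azuma's inequality only uses the range of the increments, not their (smaller) conditional variance.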

2 Sample complexity of LSTD

2.1 Pathwise LSTD

We follow a fixed policy $\pi$. Our goal is to approximate the value function $V^\pi$ (written $V$, dropping the reference to $\pi$ to simplify notation). We use a linear approximation space $\mathcal{F}$ spanned by a set of $d$ basis functions $\varphi_i : \mathcal{X} \to \mathbb{R}$. We denote by $\phi : \mathcal{X} \to \mathbb{R}^d$, $\phi(\cdot) = \big(\varphi_1(\cdot), \dots, \varphi_d(\cdot)\big)^\top$ the feature vector. Thus $\mathcal{F} = \{ f_\alpha \mid \alpha \in \mathbb{R}^d \text{ and } f_\alpha(\cdot) = \phi(\cdot)^\top \alpha \}$.

Let $(X_1, \dots, X_n)$ be a sample path (trajectory) of size $n$ generated by following policy $\pi$. Let $v \in \mathbb{R}^n$ and $r \in \mathbb{R}^n$ be such that $v_t = V(X_t)$ and $r_t = R(X_t)$: the value vector and the reward vector, respectively. Also, let $\Phi = [\phi(X_1)^\top; \dots; \phi(X_n)^\top]$ be the feature matrix defined at the states, and $\mathcal{F}_n = \{\Phi\alpha,\ \alpha \in \mathbb{R}^d\} \subset \mathbb{R}^n$ the corresponding vector space. We denote by $\Pi : \mathbb{R}^n \to \mathcal{F}_n$ the empirical orthogonal projection onto $\mathcal{F}_n$, defined as $\Pi y = \arg\min_{z \in \mathcal{F}_n} \|y - z\|_n$, where $\|y\|_n^2 = \frac{1}{n} \sum_{t=1}^n y_t^2$. Note that $\Pi$ is a non-expansive mapping w.r.t. the $\ell_2$-norm $\|\cdot\|_n$: $\|\Pi y - \Pi z\|_n \le \|y - z\|_n$.

Define the empirical Bellman operator $\hat{T} : \mathbb{R}^n \to \mathbb{R}^n$ as
$$(\hat{T} y)_t = \begin{cases} r_t + \gamma y_{t+1} & t < n, \\ r_t & t = n. \end{cases}$$

Proposition 2. The operator $\Pi \hat{T}$ is a contraction in $\ell_2$-norm, and thus possesses a unique fixed point $\hat{v}$.

Proof. Note that by defining the operator $\hat{P} : \mathbb{R}^n \to \mathbb{R}^n$ as $(\hat{P} y)_t = y_{t+1}$ for $t < n$ and $(\hat{P} y)_n = 0$, we have $\hat{T} y = r + \gamma \hat{P} y$. The empirical Bellman operator is a $\gamma$-contraction in $\ell_2$-norm since, for any $y, z \in \mathbb{R}^n$, we have
$$\|\hat{T} y - \hat{T} z\|_n = \gamma \|\hat{P}(y - z)\|_n \le \gamma \|y - z\|_n.$$
Now, since the orthogonal projection $\Pi$ is non-expansive w.r.t. the $\ell_2$-norm, by the Banach fixed-point theorem there exists a unique fixed point $\hat{v}$ of the mapping $\Pi \hat{T}$, i.e., $\hat{v} = \Pi \hat{T} \hat{v}$.

Since $\hat{v}$ is the unique fixed point of $\Pi \hat{T}$, the vector $\hat{v} - \hat{T}\hat{v}$ is perpendicular to the space $\mathcal{F}_n$, and thus $\Phi^\top(\hat{v} - \hat{T}\hat{v}) = 0$. Replacing $\hat{v}$ by $\Phi\alpha$, we obtain $\Phi^\top \Phi \alpha = \Phi^\top(r + \gamma \hat{P} \Phi \alpha)$, and then
$$\underbrace{\Phi^\top (I - \gamma \hat{P}) \Phi}_{A}\, \alpha = \underbrace{\Phi^\top r}_{b}.$$
Therefore, by setting
$$A_{i,j} = \sum_{t=1}^{n-1} \varphi_i(X_t)\big[\varphi_j(X_t) - \gamma \varphi_j(X_{t+1})\big] + \varphi_i(X_n)\varphi_j(X_n), \qquad b_i = \sum_{t=1}^{n} \varphi_i(X_t)\, r_t,$$
we have that the system $A\alpha = b$ always has at least one solution (since the fixed point $\hat{v}$ exists), and we call the solution with minimal norm, $\hat{\alpha} = A^+ b$, where $A^+$ is the Moore-Penrose pseudo-inverse of $A$, the pathwise LSTD solution.
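The construction above translates directly into a few lines of linear algebra. Below is a minimal NumPy sketch of the pathwise LSTD solution $\hat{\alpha} = A^+ b$ (not part of the original notes; the function and variable names, the toy chain and the radial-basis features are illustrative assumptions).

```python
import numpy as np

def pathwise_lstd(states, rewards, phi, gamma):
    """Pathwise LSTD on a single trajectory: build A and b as in Section 2.1
    and return the minimal-norm solution alpha_hat = A^+ b."""
    Phi = np.array([phi(x) for x in states])                  # n x d feature matrix
    # A = Phi^T (I - gamma * P_hat) Phi, written entry-wise:
    # sum_{t<n} phi(X_t)(phi(X_t) - gamma phi(X_{t+1}))^T + phi(X_n) phi(X_n)^T
    A = Phi[:-1].T @ (Phi[:-1] - gamma * Phi[1:]) + np.outer(Phi[-1], Phi[-1])
    b = Phi.T @ rewards
    alpha_hat = np.linalg.pinv(A) @ b                         # Moore-Penrose pseudo-inverse
    return alpha_hat, Phi @ alpha_hat                         # coefficients and fitted values

# Toy usage: a reflecting random walk on {0, ..., 19} with Gaussian-bump features.
rng = np.random.default_rng(0)
K, d, n, gamma = 20, 5, 1000, 0.9
centers = np.linspace(0, K - 1, d)
phi = lambda x: np.exp(-0.5 * ((x - centers) / 3.0) ** 2)
X = [K // 2]
for _ in range(n - 1):
    X.append(min(max(X[-1] + rng.choice([-1, 1]), 0), K - 1))
r = np.array([1.0 if x == K - 1 else 0.0 for x in X])
alpha_hat, v_hat = pathwise_lstd(X, r, phi, gamma)
```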

2.2 Performance Bound

Here we derive a bound for the performance of $\hat{v}$ evaluated on the states of the trajectory used by the pathwise LSTD algorithm.

Theorem 1. Let $X_1, \dots, X_n$ be a trajectory of the Markov chain, and $v, \hat{v} \in \mathbb{R}^n$ be the vectors whose components are the value function and the pathwise LSTD solution at $\{X_t\}_{t=1}^n$, respectively. Then with probability $1 - \delta$ (the probability is w.r.t. the random trajectory), we have
$$\|\hat{v} - v\|_n \le \frac{1}{\sqrt{1 - \gamma^2}}\, \|v - \Pi v\|_n + \frac{1}{1 - \gamma}\, \gamma V_{\max} L \sqrt{\frac{d}{\nu_n\, n}} \Big( \sqrt{8 \log(2d/\delta)} + \frac{1}{\sqrt{n}} \Big), \qquad (2)$$
where the random variable $\nu_n$ is the smallest strictly-positive eigenvalue of the sample-based Gram matrix $\frac{1}{n}\Phi^\top\Phi$, $V_{\max}$ is a bound on the value function, and $L$ is a bound on the basis functions ($|\varphi_i(x)| \le L$).

Remark 2. When the eigenvalues of the sample-based Gram matrix $\frac{1}{n}\Phi^\top\Phi$ are all non-zero, $\Phi^\top\Phi$ is invertible, and thus $\Pi = \Phi(\Phi^\top\Phi)^{-1}\Phi^\top$. In this case, the uniqueness of $\hat{v}$ implies the uniqueness of $\hat{\alpha}$ since
$$\hat{v} = \Phi\hat{\alpha} \;\Longrightarrow\; \Phi^\top\hat{v} = \Phi^\top\Phi\hat{\alpha} \;\Longrightarrow\; \hat{\alpha} = (\Phi^\top\Phi)^{-1}\Phi^\top\hat{v}.$$
On the other hand, when the sample-based Gram matrix $\frac{1}{n}\Phi^\top\Phi$ is not invertible, the system $A\alpha = b$ may have many solutions. Among all the possible solutions, one may choose the one with minimal norm: $\hat{\alpha} = A^+ b$.

Remark 3. Theorem 1 provides a bound without any reference to the stationary distribution of the Markov chain. In fact, the bound of Equation (2) holds even when the chain does not possess a stationary distribution. For example, consider a Markov chain on the real line where the transitions always move the states to the right, i.e., $p(X_{t+1} \in dy \mid X_t = x) = 0$ for $y \le x$. For simplicity, assume that the value function $V$ is bounded and belongs to $\mathcal{F}$. This Markov chain is not recurrent, and thus does not have a stationary distribution. We also assume that the feature vectors $\phi(X_1), \dots, \phi(X_n)$ are sufficiently independent, so that the eigenvalues of $\frac{1}{n}\Phi^\top\Phi$ are greater than some $\nu > 0$. Then, according to Theorem 1, pathwise LSTD is able to estimate the value function at the samples at a rate $O(1/\sqrt{n})$. This may seem surprising because at each state $X_t$ the algorithm is only provided with a noisy estimate of the expected value of the next state. However, the estimates are unbiased conditioned on the current state, and we will see in the proof that, using a concentration inequality for martingales, pathwise LSTD is able to learn a good estimate of the value function at a state $X_t$ using noisy pieces of information at other states that may be far away from $X_t$. In other words, learning the value function at a given state does not require making an average over many samples close to that state. This implies that LSTD does not require the Markov chain to possess a stationary distribution.

In order to prove Theorem 1, we first introduce the model of regression with Markov design, and then state and prove a lemma about this model.

Definition 1. The model of regression with Markov design is a regression problem where the data $(X_t, Y_t)_{1 \le t \le n}$ are generated according to the following model: $X_1, \dots, X_n$ is a sample path generated by a Markov chain, $Y_t = f(X_t) + \xi_t$, where $f$ is the target function, and the noise term $\xi_t$ is a random variable which is adapted to the filtration generated by $X_1, \dots, X_{t+1}$ and is such that
$$|\xi_t| \le C \quad \text{and} \quad \mathbb{E}[\xi_t \mid X_1, \dots, X_t] = 0. \qquad (3)$$

The next lemma reports a risk bound for the Markov design setting.

Lemma 1 (Regression bound for the Markov design setting). Let $\hat{w} \in \mathcal{F}_n$ be the least-squares estimate of the (noisy) values $Y = \{Y_t\}_{t=1}^n$, i.e., $\hat{w} = \Pi Y$, and $w \in \mathcal{F}_n$ be the least-squares estimate of the (noiseless) values $Z = \{Z_t = f(X_t)\}_{t=1}^n$, i.e., $w = \Pi Z$. Then for any $\delta > 0$, with probability at least $1 - \delta$ (the probability is w.r.t. the random sample path $X_1, \dots, X_n$), we have
$$\|\hat{w} - w\|_n \le CL \sqrt{\frac{2 d \log(2d/\delta)}{n\, \nu_n}}, \qquad (4)$$
where $\nu_n$ is the smallest strictly-positive eigenvalue of the sample-based Gram matrix $\frac{1}{n}\Phi^\top\Phi$.

[Figure 1 (not reproduced): the components used in Lemma 1 and its proof, such as $w$, $\hat{w}$, $\xi$, and $\hat{\xi}$, and the fact that $\langle \hat{\xi}, \xi \rangle_n = \|\hat{\xi}\|_n^2$.]

Proof. We define $\xi \in \mathbb{R}^n$ to be the vector with components $\xi_t$, and $\hat{\xi} = \hat{w} - w = \Pi(Y - Z) = \Pi\xi$. Since the projection is orthogonal, we have $\langle \hat{\xi}, \xi \rangle_n = \|\hat{\xi}\|_n^2$ (see Figure 1). Since $\hat{\xi} \in \mathcal{F}_n$, there exists at least one $\alpha \in \mathbb{R}^d$ such that $\hat{\xi} = \Phi\alpha$, so by the Cauchy-Schwarz inequality we have
$$\|\hat{\xi}\|_n^2 = \langle \hat{\xi}, \xi \rangle_n = \frac{1}{n} \sum_{i=1}^d \alpha_i \sum_{t=1}^n \xi_t \varphi_i(X_t) \le \frac{1}{n} \|\alpha\| \Big[ \sum_{i=1}^d \Big( \sum_{t=1}^n \xi_t \varphi_i(X_t) \Big)^2 \Big]^{1/2}. \qquad (5)$$
Now, among the vectors $\alpha$ such that $\hat{\xi} = \Phi\alpha$, we define $\hat{\alpha}$ to be the one with minimal $\ell_2$-norm, i.e., $\hat{\alpha} = \Phi^+ \hat{\xi}$. Let $K$ denote the null space of $\Phi$, which is also the null space of $\frac{1}{n}\Phi^\top\Phi$. Then $\hat{\alpha}$ can be decomposed as $\hat{\alpha} = \hat{\alpha}_K + \hat{\alpha}_{K^\perp}$, where $\hat{\alpha}_K \in K$ and $\hat{\alpha}_{K^\perp} \in K^\perp$, and because the decomposition is orthogonal, we have $\|\hat{\alpha}\|^2 = \|\hat{\alpha}_K\|^2 + \|\hat{\alpha}_{K^\perp}\|^2$. Since $\hat{\alpha}$ is of minimal norm among all the vectors $\alpha$ such that $\hat{\xi} = \Phi\alpha$, its component in $K$ must be zero, thus $\hat{\alpha} \in K^\perp$. The Gram matrix $\frac{1}{n}\Phi^\top\Phi$ is positive semidefinite, thus its eigenvectors corresponding to zero eigenvalues generate $K$ and the other eigenvectors generate its orthogonal complement $K^\perp$. Therefore, from the assumption that the smallest strictly-positive eigenvalue of $\frac{1}{n}\Phi^\top\Phi$ is $\nu_n$, we deduce that, since $\hat{\alpha} \in K^\perp$,
$$\|\hat{\xi}\|_n^2 = \frac{1}{n}\, \hat{\alpha}^\top \Phi^\top\Phi\, \hat{\alpha} \ge \nu_n\, \hat{\alpha}^\top\hat{\alpha} = \nu_n \|\hat{\alpha}\|^2. \qquad (6)$$
By using the result of Equation (6) in Equation (5), we obtain
$$\|\hat{\xi}\|_n \le \frac{1}{n \sqrt{\nu_n}} \Big[ \sum_{i=1}^d \Big( \sum_{t=1}^n \xi_t \varphi_i(X_t) \Big)^2 \Big]^{1/2}. \qquad (7)$$

Now, from Equation (3), we have that for any $i = 1, \dots, d$,
$$\mathbb{E}[\xi_t \varphi_i(X_t) \mid X_1, \dots, X_t] = \varphi_i(X_t)\, \mathbb{E}[\xi_t \mid X_1, \dots, X_t] = 0,$$
and since $\xi_t \varphi_i(X_t)$ is adapted to the filtration generated by $X_1, \dots, X_{t+1}$, it is a martingale difference sequence w.r.t. that filtration. Thus one may apply Azuma's inequality to deduce that, with probability $1 - \delta$,
$$\Big| \sum_{t=1}^n \xi_t \varphi_i(X_t) \Big| \le CL \sqrt{2 n \log(2/\delta)},$$
where we used that $|\xi_t \varphi_i(X_t)| \le CL$ for any $i$ and $t$. By a union bound over all features, we have that with probability $1 - \delta$, for all $1 \le i \le d$,
$$\Big| \sum_{t=1}^n \xi_t \varphi_i(X_t) \Big| \le CL \sqrt{2 n \log(2d/\delta)}. \qquad (8)$$
The result follows by combining Equations (8) and (7).

Remarks about this Lemma. In the Markov design model considered in this lemma, the states $\{X_t\}_{t=1}^n$ are random variables generated according to the Markov chain, and the noise term $\xi_t$ may depend on the next state $X_{t+1}$ (but should be centered conditioned on the past states $X_1, \dots, X_t$). This lemma will be used in order to prove Theorem 1, where we replace the target function $f$ with the value function $V$, and the noise term $\xi_t$ with the temporal difference $r(X_t) + \gamma V(X_{t+1}) - V(X_t)$. Note that this lemma is an extension of the bound for the model of regression with deterministic design, in which the states $\{X_t\}_{t=1}^n$ are fixed and the noise terms $\xi_t$ are independent. In deterministic design, usual concentration results provide high-probability bounds similar to Equation (4), but without the dependence on $\nu_n$. An open question is whether it is possible to remove $\nu_n$ in the bound for the Markov design regression setting.
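To make the Markov design setting concrete, here is a small simulation (not part of the original notes; the chain, features, target and noise below are illustrative choices, and the right-hand side is evaluated with the bound as reconstructed in Equation (4)). The noise $\xi_t$ depends on the next state $X_{t+1}$ but is centered given the past, as required by Equation (3).

```python
import numpy as np

rng = np.random.default_rng(1)
K, d, n, delta = 50, 6, 5000, 0.05

# Random walk on the circle Z_K: the Markov chain of the "Markov design" model.
X = np.zeros(n, dtype=int)
for t in range(n - 1):
    X[t + 1] = (X[t] + rng.choice([-1, 1])) % K

# Bounded features (one Gaussian bump per center), target f, and a noise term
# xi_t that depends on X_{t+1} but satisfies E[xi_t | X_1, ..., X_t] = 0.
centers = np.linspace(0, K, d, endpoint=False)
dist = np.minimum(np.abs(X[:, None] - centers), K - np.abs(X[:, None] - centers))
Phi = np.exp(-0.5 * (dist / 5.0) ** 2)                      # n x d, entries in (0, 1]
L = 1.0                                                      # |phi_i(x)| <= L
f = lambda x: np.sin(2 * np.pi * x / K)
g = lambda x: np.cos(2 * np.pi * x / K)
Z = f(X)
xi = np.zeros(n)
xi[:-1] = g(X[1:]) - 0.5 * (g((X[:-1] + 1) % K) + g((X[:-1] - 1) % K))
C = 2.0                                                      # |xi_t| <= C
Y = Z + xi

# Empirical projections w_hat = Pi Y and w = Pi Z (least-squares fits in F_n).
w_hat = Phi @ np.linalg.lstsq(Phi, Y, rcond=None)[0]
w = Phi @ np.linalg.lstsq(Phi, Z, rcond=None)[0]
err = np.sqrt(np.mean((w_hat - w) ** 2))                     # ||w_hat - w||_n

# Lemma 1 bound: C L sqrt(2 d log(2d/delta) / (n nu_n)).
eigs = np.linalg.eigvalsh(Phi.T @ Phi / n)
nu_n = eigs[eigs > 1e-10].min()                              # smallest strictly-positive eigenvalue
bound = C * L * np.sqrt(2 * d * np.log(2 * d / delta) / (n * nu_n))
print(f"||w_hat - w||_n = {err:.4f}  <=  bound = {bound:.4f}")
```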

[Figure 2 (not reproduced): the space $\mathbb{R}^n$, the linear vector subspace $\mathcal{F}_n$, and some vectors used in the proof of Theorem 1, such as $v$, $\hat{T}v$, $\hat{T}\hat{v}$, $\Pi v$, $\Pi\hat{T}v$, and $\hat{v} = \Pi\hat{T}\hat{v}$.]

Proof of Theorem 1.

Step 1: Using the Pythagorean theorem and the triangle inequality, we have (see Figure 2)
$$\|\hat{v} - v\|_n^2 = \|v - \Pi v\|_n^2 + \|\hat{v} - \Pi v\|_n^2 \le \|v - \Pi v\|_n^2 + \big( \|\hat{v} - \Pi\hat{T}v\|_n + \|\Pi\hat{T}v - \Pi v\|_n \big)^2. \qquad (9)$$
From the $\gamma$-contraction of the operator $\Pi\hat{T}$ and the fact that $\hat{v}$ is its unique fixed point, we obtain
$$\|\hat{v} - \Pi\hat{T}v\|_n = \|\Pi\hat{T}\hat{v} - \Pi\hat{T}v\|_n \le \gamma \|\hat{v} - v\|_n. \qquad (10)$$
Thus from Equations (9) and (10), we have
$$\|\hat{v} - v\|_n^2 \le \|v - \Pi v\|_n^2 + \big( \gamma\|\hat{v} - v\|_n + \|\Pi\hat{T}v - \Pi v\|_n \big)^2. \qquad (11)$$

Step 2: We now provide a high-probability bound on $\|\Pi\hat{T}v - \Pi v\|_n$. This is a consequence of Lemma 1 applied to the vectors $Y = \hat{T}v$ and $Z = v$. Since $v$ is the value function at the points $\{X_t\}_{t=1}^n$, from the definition of the pathwise Bellman operator, we have that for $1 \le t \le n - 1$,
$$\xi_t = y_t - v_t = r(X_t) + \gamma V(X_{t+1}) - V(X_t) = \gamma \Big[ V(X_{t+1}) - \int P(dy \mid X_t)\, V(y) \Big],$$
and $\xi_n = y_n - v_n = -\gamma \int P(dy \mid X_n)\, V(y)$. Thus, Equation (3) holds for $1 \le t \le n$. Here we may choose $C = 2\gamma V_{\max}$ for a bound on $\xi_t$, $t < n$, and $C = \gamma V_{\max}$ for a bound on $\xi_n$. Azuma's inequality may only be applied to the sequence of the first $n - 1$ terms (the $n$-th term adds a contribution of $\gamma V_{\max} L$ to the bound), thus instead of Equation (8) we obtain
$$\Big| \sum_{t=1}^n \xi_t \varphi_i(X_t) \Big| \le \gamma V_{\max} L \big( 2\sqrt{2 n \log(2d/\delta)} + 1 \big),$$
with probability $1 - \delta$, for all $1 \le i \le d$. Combining with Equation (7), we deduce that with probability $1 - \delta$, we have
$$\|\Pi\hat{T}v - \Pi v\|_n \le \gamma V_{\max} L \sqrt{\frac{d}{\nu_n\, n}} \Big( \sqrt{8 \log(2d/\delta)} + \frac{1}{\sqrt{n}} \Big), \qquad (12)$$
where $\nu_n$ is the smallest strictly-positive eigenvalue of $\frac{1}{n}\Phi^\top\Phi$. The claim follows by combining Equations (11) and (12), and solving the resulting inequality for $\|\hat{v} - v\|_n$.

2.3 Generalization bound

When the Markov chain is ergodic (say $\beta$-mixing) and possesses a stationary distribution $\mu$, then it is possible to derive generalization bounds of the form: with probability $1 - \delta$,
$$\|\hat{V} - V\|_\mu \le \frac{c}{\sqrt{1 - \gamma^2}}\, \inf_{f \in \mathcal{F}} \|V - f\|_\mu + O\Big( \sqrt{\frac{d \log(d/\delta)}{n\, \nu}} \Big),$$
which provides a bound expressed in terms of:
- the best possible approximation of $V$ in $\mathcal{F}$ measured with $\mu$,
- the smallest eigenvalue $\nu$ of the Gram matrix $\big( \int \varphi_i \varphi_j\, d\mu \big)_{i,j}$,
- the $\beta$-mixing coefficients of the chain (hidden in the $O$ notation).
(See [Lazaric, Ghavamzadeh, Munos, Finite-sample analysis of LSTD, 2010].)
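As an end-to-end illustration of Section 2 (again an illustrative sketch, not part of the original notes: the chain, features and constants are assumptions, and the right-hand side is evaluated with the bound as reconstructed in Equation (2), which is typically very loose numerically), the snippet below computes the exact value function of a small chain, runs pathwise LSTD on a sampled trajectory, and evaluates both sides of the bound.

```python
import numpy as np

rng = np.random.default_rng(2)
K, d, n, gamma, delta = 20, 5, 2000, 0.9, 0.05

# Reflecting random walk on {0, ..., K-1}; reward 1 at the right end.
P = np.zeros((K, K))
for x in range(K):
    P[x, max(x - 1, 0)] += 0.5
    P[x, min(x + 1, K - 1)] += 0.5
R = np.zeros(K)
R[K - 1] = 1.0
V = np.linalg.solve(np.eye(K) - gamma * P, R)        # true value function
V_max = np.abs(V).max()

# Features bounded by L = 1 and a sampled trajectory.
centers = np.linspace(0, K - 1, d)
feat = lambda x: np.exp(-0.5 * ((x - centers) / 3.0) ** 2)
L = 1.0
X = np.zeros(n, dtype=int)
X[0] = K // 2
for t in range(n - 1):
    X[t + 1] = rng.choice(K, p=P[X[t]])
Phi = np.array([feat(x) for x in X])
r = R[X]

# Pathwise LSTD solution (same construction as in Section 2.1).
A = Phi[:-1].T @ (Phi[:-1] - gamma * Phi[1:]) + np.outer(Phi[-1], Phi[-1])
alpha_hat = np.linalg.pinv(A) @ (Phi.T @ r)
v_hat = Phi @ alpha_hat

# Quantities appearing in the bound of Theorem 1.
v = V[X]
approx = np.sqrt(np.mean((v - Phi @ np.linalg.lstsq(Phi, v, rcond=None)[0]) ** 2))  # ||v - Pi v||_n
eigs = np.linalg.eigvalsh(Phi.T @ Phi / n)
nu_n = eigs[eigs > 1e-10].min()
estim = gamma * V_max * L * np.sqrt(d / (nu_n * n)) * (np.sqrt(8 * np.log(2 * d / delta)) + 1 / np.sqrt(n))
rhs = approx / np.sqrt(1 - gamma ** 2) + estim / (1 - gamma)
lhs = np.sqrt(np.mean((v_hat - v) ** 2))             # ||v_hat - v||_n
print(f"||v_hat - v||_n = {lhs:.3f}   bound = {rhs:.3f}")
```

The measured error is far below the bound, which mostly reflects the conservative constants and the $1/(1-\gamma)$ factor rather than the behavior of the algorithm.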

3 Other results

Similar results have been obtained for different algorithms:
- Approximate value iteration [MS08]
- Policy iteration with Bellman residual minimization [MMLG10]
- Policy iteration with modified Bellman residual minimization [ASM08]
- Classification-based policy iteration [LGM10a]

But many open problems remain...

References

[ASM08] A. Antos, Cs. Szepesvári, and R. Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning Journal, 71:89-129, 2008.

[LGM10a] A. Lazaric, M. Ghavamzadeh, and R. Munos. Analysis of a classification-based policy iteration algorithm. In International Conference on Machine Learning, 2010.

[LGM10b] A. Lazaric, M. Ghavamzadeh, and R. Munos. Finite-sample analysis of LSTD. In International Conference on Machine Learning, pages 615-622, 2010.

[MMLG10] O. A. Maillard, R. Munos, A. Lazaric, and M. Ghavamzadeh. Finite-sample analysis of Bellman residual minimization. In Masashi Sugiyama and Qiang Yang, editors, Asian Conference on Machine Learning, JMLR: Workshop and Conference Proceedings, volume 13, 2010.

[MS08] R. Munos and Cs. Szepesvári. Finite time bounds for sampling based fitted value iteration. Journal of Machine Learning Research, 9:815-857, 2008.
