MATRICES, PARTIAL DERIVATIVES AND THE CHAIN RULE

STEFAN GESCHKE

1. The dot-product in higher dimensions

The definition of the dot-product can easily be extended to dimensions > 3.

Definition 1.1. If x = (x_1, ..., x_n) and y = (y_1, ..., y_n) are vectors in R^n, then the dot-product x · y is defined by

\[ x \cdot y = (x_1, \dots, x_n) \cdot (y_1, \dots, y_n) = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + \dots + x_n y_n. \]

Note that the dot-product of two vectors is a real number.

Example 1.2. We compute the dot-product of two vectors in R^4:

\[ (1, 2, 3, 4) \cdot (-1, 2, 0, -1) = 1 \cdot (-1) + 2 \cdot 2 + 3 \cdot 0 + 4 \cdot (-1) = -1 + 4 + 0 - 4 = -1. \]

The dot-product in dimension n behaves just as well as the dot-product in dimension 3.

Theorem 1.3. Let x, y, z ∈ R^n and let λ ∈ R. Then the following hold:

(1) x · y = y · x
(2) x · (y + z) = x · y + x · z
(3) (x + y) · z = x · z + y · z
(4) (λx) · y = x · (λy) = λ(x · y)
(5) 0 · x = x · 0 = 0

2. Matrices

Definition 2.1. Let m and n be natural numbers (positive integers). An m-by-n matrix is an array

\[ \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \]

of real numbers with m rows and n columns. Each entry has two indices, the first denoting the row and the second the column. The matrix above is often denoted by (a_ij)_{1 ≤ i ≤ m, 1 ≤ j ≤ n}, or just (a_ij) if m and n are clear from the context.

If A = (a_ij)_{1 ≤ i ≤ m, 1 ≤ j ≤ n} and B = (b_ij)_{1 ≤ i ≤ m, 1 ≤ j ≤ n} are matrices of the same format, then their sum A + B is defined componentwise, i.e.,

\[ A + B = (a_{ij} + b_{ij})_{1 \le i \le m,\, 1 \le j \le n}. \]
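For readers who want to check such computations by machine, here is a minimal Python sketch (an illustration added to this transcription, not part of the original notes) of the dot-product of Definition 1.1 and the componentwise matrix sum of Definition 2.1; the helper names dot and matadd are arbitrary.

    # Dot-product of two vectors in R^n (Definition 1.1): x . y = x_1*y_1 + ... + x_n*y_n.
    def dot(x, y):
        assert len(x) == len(y), "vectors must have the same dimension"
        return sum(x_i * y_i for x_i, y_i in zip(x, y))

    # Componentwise sum of two m-by-n matrices given as lists of rows (Definition 2.1).
    def matadd(A, B):
        return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

    # Example 1.2: (1, 2, 3, 4) . (-1, 2, 0, -1) = -1 + 4 + 0 - 4 = -1.
    print(dot((1, 2, 3, 4), (-1, 2, 0, -1)))           # prints -1
    print(matadd([[1, 2], [3, 4]], [[0, 1], [1, 0]]))  # prints [[1, 3], [4, 4]]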

Example 2.2. a) The vector (1, 2, 3) is a 1-by-3 matrix. The array

\[ \begin{pmatrix} 1.5 & 2 \\ \pi & 4 \\ 5 & e \end{pmatrix} \]

is a 3-by-2 matrix.

b)

\[ \begin{pmatrix} 1.5 & 2 \\ \pi & 4 \\ 5 & e \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & e \end{pmatrix} = \begin{pmatrix} 1.5 + 1 & 2 + 0 \\ \pi + 0 & 4 + 1 \\ 5 + 1 & e + e \end{pmatrix} = \begin{pmatrix} 2.5 & 2 \\ \pi & 5 \\ 6 & 2e \end{pmatrix} \]

Using the dot-product, we can define products of matrices of suitable formats.

Definition 2.3. Let l, m, n be natural numbers. If A = (a_ij)_{1 ≤ i ≤ l, 1 ≤ j ≤ m} and B = (b_jk)_{1 ≤ j ≤ m, 1 ≤ k ≤ n} are matrices, then the product A · B is defined to be the l-by-n matrix C = (c_ik)_{1 ≤ i ≤ l, 1 ≤ k ≤ n} where

\[ c_{ik} = \sum_{j=1}^{m} a_{ij} b_{jk} = a_{i1} b_{1k} + \dots + a_{im} b_{mk}. \]

In other words, A · B is the matrix whose entry in the i-th row and k-th column is the dot-product of the i-th row of A and the k-th column of B.

Note that the product A · B can only be formed if A has as many columns as B has rows. Moreover, if A is a 1-by-n matrix (a_1 a_2 ... a_n) and B is an n-by-1 matrix

\[ \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}, \]

then A · B is simply the dot-product (a_1, ..., a_n) · (b_1, ..., b_n).
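The entry formula of Definition 2.3 translates directly into code. The following Python sketch (again an added illustration, not part of the original notes) multiplies two matrices given as lists of rows and shows, on a small 2-by-2 example of our own choosing, that the two orders of multiplication give different results.

    # Product of an l-by-m matrix A and an m-by-n matrix B (Definition 2.3):
    # the (i, k) entry is the dot-product of the i-th row of A and the k-th column of B.
    def matmul(A, B):
        l, m, n = len(A), len(B), len(B[0])
        assert all(len(row) == m for row in A), "A must have as many columns as B has rows"
        return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(n)] for i in range(l)]

    A = [[1, 2], [0, 1]]
    B = [[0, 1], [2, 1]]
    print(matmul(A, B))  # [[4, 3], [2, 1]]
    print(matmul(B, A))  # [[0, 1], [2, 5]] -- different, so matrix multiplication is not commutative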

Example 2.4. a)

\[ \begin{pmatrix} e & \pi \\ 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 0 & 1 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} e \cdot 0 + \pi \cdot 2 & e \cdot 1 + \pi \cdot 1 \\ 0 \cdot 0 + 1 \cdot 2 & 0 \cdot 1 + 1 \cdot 1 \end{pmatrix} = \begin{pmatrix} 2\pi & e + \pi \\ 2 & 1 \end{pmatrix} \]

b)

\[ \begin{pmatrix} 0 & 1 \\ 2 & 1 \end{pmatrix} \cdot \begin{pmatrix} e & \pi \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2e & 2\pi + 1 \end{pmatrix} \]

Together with a), this shows that matrix multiplication is not commutative.

c)

\[ \begin{pmatrix} e & \pi & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = e + 2\pi + 3 \]

d)

\[ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} e \\ \pi \\ 1 \end{pmatrix} = \begin{pmatrix} e + 2\pi + 3 \\ e \\ \pi \end{pmatrix} \]

Theorem 2.5. Let A and B be l-by-m matrices and let C and D be m-by-n matrices. Then

\[ A \cdot (C + D) = A \cdot C + A \cdot D \quad \text{and} \quad (A + B) \cdot C = A \cdot C + B \cdot C. \]

It follows that

\[ (A + B) \cdot (C + D) = A \cdot C + A \cdot D + B \cdot C + B \cdot D. \]

3. Derivatives

Definition 3.1. Let f : R^n → R^m be a function and let f_1, ..., f_m : R^n → R be its component functions (coordinate functions). We assume that all the partial derivatives ∂f_i/∂x_j, 1 ≤ i ≤ m, 1 ≤ j ≤ n, exist and are continuous, at least on some open region U ⊆ R^n. Then for each (a_1, ..., a_n) ∈ U we define the derivative of f at (a_1, ..., a_n) to be the m-by-n matrix

\[ Df(a_1, \dots, a_n) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(a_1, \dots, a_n) & \frac{\partial f_1}{\partial x_2}(a_1, \dots, a_n) & \dots & \frac{\partial f_1}{\partial x_n}(a_1, \dots, a_n) \\ \frac{\partial f_2}{\partial x_1}(a_1, \dots, a_n) & \frac{\partial f_2}{\partial x_2}(a_1, \dots, a_n) & \dots & \frac{\partial f_2}{\partial x_n}(a_1, \dots, a_n) \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(a_1, \dots, a_n) & \frac{\partial f_m}{\partial x_2}(a_1, \dots, a_n) & \dots & \frac{\partial f_m}{\partial x_n}(a_1, \dots, a_n) \end{pmatrix} \]
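Each entry of this matrix is an ordinary partial derivative, so Df can be approximated numerically by central difference quotients. The Python sketch below is an added illustration (the helper name jacobian and the step size h are arbitrary choices, not from the notes); it recovers the derivative of the linear map f(x, y) = (x + 2y, y), which also appears in Example 3.2 c) below.

    # Numerical approximation of the m-by-n matrix Df(a) of Definition 3.1,
    # using a central difference quotient for each partial derivative df_i/dx_j.
    def jacobian(f, a, h=1e-6):
        n, m = len(a), len(f(a))
        J = [[0.0] * n for _ in range(m)]
        for j in range(n):
            ap, am = list(a), list(a)
            ap[j] += h
            am[j] -= h
            fp, fm = f(ap), f(am)
            for i in range(m):
                J[i][j] = (fp[i] - fm[i]) / (2 * h)
        return J

    # For f(x, y) = (x + 2y, y) this returns (approximately) [[1, 2], [0, 1]] at every point.
    f = lambda v: (v[0] + 2 * v[1], v[1])
    print(jacobian(f, [3.0, -1.0]))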

Example 3.2. a) Let f : R → R^2 be defined by f(t) = (cos t, sin t). Then for each a ∈ R we have

\[ Df(a) = \begin{pmatrix} \frac{d}{dt} \cos t \,\big|_{t=a} \\ \frac{d}{dt} \sin t \,\big|_{t=a} \end{pmatrix} = \begin{pmatrix} -\sin a \\ \cos a \end{pmatrix}. \]

Note that this differs in notation from the previously defined f'(a) = (-sin a, cos a). For the chain rule that we will discuss below, it is however important to pay attention to the fact that Df(a) is a 2-by-1 matrix, i.e., a vector written vertically, as opposed to a 1-by-2 matrix, i.e., a vector written horizontally.

b) Let f : R^3 → R be defined by f(x, y, z) = x^2 + y^2 + z^2. Then for all (a, b, c) ∈ R^3,

\[ Df(a, b, c) = \left( \frac{\partial f}{\partial x}(a, b, c), \; \frac{\partial f}{\partial y}(a, b, c), \; \frac{\partial f}{\partial z}(a, b, c) \right) = (2a, 2b, 2c). \]

c) Let f : R^2 → R^2 be defined by

\[ f(x, y) = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + 2y \\ y \end{pmatrix}. \]

The component functions are f_1(x, y) = x + 2y and f_2(x, y) = y. Now for all (a, b) ∈ R^2,

\[ Df(a, b) = \begin{pmatrix} \frac{\partial f_1}{\partial x}(a, b) & \frac{\partial f_1}{\partial y}(a, b) \\ \frac{\partial f_2}{\partial x}(a, b) & \frac{\partial f_2}{\partial y}(a, b) \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}. \]

What do you observe?

d) In the previous examples we considered functions of the form f(x_1, ..., x_n) and computed the derivative at a point (a_1, ..., a_n). This was to point out the distinction between the variables x_1, ..., x_n, with respect to which we take partial derivatives, and the points at which we compute the derivative. In the future we will not be as careful; see the following example.

Let f : R^3 → R^3 be defined by f(r, φ, z) = (r cos φ, r sin φ, z). Note that f computes from cylindrical coordinates the Cartesian coordinates of a point. We have

\[ Df(r, \varphi, z) = \begin{pmatrix} \cos\varphi & -r\sin\varphi & 0 \\ \sin\varphi & r\cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Theorem 3.3. Let f, g : R^n → R^m be functions and assume that all the relevant derivatives exist. Then the following hold:

(1) If f is constant, then Df = 0, where 0 denotes the m-by-n matrix whose entries are all the real number 0.
(2) D(f + g) = Df + Dg, where the first + denotes the sum of two functions and the second + denotes the sum of two matrices.
(3) If f is of the form

\[ f(x_1, \dots, x_n) = A \cdot \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]

for some m-by-n matrix A, then for all (x_1, ..., x_n) ∈ R^n, Df(x_1, ..., x_n) = A. See Example 3.2 c).

4. The chain rule in higher dimensions

Definition 4.1. Let f : R^n → R^m and g : R^m → R^l be functions. Then their composition g ∘ f : R^n → R^l is defined by

\[ (g \circ f)(x_1, \dots, x_n) = g(f(x_1, \dots, x_n)). \]

Note that this is a reasonable definition because the range of f is contained in R^m and g is defined on R^m.

Example 4.2. a) h(t) = sin^2 t is the composition g ∘ f of the functions g(x) = x^2 and f(t) = sin t. Note that (f ∘ g)(x) = sin(x^2).

b) Let f(t) = (cos t, sin t) and let g(x, y) = x^2 + y^2. Then for all t ∈ R,

\[ (g \circ f)(t) = \cos^2 t + \sin^2 t = 1. \]
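In code, composition as in Definition 4.1 is just applying one function to the output of the other. The short Python sketch below (an added illustration, not from the original notes) builds g ∘ f for Example 4.2 b) and evaluates it at a few points; up to rounding it always returns 1.

    import math

    # Example 4.2 b): f(t) = (cos t, sin t) and g(x, y) = x^2 + y^2.
    f = lambda t: (math.cos(t), math.sin(t))
    g = lambda x, y: x**2 + y**2

    # The composition (g o f)(t) = g(f(t)) of Definition 4.1.
    g_after_f = lambda t: g(*f(t))

    for t in (0.0, 0.5, 1.3, 2.7):
        print(g_after_f(t))  # always 1.0 up to rounding, since cos^2 t + sin^2 t = 1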

There is a close connection between matrix multiplication and composition of functions. We have:

Theorem 4.3. If f : R^l → R^m and g : R^m → R^n are functions such that there is an m-by-l matrix A and an n-by-m matrix B with

\[ f(x_1, \dots, x_l) = A \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_l \end{pmatrix} \quad \text{and} \quad g(y_1, \dots, y_m) = B \cdot \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}, \]

then

\[ (g \circ f)(x_1, \dots, x_l) = (B \cdot A) \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_l \end{pmatrix}. \]

This theorem is just a special case of the fact that matrix multiplication satisfies the associative law: if A, B and C are matrices of suitable formats, then A · (B · C) = (A · B) · C. More precisely, if f, g, A and B are as in the theorem above, then

\[ (g \circ f)(x_1, \dots, x_l) = g(f(x_1, \dots, x_l)) = B \cdot \left( A \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_l \end{pmatrix} \right) = (B \cdot A) \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_l \end{pmatrix}. \]

Theorem 4.4 (Chain Rule). Let f : R^n → R^m and g : R^m → R^l and assume that all the relevant partial derivatives exist and are continuous. Then for all (a_1, ..., a_n) ∈ R^n,

\[ D(g \circ f)(a_1, \dots, a_n) = Dg(f(a_1, \dots, a_n)) \cdot Df(a_1, \dots, a_n). \]

Note that for functions from R to R this is just the usual 1-dimensional chain rule.

Example 4.5. a) Let f and g be as in Example 4.2 b). Since g ∘ f is constant,

\[ D(g \circ f)(t) = (g \circ f)'(t) = 0. \]

On the other hand,

\[ D(g \circ f)(t) = Dg(f(t)) \cdot Df(t) = Dg(\cos t, \sin t) \cdot Df(t) = (2\cos t, \; 2\sin t) \cdot \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix} = -2\cos t \sin t + 2\sin t \cos t = 0. \]

b) Let f(x, y, z) = (x^2 + y - z, x - y^2 + 3z) and g(u, v) = (u + v, u - v). Then

\[ Dg(u, v) = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]

and therefore

\[ D(g \circ f)(x, y, z) = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \cdot \begin{pmatrix} 2x & 1 & -1 \\ 1 & -2y & 3 \end{pmatrix} = \begin{pmatrix} 2x + 1 & 1 - 2y & 2 \\ 2x - 1 & 1 + 2y & -4 \end{pmatrix}. \]
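Example 4.5 b) can be checked numerically: the matrix product above should agree with a difference-quotient approximation of D(g ∘ f) at any point. The following Python sketch is an added illustration and assumes NumPy is available; the test point and the step size h are arbitrary choices.

    import numpy as np

    # Example 4.5 b): f(x, y, z) = (x^2 + y - z, x - y^2 + 3z), g(u, v) = (u + v, u - v).
    f = lambda p: np.array([p[0]**2 + p[1] - p[2], p[0] - p[1]**2 + 3 * p[2]])
    g = lambda q: np.array([q[0] + q[1], q[0] - q[1]])

    a = np.array([1.0, 2.0, -1.0])                # an arbitrary test point (x, y, z)
    Dg = np.array([[1.0, 1.0], [1.0, -1.0]])      # Dg(u, v)
    Df = np.array([[2 * a[0], 1.0, -1.0],         # Df(x, y, z) evaluated at the test point
                   [1.0, -2 * a[1], 3.0]])
    chain = Dg @ Df                               # Dg(f(a)) . Df(a), as in Theorem 4.4

    # Central-difference approximation of D(g o f)(a), column by column.
    h = 1e-6
    numeric = np.zeros((2, 3))
    for j in range(3):
        e = np.zeros(3)
        e[j] = h
        numeric[:, j] = (g(f(a + e)) - g(f(a - e))) / (2 * h)

    print(np.allclose(chain, numeric, atol=1e-4))  # True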

Definition 4.6. If f : R^n → R, then

\[ \left( \frac{\partial f}{\partial x_1}(x_1, \dots, x_n), \; \dots, \; \frac{\partial f}{\partial x_n}(x_1, \dots, x_n) \right) \]

is called the gradient of f at (x_1, ..., x_n) and is denoted by ∇f(x_1, ..., x_n). Note that ∇f(x_1, ..., x_n) is the vector with the same entries as the 1-by-n matrix Df(x_1, ..., x_n).

Example 4.7. Let f(x, y, z) = x^2 + 2y - z^3. Then ∇f(x, y, z) = (2x, 2, -3z^2).

Corollary 4.8. If f : R → R^m and g : R^m → R, then the chain rule reduces to

\[ D(g \circ f)(t) = (g \circ f)'(t) = \nabla g(f(t)) \cdot f'(t) = \frac{\partial g}{\partial x_1}(f(t)) \cdot f_1'(t) + \dots + \frac{\partial g}{\partial x_m}(f(t)) \cdot f_m'(t), \]

where f_1, ..., f_m are the component functions of f. See Example 4.5 a).

Example 4.9. A typical application of this corollary is the following: f : R → R^3 describes the position of an airplane at time t, for instance f(t) = (100 cos t, 100 sin t, t). The function g : R^3 → R describes the temperature at a point (x, y, z), for instance g(x, y, z) = 70 - z. The gradient of g at (x, y, z) is ∇g(x, y, z) = (0, 0, -1). The derivative of f at t is f'(t) = (-100 sin t, 100 cos t, 1). Now

\[ D(g \circ f)(t) = (g \circ f)'(t) = \nabla g(100\cos t, 100\sin t, t) \cdot f'(t) = (0, 0, -1) \cdot (-100\sin t, 100\cos t, 1) = -1. \]

The reason this is so simple in this particular case is that g(x, y, z) only depends on z. It is actually easier to compute the derivative of the composition by first computing the composition and then its derivative: we have (g ∘ f)(t) = 70 - t.

If g is more complicated, the chain rule actually helps. Suppose now that g(x, y, z) = 70 + x^2/200 - z. Then ∇g(x, y, z) = (x/100, 0, -1). Hence

\[ D(g \circ f)(t) = (g \circ f)'(t) = \nabla g(100\cos t, 100\sin t, t) \cdot f'(t) = (\cos t, 0, -1) \cdot (-100\sin t, 100\cos t, 1) = -100\cos t \sin t - 1. \]
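The last computation, too, can be confirmed numerically. The sketch below (an added illustration; the test value of t is arbitrary) compares the chain-rule value -100 cos t sin t - 1 with a difference quotient of the composition g ∘ f.

    import math

    # Example 4.9 with g(x, y, z) = 70 + x^2/200 - z and f(t) = (100 cos t, 100 sin t, t).
    f = lambda t: (100 * math.cos(t), 100 * math.sin(t), t)
    g = lambda x, y, z: 70 + x**2 / 200 - z

    t = 0.7                                               # an arbitrary test value
    chain_rule = -100 * math.cos(t) * math.sin(t) - 1     # value from the chain rule above

    h = 1e-6                                              # central difference quotient of g o f
    numeric = (g(*f(t + h)) - g(*f(t - h))) / (2 * h)

    print(chain_rule, numeric)                            # the two values agree up to rounding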