Appendix F: Matrix Calculus

TABLE OF CONTENTS

F.1 Introduction
F.2 The Derivatives of Vector Functions
    F.2.1 Derivative of Vector with Respect to Vector
    F.2.2 Derivative of a Scalar with Respect to Vector
    F.2.3 Derivative of Vector with Respect to Scalar
    F.2.4 Jacobian of a Variable Transformation
F.3 The Chain Rule for Vector Functions
F.4 The Derivative of Scalar Functions of a Matrix
    F.4.1 Functions of a Matrix Determinant
F.5 The Matrix Differential

F.1 Introduction

In this Appendix we collect some useful formulas of matrix calculus that often appear in finite element derivations.

F.2 The Derivatives of Vector Functions

Let x and y be vectors of orders n and m, respectively:

$$
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}, \tag{F.1}
$$

where each component y_i may be a function of all the x_j, a fact represented by saying that y is a function of x, or

$$
\mathbf{y} = \mathbf{y}(\mathbf{x}). \tag{F.2}
$$

If n = 1, x reduces to a scalar, which we call x. If m = 1, y reduces to a scalar, which we call y. Various applications are studied in the following subsections.

F.2.1 Derivative of Vector with Respect to Vector

The derivative of the vector y with respect to vector x is the n × m matrix

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} \overset{\text{def}}{=}
\begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\
\frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_2} \\
\vdots & \vdots & & \vdots \\
\frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n}
\end{bmatrix} \tag{F.3}
$$

F.2.2 Derivative of a Scalar with Respect to Vector

If y is a scalar,

$$
\frac{\partial y}{\partial \mathbf{x}} \overset{\text{def}}{=}
\begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \frac{\partial y}{\partial x_2} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{bmatrix} \tag{F.4}
$$

F.2.3 Derivative of Vector with Respect to Scalar

If x is a scalar,

$$
\frac{\partial \mathbf{y}}{\partial x} \overset{\text{def}}{=}
\begin{bmatrix} \frac{\partial y_1}{\partial x} & \frac{\partial y_2}{\partial x} & \cdots & \frac{\partial y_m}{\partial x} \end{bmatrix} \tag{F.5}
$$
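The layout convention of (F.3), with rows following the denominator vector x, can be checked numerically. The following is a minimal sketch assuming NumPy is available; the helper name `dydx` is ours, not part of the Appendix:

```python
import numpy as np

def dydx(f, x, h=1e-6):
    """Central-difference approximation of (F.3): for y = f(x) with x of
    order n and y of order m, returns the n x m matrix whose (i, j) entry
    is dy_j / dx_i (rows follow the denominator vector x)."""
    x = np.asarray(x, dtype=float)
    m = np.atleast_1d(f(x)).size
    D = np.zeros((x.size, m))
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = h
        D[i] = (np.atleast_1d(f(x + e)) - np.atleast_1d(f(x - e))) / (2 * h)
    return D

# Under this convention the derivative of the linear map y = A x is A^T.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
D = dydx(lambda v: A @ v, np.array([0.3, -0.7, 1.1]))
assert np.allclose(D, A.T, atol=1e-5)
```

For a scalar-valued f this returns the n × 1 column of (F.4), and for scalar x the 1 × m row of (F.5) falls out of the same formula.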

Remark F.1. Many authors, notably in statistics and economics, define the derivatives as the transposes of those given above.¹ This has the advantage of better agreement of matrix products with composition schemes such as the chain rule. Evidently the notation is not yet stable.

Example F.1. Given

$$
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \qquad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \tag{F.6}
$$

and

$$
y_1 = x_1^2 - x_2, \qquad y_2 = x_3^2 + 3 x_2, \tag{F.7}
$$

the partial derivative matrix ∂y/∂x is computed as follows:

$$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} =
\begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} \\
\frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} \\
\frac{\partial y_1}{\partial x_3} & \frac{\partial y_2}{\partial x_3}
\end{bmatrix}
=
\begin{bmatrix} 2x_1 & 0 \\ -1 & 3 \\ 0 & 2x_3 \end{bmatrix} \tag{F.8}
$$

F.2.4 Jacobian of a Variable Transformation

In multivariate analysis, if x and y are of the same order, the determinant of the square matrix ∂y/∂x, that is

$$
J = \left| \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \right|, \tag{F.9}
$$

is called the Jacobian of the transformation determined by y = y(x). The inverse determinant is

$$
J^{-1} = \left| \frac{\partial \mathbf{x}}{\partial \mathbf{y}} \right|. \tag{F.10}
$$

Example F.2. The transformation from spherical to Cartesian coordinates is defined by

$$
x = r \sin\theta \cos\psi, \qquad y = r \sin\theta \sin\psi, \qquad z = r \cos\theta, \tag{F.11}
$$

where r > 0, 0 < θ < π and 0 ≤ ψ < 2π. To obtain the Jacobian of the transformation, let

$$
x \equiv x_1, \; y \equiv x_2, \; z \equiv x_3, \qquad
r \equiv y_1, \; \theta \equiv y_2, \; \psi \equiv y_3. \tag{F.12}
$$

Then

$$
J = \left| \frac{\partial \mathbf{x}}{\partial \mathbf{y}} \right| =
\begin{vmatrix}
\sin y_2 \cos y_3 & \sin y_2 \sin y_3 & \cos y_2 \\
y_1 \cos y_2 \cos y_3 & y_1 \cos y_2 \sin y_3 & -y_1 \sin y_2 \\
-y_1 \sin y_2 \sin y_3 & y_1 \sin y_2 \cos y_3 & 0
\end{vmatrix}
= y_1^2 \sin y_2 = r^2 \sin\theta. \tag{F.13}
$$

The foregoing definitions can be used to obtain derivatives of many frequently used expressions, including quadratic and bilinear forms.

¹ One author puts it this way: "When one does matrix calculus, one quickly finds that there are two kinds of people in this world: those who think the gradient is a row vector, and those who think it is a column vector."
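The closed form in (F.13) can be confirmed by evaluating the determinant directly. A small sketch, assuming NumPy; the function name `spherical_jacobian` is ours:

```python
import numpy as np

def spherical_jacobian(r, theta, psi):
    """Determinant of the matrix in (F.13): rows hold the derivatives of
    (x, y, z) with respect to r, theta, psi, under the convention (F.3)."""
    J = np.array([
        [np.sin(theta) * np.cos(psi),      np.sin(theta) * np.sin(psi),      np.cos(theta)],
        [r * np.cos(theta) * np.cos(psi),  r * np.cos(theta) * np.sin(psi),  -r * np.sin(theta)],
        [-r * np.sin(theta) * np.sin(psi), r * np.sin(theta) * np.cos(psi),  0.0],
    ])
    return np.linalg.det(J)

# Agrees with the closed form J = r^2 sin(theta) of (F.13).
r, theta, psi = 2.0, 0.7, 1.3
assert abs(spherical_jacobian(r, theta, psi) - r**2 * np.sin(theta)) < 1e-12
```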

Example F.3. Consider the quadratic form

$$
y = \mathbf{x}^T \mathbf{A} \mathbf{x}, \tag{F.14}
$$

where A is a square matrix of order n. Using the definition (F.3) one obtains

$$
\frac{\partial y}{\partial \mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{A}^T \mathbf{x}, \tag{F.15}
$$

and if A is symmetric,

$$
\frac{\partial y}{\partial \mathbf{x}} = 2\mathbf{A}\mathbf{x}. \tag{F.16}
$$

We can of course continue the differentiation process:

$$
\frac{\partial^2 y}{\partial \mathbf{x}^2} = \frac{\partial}{\partial \mathbf{x}} \left( \frac{\partial y}{\partial \mathbf{x}} \right) = \mathbf{A} + \mathbf{A}^T, \tag{F.17}
$$

and if A is symmetric,

$$
\frac{\partial^2 y}{\partial \mathbf{x}^2} = 2\mathbf{A}. \tag{F.18}
$$

The following table collects several useful vector derivative formulas:

    y         ∂y/∂x
    Ax        Aᵀ
    xᵀA       A
    xᵀx       2x
    xᵀAx      Ax + Aᵀx

F.3 The Chain Rule for Vector Functions

Let

$$
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_r \end{bmatrix}, \qquad
\mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_m \end{bmatrix}, \tag{F.19}
$$

where z is a function of y, which is in turn a function of x. Using the definition (F.3), we can write

$$
\left( \frac{\partial \mathbf{z}}{\partial \mathbf{x}} \right)^T =
\begin{bmatrix}
\frac{\partial z_1}{\partial x_1} & \frac{\partial z_1}{\partial x_2} & \cdots & \frac{\partial z_1}{\partial x_n} \\
\frac{\partial z_2}{\partial x_1} & \frac{\partial z_2}{\partial x_2} & \cdots & \frac{\partial z_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial z_m}{\partial x_1} & \frac{\partial z_m}{\partial x_2} & \cdots & \frac{\partial z_m}{\partial x_n}
\end{bmatrix} \tag{F.20}
$$

Each entry of this matrix may be expanded as

$$
\frac{\partial z_i}{\partial x_j} = \sum_{q=1}^{r} \frac{\partial z_i}{\partial y_q} \frac{\partial y_q}{\partial x_j},
\qquad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n. \tag{F.21}
$$
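Both (F.15) and the vector chain rule that follows from (F.21) can be verified by finite differences. A sketch assuming NumPy; the helper `num_grad` and the test functions are ours:

```python
import numpy as np

def num_grad(f, x, h=1e-6):
    """Central-difference derivative under the convention of (F.3):
    the result has one row per component of x, one column per component of f."""
    x = np.asarray(x, dtype=float)
    m = np.atleast_1d(f(x)).size
    D = np.zeros((x.size, m))
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = h
        D[i] = (np.atleast_1d(f(x + e)) - np.atleast_1d(f(x - e))) / (2 * h)
    return D

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])        # deliberately non-symmetric
x0 = np.array([0.4, -1.1])

# (F.15): the gradient of the quadratic form x^T A x is A x + A^T x.
g = num_grad(lambda x: x @ A @ x, x0)[:, 0]
assert np.allclose(g, A @ x0 + A.T @ x0, atol=1e-4)

# Chain rule: with y = y(x) and z = z(y), dz/dx = (dy/dx)(dz/dy),
# i.e. the chain of matrices builds toward the left.
y = lambda x: np.array([x[0]**2, x[0] * x[1], np.sin(x[1])])
z = lambda y: np.array([y[0] + y[2], y[1]**2])
dzdx = num_grad(lambda x: z(y(x)), x0)
assert np.allclose(dzdx, num_grad(y, x0) @ num_grad(z, y(x0)), atol=1e-4)
```

Note that reversing the product order would make the shapes incompatible whenever x, y, z have different orders, which is one way to remember the left-building rule.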

Then

$$
\left( \frac{\partial \mathbf{z}}{\partial \mathbf{x}} \right)^T =
\begin{bmatrix}
\sum_q \frac{\partial z_1}{\partial y_q} \frac{\partial y_q}{\partial x_1} &
\sum_q \frac{\partial z_1}{\partial y_q} \frac{\partial y_q}{\partial x_2} & \cdots &
\sum_q \frac{\partial z_1}{\partial y_q} \frac{\partial y_q}{\partial x_n} \\
\sum_q \frac{\partial z_2}{\partial y_q} \frac{\partial y_q}{\partial x_1} &
\sum_q \frac{\partial z_2}{\partial y_q} \frac{\partial y_q}{\partial x_2} & \cdots &
\sum_q \frac{\partial z_2}{\partial y_q} \frac{\partial y_q}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\sum_q \frac{\partial z_m}{\partial y_q} \frac{\partial y_q}{\partial x_1} &
\sum_q \frac{\partial z_m}{\partial y_q} \frac{\partial y_q}{\partial x_2} & \cdots &
\sum_q \frac{\partial z_m}{\partial y_q} \frac{\partial y_q}{\partial x_n}
\end{bmatrix}
= \left( \frac{\partial \mathbf{z}}{\partial \mathbf{y}} \right)^T \left( \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \right)^T. \tag{F.22}
$$

On transposing both sides, we finally obtain

$$
\frac{\partial \mathbf{z}}{\partial \mathbf{x}} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \, \frac{\partial \mathbf{z}}{\partial \mathbf{y}}, \tag{F.23}
$$

which is the chain rule for vectors. If all vectors reduce to scalars,

$$
\frac{\partial z}{\partial x} = \frac{\partial y}{\partial x} \frac{\partial z}{\partial y} = \frac{\partial z}{\partial y} \frac{\partial y}{\partial x}, \tag{F.24}
$$

which is the conventional chain rule of calculus. Note, however, that when we are dealing with vectors, the chain of matrices builds toward the left. For example, if w is a function of z, which is a function of y, which is a function of x,

$$
\frac{\partial \mathbf{w}}{\partial \mathbf{x}} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \, \frac{\partial \mathbf{z}}{\partial \mathbf{y}} \, \frac{\partial \mathbf{w}}{\partial \mathbf{z}}. \tag{F.25}
$$

On the other hand, in the ordinary chain rule one can indistinctly build the product to the right or to the left because scalar multiplication is commutative.

F.4 The Derivative of Scalar Functions of a Matrix

Let X = (x_ij) be a matrix of order (m × n) and let

$$
y = f(\mathbf{X}) \tag{F.26}
$$

be a scalar function of X. The derivative of y with respect to X, denoted by

$$
\frac{\partial y}{\partial \mathbf{X}}, \tag{F.27}
$$
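The gradient matrix of a scalar function of a matrix can also be approximated entry by entry, perturbing one x_ij at a time. A sketch assuming NumPy; `matrix_grad` is our name:

```python
import numpy as np

def matrix_grad(f, X, h=1e-6):
    """Numerical gradient matrix G of a scalar function y = f(X):
    G has the same shape as X, with G[i, j] = dy / dx_ij
    (central differences; the perturbation is h times the elementary
    matrix E_ij, all zeros except a one in entry (i, j))."""
    X = np.asarray(X, dtype=float)
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

# Gradient of the trace is the identity matrix, since only the
# diagonal entries of X appear in tr(X).
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
assert np.allclose(matrix_grad(np.trace, X), np.eye(2), atol=1e-6)
```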

is defined as the following matrix of order (m × n):

$$
\mathbf{G} = \frac{\partial y}{\partial \mathbf{X}} =
\begin{bmatrix}
\frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{12}} & \cdots & \frac{\partial y}{\partial x_{1n}} \\
\frac{\partial y}{\partial x_{21}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{2n}} \\
\vdots & \vdots & & \vdots \\
\frac{\partial y}{\partial x_{m1}} & \frac{\partial y}{\partial x_{m2}} & \cdots & \frac{\partial y}{\partial x_{mn}}
\end{bmatrix}
= \left[ \frac{\partial y}{\partial x_{ij}} \right]
= \sum_{i,j} \mathbf{E}_{ij} \, \frac{\partial y}{\partial x_{ij}}, \tag{F.28}
$$

where E_ij denotes the elementary matrix* of order (m × n). This matrix G is also known as a gradient matrix.

Example F.4. Find the gradient matrix if y is the trace of a square matrix X of order n, that is

$$
y = \operatorname{tr}(\mathbf{X}) = \sum_{i=1}^{n} x_{ii}. \tag{F.29}
$$

Obviously all non-diagonal partials vanish whereas the diagonal partials equal one, thus

$$
\mathbf{G} = \frac{\partial y}{\partial \mathbf{X}} = \mathbf{I}, \tag{F.30}
$$

where I denotes the identity matrix of order n.

F.4.1 Functions of a Matrix Determinant

An important family of derivatives with respect to a matrix involves functions of the determinant of a matrix, for example y = |X| or y = |AX|. Suppose that we have a matrix Y = [y_ij] whose components are functions of a matrix X = [x_rs], that is y_ij = f_ij(x_rs), and set out to build the matrix

$$
\frac{\partial |\mathbf{Y}|}{\partial \mathbf{X}}. \tag{F.31}
$$

Using the chain rule we can write

$$
\frac{\partial |\mathbf{Y}|}{\partial x_{rs}} = \sum_i \sum_j \frac{\partial |\mathbf{Y}|}{\partial y_{ij}} \frac{\partial y_{ij}}{\partial x_{rs}}. \tag{F.32}
$$

But

$$
|\mathbf{Y}| = \sum_j y_{ij} Y_{ij}, \tag{F.33}
$$

where Y_ij is the cofactor of the element y_ij in |Y|. Since the cofactors Y_i1, Y_i2, ... are independent of the element y_ij, we have

$$
\frac{\partial |\mathbf{Y}|}{\partial y_{ij}} = Y_{ij}. \tag{F.34}
$$

It follows that

$$
\frac{\partial |\mathbf{Y}|}{\partial x_{rs}} = \sum_i \sum_j Y_{ij} \frac{\partial y_{ij}}{\partial x_{rs}}. \tag{F.35}
$$

* The elementary matrix E_ij of order m × n has all zero entries except for the (i, j) entry, which is one.
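The cofactor property (F.34) is easy to confirm numerically: a finite-difference partial of the determinant with respect to each entry should reproduce the corresponding cofactor. A sketch assuming NumPy; `cofactor_matrix` is our helper, built directly from the definition of a cofactor:

```python
import numpy as np

def cofactor_matrix(X):
    """C[i, j] = (-1)^(i+j) times the minor of X obtained by deleting
    row i and column j; C[i, j] is the cofactor of entry x_ij."""
    n = X.shape[0]
    C = np.zeros_like(X)
    for i in range(n):
        for j in range(n):
            M = np.delete(np.delete(X, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(M)
    return C

X = np.array([[2.0, -1.0, 0.5],
              [1.0,  3.0, 2.0],
              [0.0,  1.0, 4.0]])
C = cofactor_matrix(X)
h = 1e-6
# (F.34): the partial of |X| with respect to x_ij is the cofactor of x_ij.
for i in range(3):
    for j in range(3):
        E = np.zeros_like(X)
        E[i, j] = h
        fd = (np.linalg.det(X + E) - np.linalg.det(X - E)) / (2 * h)
        assert abs(fd - C[i, j]) < 1e-6

# Standard identity: the cofactor matrix equals |X| times the inverse transpose.
assert np.allclose(C, np.linalg.det(X) * np.linalg.inv(X).T, atol=1e-9)
```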

There is an alternative form of this result which is occasionally useful. Define

$$
a_{ij} = Y_{ij}, \quad \mathbf{A} = [a_{ij}]; \qquad
b_{ij} = \frac{\partial y_{ij}}{\partial x_{rs}}, \quad \mathbf{B} = [b_{ij}]. \tag{F.36}
$$

Then it can be shown that

$$
\frac{\partial |\mathbf{Y}|}{\partial x_{rs}} = \operatorname{tr}(\mathbf{A}\mathbf{B}^T) = \operatorname{tr}(\mathbf{B}^T \mathbf{A}). \tag{F.37}
$$

Example F.5. If X is a nonsingular square matrix and Z = |X| X⁻¹ denotes the transpose of its cofactor matrix, then

$$
\mathbf{G} = \frac{\partial |\mathbf{X}|}{\partial \mathbf{X}} = \mathbf{Z}^T. \tag{F.38}
$$

If X is also symmetric,

$$
\mathbf{G} = \frac{\partial |\mathbf{X}|}{\partial \mathbf{X}} = 2\mathbf{Z}^T - \operatorname{diag}(\mathbf{Z}^T). \tag{F.39}
$$

F.5 The Matrix Differential

For a scalar function f(x), where x is an n-vector, the ordinary differential of multivariate calculus is defined as

$$
df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \, dx_i. \tag{F.40}
$$

In harmony with this formula, we define the differential of an m × n matrix X = [x_ij] to be

$$
d\mathbf{X} \overset{\text{def}}{=}
\begin{bmatrix}
dx_{11} & dx_{12} & \cdots & dx_{1n} \\
dx_{21} & dx_{22} & \cdots & dx_{2n} \\
\vdots & \vdots & & \vdots \\
dx_{m1} & dx_{m2} & \cdots & dx_{mn}
\end{bmatrix}. \tag{F.41}
$$

This definition complies with the scalar-multiple and sum rules

$$
d(\alpha \mathbf{X}) = \alpha \, d\mathbf{X}, \qquad d(\mathbf{X} + \mathbf{Y}) = d\mathbf{X} + d\mathbf{Y}. \tag{F.42}
$$

If X and Y are product-conforming matrices, it can be verified that the differential of their product is

$$
d(\mathbf{X}\mathbf{Y}) = (d\mathbf{X})\mathbf{Y} + \mathbf{X}(d\mathbf{Y}), \tag{F.43}
$$

which is an extension of the well-known rule d(xy) = y dx + x dy for scalar functions.

Example F.6. If X = [x_ij] is a square nonsingular matrix of order n, denote Z = |X| X⁻¹. Find the differential of the determinant of X:

$$
d|\mathbf{X}| = \sum_{i,j} \frac{\partial |\mathbf{X}|}{\partial x_{ij}} \, dx_{ij}
= \sum_{i,j} X_{ij} \, dx_{ij}
= \operatorname{tr}\bigl( |\mathbf{X}| \, \mathbf{X}^{-1} \, d\mathbf{X} \bigr)
= \operatorname{tr}(\mathbf{Z} \, d\mathbf{X}), \tag{F.44}
$$

where X_ij denotes the cofactor of x_ij in |X|. (The cofactor matrix [X_ij] is Zᵀ, and the sum Σ_ij (Zᵀ)_ij dx_ij is precisely tr(Z dX).)
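Formula (F.44) states that to first order the change in the determinant under a small perturbation dX is tr(Z dX). The following numerical sketch (assuming NumPy; the matrices are arbitrary test data) compares the exact change against that prediction:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)   # comfortably nonsingular
dX = 1e-7 * rng.standard_normal((4, 4))             # small perturbation
Z = np.linalg.det(X) * np.linalg.inv(X)             # Z = |X| X^{-1}

exact_change = np.linalg.det(X + dX) - np.linalg.det(X)
predicted = np.trace(Z @ dX)                        # first-order term of (F.44)

# The discrepancy is second order in dX, far below the change itself.
assert abs(exact_change - predicted) < 1e-9 * abs(np.linalg.det(X))
```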

Example F.7. With the same assumptions as above, find d(X⁻¹). The quickest derivation follows by differentiating both sides of the identity X⁻¹ X = I:

$$
d(\mathbf{X}^{-1}) \, \mathbf{X} + \mathbf{X}^{-1} \, d\mathbf{X} = \mathbf{0}, \tag{F.45}
$$

from which

$$
d(\mathbf{X}^{-1}) = -\mathbf{X}^{-1} \, d\mathbf{X} \, \mathbf{X}^{-1}. \tag{F.46}
$$

If X reduces to the scalar x we have

$$
d\left( \frac{1}{x} \right) = -\frac{dx}{x^2}. \tag{F.47}
$$
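As with the determinant, (F.46) can be checked by comparing the exact change of the inverse under a small perturbation against the first-order prediction. A sketch assuming NumPy; the matrices are arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)   # nonsingular test matrix
dX = 1e-7 * rng.standard_normal((3, 3))

exact = np.linalg.inv(X + dX) - np.linalg.inv(X)
predicted = -np.linalg.inv(X) @ dX @ np.linalg.inv(X)   # (F.46)

# Agreement to second order in dX.
assert np.linalg.norm(exact - predicted) < 1e-12
```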