Lecture 1: Intro/refresher in Matrix Algebra


Bruce Walsh lecture notes, SISG Mixed Model Course (version 8 June 0)

Topics
Definitions, dimensionality, addition, subtraction
Matrix multiplication
Inverses, solving systems of equations
Quadratic products and covariances
The multivariate normal distribution
Ordinary least squares
Vector/matrix calculus (taking derivatives)

Matrix/linear algebra
Matrix algebra is a compact way of treating the algebra of systems of linear equations. Most common statistical methods can be written in matrix form.
$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}$ is the general linear model, with OLS solution $\boldsymbol{\beta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$.
$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{a} + \mathbf{e}$ is the general mixed model.

Matrices: An array of elements.
Vectors: A matrix with either one row or one column, usually written in bold lowercase, e.g. a, b, c. A column vector with three elements has dimension 3 x 1; a row vector with four elements has dimension 1 x 4.
Dimensionality of a matrix: r x c (rows x columns); think of Railroad Car.

General Matrices
Usually written in bold uppercase, e.g. A, C, D. A matrix with the same number of rows and columns, such as a 3 x 3 matrix C, is a square matrix; a matrix D with three rows and two columns is 3 x 2.
A matrix is defined by a list of its elements: B has ij-th element $B_{ij}$, the element in row i and column j.

Partitioned Matrices
It will often prove useful to divide (or partition) the elements of a matrix into a matrix whose elements are themselves matrices. For a 3 x 3 matrix C,

$$\mathbf{C} = \begin{pmatrix} a & \mathbf{b} \\ \mathbf{d} & \mathbf{B} \end{pmatrix},$$

where a is 1 x 1, b is a 1 x 2 row vector, d is a 2 x 1 column vector, and B is a 2 x 2 matrix.
One useful partition is to write the matrix as either a row vector whose elements are column vectors or a column vector whose elements are row vectors.

Addition and Subtraction of Matrices
If two matrices have the same dimension (same number of rows and columns), then matrix addition and subtraction simply follow by adding (or subtracting) on an element-by-element basis.
Matrix addition: $(\mathbf{A}+\mathbf{B})_{ij} = A_{ij} + B_{ij}$
Matrix subtraction: $(\mathbf{A}-\mathbf{B})_{ij} = A_{ij} - B_{ij}$

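Towards Matrix Multiplication: dot products
The dot (or inner) product of two vectors (both of length n) is defined as follows:

$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i$$

Example: for $\mathbf{a} = (1, 2, 3, 4)^T$ and $\mathbf{b} = (4\ 5\ 7\ 9)$,
$\mathbf{a} \cdot \mathbf{b} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 7 + 4 \cdot 9 = 71$.

Matrices are compact ways to write systems of equations
The least-squares solution for the linear model $y = \mu + \beta_1 z_1 + \cdots + \beta_n z_n$ yields the following system of equations for the $\beta_i$:

$$\begin{aligned}
\sigma(y, z_1) &= \beta_1 \sigma^2(z_1) + \beta_2 \sigma(z_1, z_2) + \cdots + \beta_n \sigma(z_1, z_n) \\
\sigma(y, z_2) &= \beta_1 \sigma(z_1, z_2) + \beta_2 \sigma^2(z_2) + \cdots + \beta_n \sigma(z_2, z_n) \\
&\ \ \vdots \\
\sigma(y, z_n) &= \beta_1 \sigma(z_1, z_n) + \beta_2 \sigma(z_2, z_n) + \cdots + \beta_n \sigma^2(z_n)
\end{aligned}$$

This can be more compactly written in matrix form as

$$\begin{pmatrix}
\sigma^2(z_1) & \sigma(z_1, z_2) & \cdots & \sigma(z_1, z_n) \\
\sigma(z_1, z_2) & \sigma^2(z_2) & \cdots & \sigma(z_2, z_n) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma(z_1, z_n) & \sigma(z_2, z_n) & \cdots & \sigma^2(z_n)
\end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix}
=
\begin{pmatrix} \sigma(y, z_1) \\ \sigma(y, z_2) \\ \vdots \\ \sigma(y, z_n) \end{pmatrix}$$

The covariance matrix on the left corresponds to $\mathbf{X}^T\mathbf{X}$ and the vector on the right to $\mathbf{X}^T\mathbf{y}$, so that $\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} = \mathbf{X}^T\mathbf{y}$, or $\boldsymbol{\beta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$.

Matrix Multiplication
The order in which matrices are multiplied affects the matrix product; in general $\mathbf{AB} \neq \mathbf{BA}$.
For the product of two matrices to exist, the matrices must conform. For AB, the number of columns of A must equal the number of rows of B. The matrix C = AB has the same number of rows as A and the same number of columns as B:

$$\mathbf{C}_{(r \times c)} = \mathbf{A}_{(r \times k)} \mathbf{B}_{(k \times c)}$$

The ij-th element of C is given by

$$C_{ij} = \sum_{l=1}^{k} A_{il} B_{lj},$$

that is, the elements in the i-th row of matrix A multiplied by the elements in the j-th column of B, and summed.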
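As a minimal sketch in plain Python (no libraries), the dot product and the element rule $C_{ij} = \sum_l A_{il} B_{lj}$ can be written as follows. The helper names `dot` and `matmul` and the two 2 x 2 matrices are illustrative assumptions, not part of the notes.

```python
def dot(a, b):
    """Inner product of two equal-length vectors: sum_i a_i * b_i."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def matmul(A, B):
    """C = AB, where C_ij is the dot product of row i of A with column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    columns_of_B = list(zip(*B))  # transpose B to iterate over its columns
    return [[dot(row, col) for col in columns_of_B] for row in A]


# Dot product of a = (1, 2, 3, 4) and b = (4, 5, 7, 9)
a = [1, 2, 3, 4]
b = [4, 5, 7, 9]
print(dot(a, b))  # 71

# Hypothetical 2 x 2 matrices showing that order matters
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]
```

Here AB swaps the columns of A while BA swaps its rows, so the two products differ.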

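In $\mathbf{C}_{(r \times c)} = \mathbf{A}_{(r \times k)} \mathbf{B}_{(k \times c)}$, the outer indices give the dimensions of the resulting matrix, with r rows (from A) and c columns (from B). The inner indices must match: columns of A = rows of B.

Example: Is the product ABCD defined? If so, what is its dimensionality? Suppose A is 3 x 5, B is 5 x 9, C is 9 x 6, and D is 6 x 3. Yes, the product is defined, as the inner indices match. The result is a 3 x 3 matrix (3 rows, 3 columns).

Example:

$$\mathbf{AB} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}$$

Likewise

$$\mathbf{BA} = \begin{pmatrix} ae + cf & eb + df \\ ga + ch & gb + dh \end{pmatrix}$$

ORDER of multiplication matters! Indeed, consider $\mathbf{C}_{3 \times 5} \mathbf{D}_{5 \times 5}$, which gives a 3 x 5 matrix, versus $\mathbf{D}_{5 \times 5} \mathbf{C}_{3 \times 5}$, which is not defined.

More formally, consider the product L = MN. Express the matrix M (r x c) as a column vector of row vectors,

$$\mathbf{M} = \begin{pmatrix} \mathbf{m}_1 \\ \mathbf{m}_2 \\ \vdots \\ \mathbf{m}_r \end{pmatrix}, \quad \text{where } \mathbf{m}_i = ( M_{i1}\ M_{i2}\ \cdots\ M_{ic} ).$$

Likewise express N (c x b) as a row vector of column vectors,

$$\mathbf{N} = ( \mathbf{n}_1\ \mathbf{n}_2\ \cdots\ \mathbf{n}_b ), \quad \text{where } \mathbf{n}_j = \begin{pmatrix} N_{1j} \\ N_{2j} \\ \vdots \\ N_{cj} \end{pmatrix}.$$

The ij-th element of L is the inner product of M's row i with N's column j:

$$\mathbf{L} = \begin{pmatrix}
\mathbf{m}_1 \cdot \mathbf{n}_1 & \mathbf{m}_1 \cdot \mathbf{n}_2 & \cdots & \mathbf{m}_1 \cdot \mathbf{n}_b \\
\mathbf{m}_2 \cdot \mathbf{n}_1 & \mathbf{m}_2 \cdot \mathbf{n}_2 & \cdots & \mathbf{m}_2 \cdot \mathbf{n}_b \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{m}_r \cdot \mathbf{n}_1 & \mathbf{m}_r \cdot \mathbf{n}_2 & \cdots & \mathbf{m}_r \cdot \mathbf{n}_b
\end{pmatrix}$$

The Transpose of a Matrix
The transpose of a matrix exchanges the rows and columns, $(\mathbf{A}^T)_{ij} = A_{ji}$.
Useful identities:
$(\mathbf{AB})^T = \mathbf{B}^T \mathbf{A}^T$
$(\mathbf{ABC})^T = \mathbf{C}^T \mathbf{B}^T \mathbf{A}^T$

Inner product $= \mathbf{a}^T\mathbf{b} = \mathbf{a}^T_{(1 \times n)} \mathbf{b}_{(n \times 1)}$. The indices match, so the matrices conform, and the dimension of the resulting product is 1 x 1 (i.e. a scalar):

$$\mathbf{a}^T\mathbf{b} = ( a_1\ \cdots\ a_n ) \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \sum_{i=1}^{n} a_i b_i$$

Note that $\mathbf{b}^T\mathbf{a} = (\mathbf{b}^T\mathbf{a})^T = \mathbf{a}^T\mathbf{b}$.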
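The "inner indices must match, outer indices give the result" rule can be sketched as a small shape checker. This is an illustrative helper (the name `product_shape` is an assumption), applied to the ABCD and CD versus DC examples above.

```python
def product_shape(*shapes):
    """Shape (rows, cols) of a chained matrix product, or None if it is undefined.

    Each argument is a (rows, cols) pair; adjacent matrices conform only when
    the columns of the left one equal the rows of the right one.
    """
    rows, cols = shapes[0]
    for next_rows, next_cols in shapes[1:]:
        if cols != next_rows:  # inner indices do not match
            return None
        cols = next_cols       # outer index carried forward
    return (rows, cols)


print(product_shape((3, 5), (5, 9), (9, 6), (6, 3)))  # (3, 3)
print(product_shape((3, 5), (5, 5)))                  # (3, 5)
print(product_shape((5, 5), (3, 5)))                  # None
```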

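Outer product $= \mathbf{a}\mathbf{b}^T = \mathbf{a}_{(n \times 1)} \mathbf{b}^T_{(1 \times n)}$. The resulting product is an n x n matrix.

The Identity Matrix, I
The identity matrix serves the role of the number 1 in matrix multiplication: AI = A, IA = A.
I is a square diagonal matrix, with all diagonal elements being one and all off-diagonal elements zero:

$$I_{ij} = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{otherwise} \end{cases}
\qquad
\mathbf{I}_{3 \times 3} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Solving equations
The identity matrix I serves the same role as 1 in scalar algebra, e.g. $a \cdot 1 = 1 \cdot a = a$, with AI = IA = A.
The inverse matrix $\mathbf{A}^{-1}$ (IF it exists) is defined by $\mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$, $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$, and serves the same role as scalar division.
To solve $ax = c$, multiply both sides by $(1/a)$ to give $(1/a) \cdot ax = (1/a)c$, or $(1/a) \cdot a \cdot x = 1 \cdot x = x$. Hence $x = (1/a)c$.
To solve $\mathbf{Ax} = \mathbf{c}$, write $\mathbf{A}^{-1}\mathbf{Ax} = \mathbf{A}^{-1}\mathbf{c}$, or $\mathbf{A}^{-1}\mathbf{Ax} = \mathbf{Ix} = \mathbf{x} = \mathbf{A}^{-1}\mathbf{c}$.

The Inverse Matrix, $\mathbf{A}^{-1}$
For a square matrix A, define its inverse $\mathbf{A}^{-1}$ as the matrix satisfying $\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$.
For

$$\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad \mathbf{A}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$

If the quantity $ad - bc$ (the determinant) is zero, the inverse does not exist.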
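A minimal sketch of solving $\mathbf{Ax} = \mathbf{c}$ with the 2 x 2 inverse formula, using exact rational arithmetic from the standard library. The function names and the numerical values of A and c are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] as 1/(ad - bc) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: determinant is zero")
    f = Fraction(1) / det
    return [[d * f, -b * f], [-c * f, a * f]]


def matvec(A, x):
    """Matrix-vector product Ax."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical system: 2*x1 + x2 = 3 and x1 + 3*x2 = 5
A = [[2, 1], [1, 3]]
c = [3, 5]
x = matvec(inverse_2x2(A), c)  # x = A^{-1} c
print(x)                       # [Fraction(4, 5), Fraction(7, 5)]
print(matvec(A, x))            # multiplying back recovers c
```

A matrix with $ad - bc = 0$, such as [[1, 2], [2, 4]], raises the error instead of returning an inverse.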

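If det(A) is not zero, $\mathbf{A}^{-1}$ exists and A is said to be non-singular. If det(A) = 0, A is singular, and no unique inverse exists (generalized inverses do).
Generalized inverses, and their uses in solving systems of equations, are discussed in Appendix 3 of Lynch & Walsh. $\mathbf{A}^{-}$ is the typical notation to denote the G-inverse of a matrix. When a G-inverse is used, provided the system is consistent, some of the variables have a family of solutions (e.g., $x_1$ takes a single value, but only the sum $x_2 + x_3 = 6$ is determined).

Example: solve the OLS for $\boldsymbol{\beta}$ in $y = \alpha + \beta_1 z_1 + \beta_2 z_2 + e$.
The solution is $\boldsymbol{\beta} = \mathbf{V}^{-1}\mathbf{c}$, with

$$\boldsymbol{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \qquad
\mathbf{c} = \begin{pmatrix} \sigma(y, z_1) \\ \sigma(y, z_2) \end{pmatrix}, \qquad
\mathbf{V} = \begin{pmatrix} \sigma^2(z_1) & \sigma(z_1, z_2) \\ \sigma(z_1, z_2) & \sigma^2(z_2) \end{pmatrix}.$$

It is more compact to use $\sigma(z_1, z_2) = \rho_{12}\, \sigma(z_1) \sigma(z_2)$, giving

$$\mathbf{V}^{-1} = \frac{1}{\sigma^2(z_1) \sigma^2(z_2) (1 - \rho_{12}^2)}
\begin{pmatrix} \sigma^2(z_2) & -\sigma(z_1, z_2) \\ -\sigma(z_1, z_2) & \sigma^2(z_1) \end{pmatrix}$$

and hence

$$\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \frac{1}{\sigma^2(z_1) \sigma^2(z_2) (1 - \rho_{12}^2)}
\begin{pmatrix} \sigma^2(z_2) & -\sigma(z_1, z_2) \\ -\sigma(z_1, z_2) & \sigma^2(z_1) \end{pmatrix}
\begin{pmatrix} \sigma(y, z_1) \\ \sigma(y, z_2) \end{pmatrix},$$

so that

$$\beta_1 = \frac{1}{1 - \rho_{12}^2} \left[ \frac{\sigma(y, z_1)}{\sigma^2(z_1)} - \rho_{12} \frac{\sigma(y, z_2)}{\sigma(z_1)\sigma(z_2)} \right], \qquad
\beta_2 = \frac{1}{1 - \rho_{12}^2} \left[ \frac{\sigma(y, z_2)}{\sigma^2(z_2)} - \rho_{12} \frac{\sigma(y, z_1)}{\sigma(z_1)\sigma(z_2)} \right].$$

If $\rho_{12} = 0$, these reduce to the two univariate slopes, $\beta_1 = \sigma(y, z_1) / \sigma^2(z_1)$ and $\beta_2 = \sigma(y, z_2) / \sigma^2(z_2)$. Likewise, if $\rho_{12} = 1$, this reduces to a univariate regression.

Useful identities
$(\mathbf{A}^T)^{-1} = (\mathbf{A}^{-1})^T$
$(\mathbf{AB})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1}$
For a diagonal matrix D, Det $= |\mathbf{D}| =$ product of the diagonal elements.
Also, the determinant of any square matrix A, det(A), is simply the product of the eigenvalues $\lambda$ of A, which satisfy $\mathbf{Ae} = \lambda\mathbf{e}$, where e is the eigenvector associated with $\lambda$. If A is n x n, the eigenvalues are the roots of an n-degree polynomial. If any of the roots of this equation are zero, $\mathbf{A}^{-1}$ is not defined. Further, for some nonzero vector b (a linear combination of the columns of A), we then have $\mathbf{Ab} = \mathbf{0}$.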
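The two-predictor OLS slopes can be checked numerically by computing $\boldsymbol{\beta} = \mathbf{V}^{-1}\mathbf{c}$ directly and comparing it with the form written in terms of the correlation $\rho_{12}$. This is a sketch under assumed inputs: the function names and the variance and covariance values are hypothetical.

```python
import math


def slopes_from_inverse(var1, var2, cov12, cov_y1, cov_y2):
    """beta = V^{-1} c for two predictors, using the 2 x 2 inverse formula."""
    det = var1 * var2 - cov12 ** 2
    if det == 0:
        raise ValueError("V is singular (|rho_12| = 1); no unique solution")
    beta1 = (var2 * cov_y1 - cov12 * cov_y2) / det
    beta2 = (var1 * cov_y2 - cov12 * cov_y1) / det
    return beta1, beta2


def slopes_from_rho(var1, var2, cov12, cov_y1, cov_y2):
    """Same slopes written in terms of the correlation rho_12."""
    sd1, sd2 = math.sqrt(var1), math.sqrt(var2)
    rho = cov12 / (sd1 * sd2)
    scale = 1.0 / (1.0 - rho ** 2)
    beta1 = scale * (cov_y1 / var1 - rho * cov_y2 / (sd1 * sd2))
    beta2 = scale * (cov_y2 / var2 - rho * cov_y1 / (sd1 * sd2))
    return beta1, beta2


# Hypothetical values: Var(z1) = 4, Var(z2) = 9, Cov(z1, z2) = 3,
# Cov(y, z1) = 2, Cov(y, z2) = 6
print(slopes_from_inverse(4.0, 9.0, 3.0, 2.0, 6.0))
print(slopes_from_rho(4.0, 9.0, 3.0, 2.0, 6.0))

# With uncorrelated predictors the slopes are the univariate ones
print(slopes_from_inverse(4.0, 9.0, 0.0, 2.0, 6.0))
```

Both functions agree up to floating-point rounding, and setting the covariance between the predictors to zero returns $\sigma(y, z_i)/\sigma^2(z_i)$ for each slope.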

Variance-Covariance matrix
A very important square matrix is the variance-covariance matrix V associated with a vector x of random variables. $V_{ij} = \mathrm{Cov}(x_i, x_j)$, so that the i-th diagonal element of V is the variance of $x_i$, and the off-diagonal elements are covariances. V is a symmetric, square matrix.

Quadratic and Bilinear Forms
Quadratic product: for A (n x n) and x (n x 1),

$$\mathbf{x}^T \mathbf{A} \mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j, \quad \text{a scalar (1 x 1)}.$$

Bilinear form (generalization of the quadratic product): for A (m x n), a (n x 1), and b (m x 1), their bilinear form is $\mathbf{b}^T_{(1 \times m)} \mathbf{A}_{(m \times n)} \mathbf{a}_{(n \times 1)}$,

$$\mathbf{b}^T \mathbf{A} \mathbf{a} = \sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij} b_i a_j.$$

Note that $\mathbf{b}^T \mathbf{A} \mathbf{a} = \mathbf{a}^T \mathbf{A}^T \mathbf{b}$.

The trace
The trace, tr(A) or trace(A), of a square matrix A is simply the sum of its diagonal elements. The importance of the trace is that it equals the sum of the eigenvalues of A, $\mathrm{tr}(\mathbf{A}) = \sum_i \lambda_i$.
For a covariance matrix V, tr(V) measures the total amount of variation in the variables. $\lambda_i / \mathrm{tr}(\mathbf{V})$ is the fraction of the total variation in x contained in the linear combination $\mathbf{e}_i^T \mathbf{x}$, where $\mathbf{e}_i$, the i-th principal component of V, is also the i-th eigenvector of V ($\mathbf{V}\mathbf{e}_i = \lambda_i \mathbf{e}_i$).

Covariance Matrices for Transformed Variables
What is the variance of the linear combination $c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$? (Note this is a scalar.)

$$\begin{aligned}
\sigma^2(\mathbf{c}^T \mathbf{x}) &= \sigma^2\left( \sum_{i=1}^{n} c_i x_i \right) = \sigma\left( \sum_{i=1}^{n} c_i x_i,\ \sum_{j=1}^{n} c_j x_j \right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma(c_i x_i, c_j x_j) = \sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j\, \sigma(x_i, x_j) \\
&= \mathbf{c}^T \mathbf{V} \mathbf{c}
\end{aligned}$$

Likewise, the covariance between two linear combinations can be expressed as a bilinear form, $\sigma(\mathbf{a}^T \mathbf{x}, \mathbf{b}^T \mathbf{x}) = \mathbf{a}^T \mathbf{V} \mathbf{b}$.
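A minimal sketch of the quadratic form $\mathbf{c}^T \mathbf{V} \mathbf{c}$ and the bilinear form $\mathbf{a}^T \mathbf{V} \mathbf{b}$ in plain Python. The helper names are assumptions; the inputs are illustrative: three variables with variances 10, 20, and 30, covariances of -5 between $x_1$ and $x_2$, 10 between $x_1$ and $x_3$, and 0 between $x_2$ and $x_3$, combined as $y_1 = x_1 - 2x_2 + 5x_3$ and $y_2 = 6x_2 - 4x_3$.

```python
def matvec(V, c):
    """Matrix-vector product Vc."""
    return [sum(v_ij * c_j for v_ij, c_j in zip(row, c)) for row in V]


def bilinear_form(a, V, b):
    """a^T V b: the covariance between a^T x and b^T x when V = Var(x)."""
    return sum(a_i * vb_i for a_i, vb_i in zip(a, matvec(V, b)))


def quadratic_form(c, V):
    """c^T V c: the variance of the linear combination c^T x."""
    return bilinear_form(c, V, c)


# Illustrative covariance matrix and coefficient vectors
V = [[10, -5, 10],
     [-5, 20, 0],
     [10, 0, 30]]
c1 = [1, -2, 5]   # y1 = x1 - 2*x2 + 5*x3
c2 = [0, 6, -4]   # y2 = 6*x2 - 4*x3

print(quadratic_form(c1, V))     # Var(y1) = 960
print(quadratic_form(c2, V))     # Var(y2) = 1200
print(bilinear_form(c1, V, c2))  # Cov(y1, y2) = -910
```

Because V is symmetric, `bilinear_form(c1, V, c2)` and `bilinear_form(c2, V, c1)` give the same covariance.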

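Example: Suppose the variances of $x_1$, $x_2$, and $x_3$ are 10, 20, and 30. $x_1$ and $x_2$ have a covariance of -5, $x_1$ and $x_3$ of 10, while $x_2$ and $x_3$ are uncorrelated. What are the variances of the new variables $y_1 = x_1 - 2x_2 + 5x_3$ and $y_2 = 6x_2 - 4x_3$?
$\mathrm{Var}(y_1) = \mathrm{Var}(\mathbf{c}_1^T \mathbf{x}) = \mathbf{c}_1^T \mathrm{Var}(\mathbf{x})\, \mathbf{c}_1 = 960$
$\mathrm{Var}(y_2) = \mathrm{Var}(\mathbf{c}_2^T \mathbf{x}) = \mathbf{c}_2^T \mathrm{Var}(\mathbf{x})\, \mathbf{c}_2 = 1200$
$\mathrm{Cov}(y_1, y_2) = \mathrm{Cov}(\mathbf{c}_1^T \mathbf{x}, \mathbf{c}_2^T \mathbf{x}) = \mathbf{c}_1^T \mathrm{Var}(\mathbf{x})\, \mathbf{c}_2 = -910$

Now suppose we transform one vector of random variables into another vector of random variables. Transform x into
(i) $\mathbf{y}_{k \times 1} = \mathbf{A}_{k \times n} \mathbf{x}_{n \times 1}$
(ii) $\mathbf{z}_{m \times 1} = \mathbf{B}_{m \times n} \mathbf{x}_{n \times 1}$
The covariance between the elements of these two transformed vectors is a k x m covariance matrix, $\mathbf{A}\mathbf{V}\mathbf{B}^T$. For example, the covariance between $y_i$ and $y_j$ is given by the ij-th element of $\mathbf{A}\mathbf{V}\mathbf{A}^T$. Likewise, the covariance between $y_i$ and $z_j$ is given by the ij-th element of $\mathbf{A}\mathbf{V}\mathbf{B}^T$.

Positive-definite matrix
A matrix V is positive-definite if, for all vectors c containing at least one non-zero member, $\mathbf{c}^T \mathbf{V} \mathbf{c} > 0$. A non-negative definite matrix satisfies $\mathbf{c}^T \mathbf{V} \mathbf{c} \geq 0$.
Any covariance matrix is (at least) non-negative definite, as $\mathrm{Var}(\mathbf{c}^T \mathbf{x}) = \mathbf{c}^T \mathbf{V} \mathbf{c} \geq 0$. Any nonsingular covariance matrix is positive-definite. Nonsingular means det(V) > 0. Equivalently, all eigenvalues of V are positive, $\lambda_i > 0$.

The Multivariate Normal Distribution (MVN)
Consider the pdf for n independent normal random variables, the i-th of which has mean $\mu_i$ and variance $\sigma_i^2$:

$$\begin{aligned}
p(\mathbf{x}) &= \prod_{i=1}^{n} (2\pi)^{-1/2} \sigma_i^{-1} \exp\left( -\frac{(x_i - \mu_i)^2}{2\sigma_i^2} \right) \\
&= (2\pi)^{-n/2} \left( \prod_{i=1}^{n} \sigma_i \right)^{-1} \exp\left( -\sum_{i=1}^{n} \frac{(x_i - \mu_i)^2}{2\sigma_i^2} \right)
\end{aligned}$$

This can be expressed more compactly in matrix form.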

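Define the covariance matrix V for the vector x of the n normal random variables by

$$\mathbf{V} = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix}, \qquad |\mathbf{V}| = \prod_{i=1}^{n} \sigma_i^2.$$

Define the mean vector $\boldsymbol{\mu}$ by $\boldsymbol{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)^T$, so that

$$\sum_{i=1}^{n} \frac{(x_i - \mu_i)^2}{\sigma_i^2} = (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x} - \boldsymbol{\mu}).$$

Hence in matrix form the MVN pdf becomes

$$p(\mathbf{x}) = (2\pi)^{-n/2}\, |\mathbf{V}|^{-1/2} \exp\left[ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{V}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right].$$

Notice this holds for any vector $\boldsymbol{\mu}$ and symmetric positive-definite matrix V, as $|\mathbf{V}| > 0$.

The multivariate normal
Just as a univariate normal is defined by its mean and spread, a multivariate normal is defined by its mean vector $\boldsymbol{\mu}$ (also called the centroid) and variance-covariance matrix V. The vector of means $\boldsymbol{\mu}$ determines location; the spread (geometry) about $\boldsymbol{\mu}$ is determined by V.

[Figure: four bivariate densities centred on $\boldsymbol{\mu}$. Panels: $x_1$, $x_2$ equal variances, positively correlated; $x_1$, $x_2$ equal variances, uncorrelated; $x_1$, $x_2$ equal variances, negatively correlated; unequal variances, uncorrelated.]

Eigenstructure (the eigenvectors and their corresponding eigenvalues) determines the geometry of V.
Positive tilt = positive correlation. Negative tilt = negative correlation. No tilt = uncorrelated.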
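Returning to the density formula, a minimal sketch for the bivariate case (n = 2) evaluates the matrix form directly, using the 2 x 2 inverse and determinant. The function names and the numerical inputs are hypothetical. With a diagonal V the result should match the product of two univariate normal densities, and with a positive covariance the density is higher along the positive diagonal (the "positive tilt").

```python
import math


def normal_pdf(x, mu, var):
    """Univariate normal density with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bivariate_normal_pdf(x, mu, V):
    """(2*pi)^{-1} |V|^{-1/2} exp(-0.5 * (x - mu)^T V^{-1} (x - mu)) for n = 2."""
    (v11, v12), (v21, v22) = V
    det = v11 * v22 - v12 * v21
    if v11 <= 0 or det <= 0:
        raise ValueError("V must be positive-definite")
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form with V^{-1} = (1/det) * [[v22, -v12], [-v21, v11]]
    quad = (v22 * d1 * d1 - (v12 + v21) * d1 * d2 + v11 * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


# Independent case: the joint density factors into univariate densities
joint = bivariate_normal_pdf((1.0, -0.5), (0.0, 0.0), [[2.0, 0.0], [0.0, 3.0]])
product = normal_pdf(1.0, 0.0, 2.0) * normal_pdf(-0.5, 0.0, 3.0)
print(joint, product)

# Positively correlated case: (1, 1) is more probable than (1, -1)
V_pos = [[1.0, 0.8], [0.8, 1.0]]
print(bivariate_normal_pdf((1.0, 1.0), (0.0, 0.0), V_pos))
print(bivariate_normal_pdf((1.0, -1.0), (0.0, 0.0), V_pos))
```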

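Eigenstructure of V
[Figure: density contours centred on $\boldsymbol{\mu}$, with axes labelled $\lambda_1 \mathbf{e}_1$ and $\lambda_2 \mathbf{e}_2$.]
The direction of the largest axis of variation is given by the unit-length vector $\mathbf{e}_1$, the 1st eigenvector of V. The next largest axis of variation, orthogonal to (at 90 degrees from) $\mathbf{e}_1$, is given by $\mathbf{e}_2$, the 2nd eigenvector.

Properties of the MVN - I
1) If x is MVN, any subset of the variables in x is also MVN.
2) If x is MVN, any linear combination of the elements of x is also MVN. If $\mathbf{x} \sim \mathrm{MVN}_n(\boldsymbol{\mu}, \mathbf{V})$:
for $\mathbf{y} = \mathbf{x} + \mathbf{a}$, y is $\mathrm{MVN}_n(\boldsymbol{\mu} + \mathbf{a}, \mathbf{V})$;
for $y = \mathbf{a}^T \mathbf{x} = \sum_{i=1}^{n} a_i x_i$, y is $\mathrm{N}(\mathbf{a}^T \boldsymbol{\mu}, \mathbf{a}^T \mathbf{V} \mathbf{a})$;
for $\mathbf{y} = \mathbf{A}\mathbf{x}$ with A of dimension m x n, y is $\mathrm{MVN}_m(\mathbf{A}\boldsymbol{\mu}, \mathbf{A}\mathbf{V}\mathbf{A}^T)$.

Properties of the MVN - II
3) Conditional distributions are also MVN. Partition x into two components, $\mathbf{x}_1$ (an m-dimensional column vector) and $\mathbf{x}_2$ (an (n-m)-dimensional column vector):

$$\mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}, \qquad
\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}, \qquad
\mathbf{V} = \begin{pmatrix} \mathbf{V}_{x_1 x_1} & \mathbf{V}_{x_1 x_2} \\ \mathbf{V}_{x_1 x_2}^T & \mathbf{V}_{x_2 x_2} \end{pmatrix}$$

$\mathbf{x}_1 | \mathbf{x}_2$ is MVN with m-dimensional mean vector

$$\boldsymbol{\mu}_{x_1 | x_2} = \boldsymbol{\mu}_1 + \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} (\mathbf{x}_2 - \boldsymbol{\mu}_2)$$

and m x m covariance matrix

$$\mathbf{V}_{x_1 | x_2} = \mathbf{V}_{x_1 x_1} - \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} \mathbf{V}_{x_1 x_2}^T.$$

Properties of the MVN - III
4) If x is MVN, the regression of any subset of x on another subset is linear and homoscedastic:

$$\mathbf{x}_1 = \boldsymbol{\mu}_{x_1 | x_2} + \mathbf{e} = \boldsymbol{\mu}_1 + \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} (\mathbf{x}_2 - \boldsymbol{\mu}_2) + \mathbf{e},$$

where e is MVN with mean vector 0 and variance-covariance matrix $\mathbf{V}_{x_1 | x_2}$.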
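For the simplest partition, where $x_1$ and $x_2$ are both scalars, the conditional mean and variance reduce to ordinary arithmetic. The sketch below assumes that bivariate case; the function name and all numerical values are hypothetical.

```python
def conditional_normal(mu1, mu2, v11, v12, v22, x2):
    """Mean and variance of x1 given x2 for a bivariate normal.

    mean = mu1 + v12 * v22^{-1} * (x2 - mu2)
    var  = v11 - v12 * v22^{-1} * v12
    """
    if v22 <= 0:
        raise ValueError("Var(x2) must be positive")
    slope = v12 / v22
    mean = mu1 + slope * (x2 - mu2)
    var = v11 - slope * v12
    return mean, var


# Hypothetical parameters: means 10 and 5, Var(x1) = 4, Cov = 3, Var(x2) = 9
for observed_x2 in (2.0, 5.0, 8.0):
    print(observed_x2, conditional_normal(10.0, 5.0, 4.0, 3.0, 9.0, observed_x2))
```

The conditional mean changes linearly with the observed value of $x_2$, while the conditional variance is the same for every value, which is what linear and homoscedastic mean here.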

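In the regression

$$\mathbf{x}_1 = \boldsymbol{\mu}_1 + \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} (\mathbf{x}_2 - \boldsymbol{\mu}_2) + \mathbf{e},$$

the regression is linear because it is a linear function of $\mathbf{x}_2$. The regression is homoscedastic because the variance-covariance matrix for e does not depend on the value of the x's:

$$\mathbf{V}_{x_1 | x_2} = \mathbf{V}_{x_1 x_1} - \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} \mathbf{V}_{x_1 x_2}^T.$$

All these matrices are constant, and hence the same for any value of x.

Example: Regression of Offspring value on Parental values
Assume the vector of offspring value and the values of both its parents is MVN. Then from the correlations among (outbred) relatives,

$$\begin{pmatrix} z_o \\ z_s \\ z_d \end{pmatrix} \sim \mathrm{MVN}\left[ \begin{pmatrix} \mu_o \\ \mu_s \\ \mu_d \end{pmatrix},\ \sigma_z^2 \begin{pmatrix} 1 & h^2/2 & h^2/2 \\ h^2/2 & 1 & 0 \\ h^2/2 & 0 & 1 \end{pmatrix} \right].$$

Let

$$\mathbf{x}_1 = ( z_o ), \qquad \mathbf{x}_2 = \begin{pmatrix} z_s \\ z_d \end{pmatrix},$$

so that

$$\mathbf{V}_{x_1, x_1} = \sigma_z^2, \qquad \mathbf{V}_{x_1, x_2} = \frac{h^2 \sigma_z^2}{2} ( 1\ \ 1 ), \qquad \mathbf{V}_{x_2, x_2} = \sigma_z^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Regression of Offspring value on Parental values (cont)
Hence, from $\mathbf{x}_1 = \boldsymbol{\mu}_1 + \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} (\mathbf{x}_2 - \boldsymbol{\mu}_2) + \mathbf{e}$,

$$\begin{aligned}
z_o &= \mu_o + \frac{h^2 \sigma_z^2}{2} ( 1\ \ 1 )\, \sigma_z^{-2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} z_s - \mu_s \\ z_d - \mu_d \end{pmatrix} + e \\
&= \mu_o + \frac{h^2}{2} (z_s - \mu_s) + \frac{h^2}{2} (z_d - \mu_d) + e,
\end{aligned}$$

where e is normal with mean zero and variance given by $\mathbf{V}_{x_1 | x_2} = \mathbf{V}_{x_1 x_1} - \mathbf{V}_{x_1 x_2} \mathbf{V}_{x_2 x_2}^{-1} \mathbf{V}_{x_1 x_2}^T$:

$$\sigma_e^2 = \sigma_z^2 - \frac{h^2 \sigma_z^2}{2} ( 1\ \ 1 )\, \sigma_z^{-2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \frac{h^2 \sigma_z^2}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \sigma_z^2 \left( 1 - \frac{h^4}{2} \right).$$

Hence, the regression of offspring trait value given the trait values of its parents is

$$z_o = \mu_o + \frac{h^2}{2} (z_s - \mu_s) + \frac{h^2}{2} (z_d - \mu_d) + e,$$

where the residual e is normal with mean zero and $\mathrm{Var}(e) = \sigma_z^2 (1 - h^4/2)$.

Similar logic gives the regression of offspring breeding value on parental breeding value as

$$A_o = \mu_o + \frac{A_s - \mu_s}{2} + \frac{A_d - \mu_d}{2} + e = \frac{A_s}{2} + \frac{A_d}{2} + e,$$

where the residual e is normal with mean zero and $\mathrm{Var}(e) = \sigma_A^2 / 2$.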
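The offspring-parent regression can be checked numerically by applying the conditional formulas to the partitioned covariance matrix. This is a sketch only: the function name, the heritability $h^2 = 0.5$, and the phenotypic variance of 100 are hypothetical inputs. It relies on $\mathbf{V}_{x_2, x_2} = \sigma_z^2 \mathbf{I}$, so its inverse is simply $\mathbf{I} / \sigma_z^2$.

```python
def offspring_regression(h2, var_z):
    """Slopes on sire and dam values and the residual variance.

    Uses V_x1x1 = var_z, V_x1x2 = (h2 * var_z / 2) * (1, 1) and
    V_x2x2 = var_z * I, whose inverse is I / var_z.
    """
    v12 = [h2 * var_z / 2.0, h2 * var_z / 2.0]
    # V_x1x2 V_x2x2^{-1}: each covariance divided by var_z
    slopes = [cov / var_z for cov in v12]
    # V_x1x1 - V_x1x2 V_x2x2^{-1} V_x1x2^T
    resid_var = var_z - sum(s * cov for s, cov in zip(slopes, v12))
    return slopes[0], slopes[1], resid_var


h2, var_z = 0.5, 100.0  # hypothetical heritability and phenotypic variance
slope_sire, slope_dam, resid_var = offspring_regression(h2, var_z)
print(slope_sire, slope_dam)  # each equals h2 / 2
print(resid_var)              # equals var_z * (1 - h2**2 / 2)
```

Both slopes come out as $h^2/2$ and the residual variance matches $\sigma_z^2 (1 - h^4/2)$, in agreement with the derivation above.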

Ordinary least squares
Hence, we need to discuss vector/matrix derivatives.

The gradient, the derivative of a vector-valued function

Some common derivatives

Simple matrix commands in R
R is case-sensitive!
t(A) = transpose of A
A %*% B = matrix product AB (the %*% command is matrix multiplication)
solve(A) = compute the inverse of A
det(A) returns the determinant of A
eigen(A) returns the eigenvalues and eigenvectors of A
x <- assigns the variable x what is to the right of the arrow, e.g. x <- 35

Example: Compute $\mathrm{Var}(y_1)$, $\mathrm{Var}(y_2)$, and $\mathrm{Cov}(y_1, y_2)$, where $y_i = \mathbf{c}_i^T \mathbf{x}$. The three quantities are $\mathbf{c}_1^T \mathbf{V} \mathbf{c}_1$, $\mathbf{c}_2^T \mathbf{V} \mathbf{c}_2$, and $\mathbf{c}_1^T \mathbf{V} \mathbf{c}_2$.

Example: Solve the equation $\mathbf{Ax} = \mathbf{c}$. First, input the matrix A. In R, c denotes a list, and we built the matrix by columns. Next, input the vector c. Finally, compute $\mathbf{A}^{-1}\mathbf{c}$.

Additional references
Lynch & Walsh Chapter 8 (intro to matrices)
In notes package:
Appendix 4 (Matrix geometry)
Appendix 5 (Matrix derivatives)
Matrix Calculations in R