Matrix Multiplication

Stephen Boyd
EE103, Stanford University
October 17, 2016
Outline

- Matrix multiplication
- Composition of linear functions
- Matrix powers
- QR factorization
Matrix multiplication

- can multiply an m x p matrix A and a p x n matrix B to get C = AB:

      C_ij = sum_{k=1}^p A_ik B_kj = A_i1 B_1j + ... + A_ip B_pj

  for i = 1, ..., m, j = 1, ..., n
- to get C_ij: move along the ith row of A and the jth column of B
- example:

      [ -1.5   3   2 ] [ -1  -1 ]   [  3.5  -4.5 ]
      [  1    -1   0 ] [  0  -2 ] = [ -1     1   ]
                       [  1   0 ]
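The example above is easy to verify numerically; here is a quick check in NumPy (NumPy is my addition here, the slides don't prescribe a language):

```python
import numpy as np

# A is 2x3, B is 3x2, so C = AB is 2x2
A = np.array([[-1.5, 3.0, 2.0],
              [1.0, -1.0, 0.0]])
B = np.array([[-1.0, -1.0],
              [0.0, -2.0],
              [1.0, 0.0]])

C = A @ B  # C[i, j] = sum_k A[i, k] * B[k, j]
# C is [[3.5, -4.5], [-1.0, 1.0]]
print(C)
```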
Special cases of matrix multiplication

- scalar-vector product (with scalar on the right!) xα
- inner product a^T b
- matrix-vector multiplication Ax
- outer product:

      ab^T = [ a_1 b_1  a_1 b_2  ...  a_1 b_n ]
             [ a_2 b_1  a_2 b_2  ...  a_2 b_n ]
             [   ...      ...    ...    ...   ]
             [ a_m b_1  a_m b_2  ...  a_m b_n ]
Properties

- (AB)C = A(BC), so both can be written ABC
- A(B + C) = AB + AC
- (AB)^T = B^T A^T
- AI = A; IA = A
- AB = BA does not hold in general
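These identities are easy to sanity-check on random matrices; a small NumPy sketch (my addition, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 2))

# associativity: (AB)C == A(BC)
assert np.allclose((A @ B) @ C, A @ (B @ C))
# transpose of a product reverses the order
assert np.allclose((A @ B).T, B.T @ A.T)

# commutativity fails in general, even for square matrices
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))
print(np.allclose(X @ Y, Y @ X))  # False (almost surely, for random X, Y)
```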
Block matrices

block matrices can be multiplied using the same formula, e.g.,

    [ A  B ] [ E  F ]   [ AE + BG  AF + BH ]
    [ C  D ] [ G  H ] = [ CE + DG  CF + DH ]

(provided the products all make sense)
Column interpretation

write B = [ b_1  b_2  ...  b_n ] (b_i is the ith column of B)

then we have

    AB = A [ b_1  b_2  ...  b_n ] = [ Ab_1  Ab_2  ...  Ab_n ]

so AB is a "batch" multiply of A times the columns of B
Inner product interpretation

with a_i^T the rows of A and b_j the columns of B, we have

    AB = [ a_1^T b_1  a_1^T b_2  ...  a_1^T b_n ]
         [ a_2^T b_1  a_2^T b_2  ...  a_2^T b_n ]
         [    ...        ...     ...     ...    ]
         [ a_m^T b_1  a_m^T b_2  ...  a_m^T b_n ]

so the matrix product is all inner products of rows of A and columns of B, arranged in a matrix
Gram matrix

the Gram matrix of an m x n matrix A is (here a_i is the ith column of A)

    G = A^T A = [ a_1^T a_1  a_1^T a_2  ...  a_1^T a_n ]
                [ a_2^T a_1  a_2^T a_2  ...  a_2^T a_n ]
                [    ...        ...     ...     ...    ]
                [ a_n^T a_1  a_n^T a_2  ...  a_n^T a_n ]

Gram matrix gives all inner products of columns of A

example: G = A^T A = I means the columns of A are orthonormal
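A small NumPy sketch (my addition) showing that the entries of A^T A are exactly the inner products of the columns of A:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [2.0, 0.0]])

G = A.T @ A  # Gram matrix: G[i, j] = a_i^T a_j

a1, a2 = A[:, 0], A[:, 1]
assert G[0, 1] == a1 @ a2  # inner product of columns 1 and 2
assert G[0, 0] == a1 @ a1  # squared norm of column 1
assert np.allclose(G, G.T)  # Gram matrices are always symmetric
```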
Complexity

- each entry C_ij = (AB)_ij is an inner product of p-vectors, which costs about 2p flops
- so the total is (mn)(2p) = 2mnp flops
- multiplying two 1000 x 1000 matrices requires 2 billion flops
- ... and can be done in well under a second on current computers
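The 1000 x 1000 claim is easy to try yourself; a NumPy sketch (my addition; the exact time depends on your machine and BLAS library):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 1000))
B = rng.standard_normal((1000, 1000))

t0 = time.time()
C = A @ B  # about 2 * 1000^3 = 2 billion flops
elapsed = time.time() - t0
print(f"2 Gflop multiply took {elapsed:.3f} s")
```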
Composition of linear functions

- A is an m x p matrix, B is p x n
- define f: R^p -> R^m as f(u) = Au, and g: R^n -> R^p as g(v) = Bv
- f and g are linear functions
- the composition is h: R^n -> R^m, h(x) = f(g(x))
- we have h(x) = f(g(x)) = A(Bx) = (AB)x
- so the composition of linear functions is linear
- the associated matrix is the product of the matrices of the functions
Second difference matrix

- D_n is the (n-1) x n difference matrix:

      D_n x = (x_2 - x_1, ..., x_n - x_{n-1})

- D_{n-1} is the (n-2) x (n-1) difference matrix:

      D_{n-1} y = (y_2 - y_1, ..., y_{n-1} - y_{n-2})

- Δ = D_{n-1} D_n is the (n-2) x n second difference matrix:

      Δx = (x_1 - 2x_2 + x_3, x_2 - 2x_3 + x_4, ..., x_{n-2} - 2x_{n-1} + x_n)
Second difference matrix

for n = 5, Δ = D_4 D_5 is

    [ 1 -2  1  0  0 ]   [ -1  1  0  0 ] [ -1  1  0  0  0 ]
    [ 0  1 -2  1  0 ] = [  0 -1  1  0 ] [  0 -1  1  0  0 ]
    [ 0  0  1 -2  1 ]   [  0  0 -1  1 ] [  0  0 -1  1  0 ]
                                        [  0  0  0 -1  1 ]
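The factorization Δ = D_{n-1} D_n can be checked numerically; a NumPy sketch (my addition) that builds the difference matrices and multiplies them:

```python
import numpy as np

def diff_matrix(n):
    """(n-1) x n difference matrix: (D_n x)_i = x_{i+1} - x_i."""
    return np.eye(n - 1, n, k=1) - np.eye(n - 1, n)

n = 5
Delta = diff_matrix(n - 1) @ diff_matrix(n)  # (n-2) x n second difference
# rows of Delta: [1, -2, 1, 0, 0], [0, 1, -2, 1, 0], [0, 0, 1, -2, 1]
print(Delta)
```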
Matrix powers

- for A square, A^2 means AA, and similarly for higher powers
- with the convention A^0 = I, we have A^k A^l = A^{k+l}
- negative powers later; fractional powers in other courses
Directed graph

an n x n matrix A is the adjacency matrix of a directed graph:

    A_ij = 1 if there is an edge from vertex j to vertex i, 0 otherwise

example (figure of directed graph on vertices 1, ..., 5 omitted):

    A = [ 0 1 0 0 1 ]
        [ 1 0 1 0 0 ]
        [ 0 0 1 1 1 ]
        [ 1 0 0 0 0 ]
        [ 0 0 0 1 0 ]
Paths in directed graph

- (A^2)_ij = sum_{k=1}^n A_ik A_kj = number of paths of length 2 from vertex j to vertex i
- for the example,

      A^2 = [ 1 0 1 1 0 ]
            [ 0 1 1 1 2 ]
            [ 1 0 1 2 1 ]
            [ 0 1 0 0 1 ]
            [ 1 0 0 0 0 ]

- e.g., there are two paths of length 2 from vertex 4 to vertex 3 (via vertices 3 and 5)
- more generally, (A^l)_ij = number of paths of length l from vertex j to vertex i
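A NumPy check (my addition) of the path-counting example:

```python
import numpy as np

# adjacency matrix from the example; A[i, j] = 1 means an edge from
# vertex j+1 to vertex i+1 (0-indexed arrays, 1-indexed vertices)
A = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 0, 1, 1, 1],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 1, 0]])

A2 = A @ A
# (A^2)[2, 3] counts length-2 paths from vertex 4 to vertex 3
print(A2[2, 3])  # 2
```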
Gram-Schmidt in matrix notation

- run Gram-Schmidt on the columns a_1, ..., a_k of the n x k matrix A
- if the columns are independent, we get orthonormal q_1, ..., q_k
- define the n x k matrix Q with columns q_1, ..., q_k
- Q^T Q = I
- from the Gram-Schmidt algorithm (with q~_i the unnormalized vector at step i),

      a_i = (q_1^T a_i) q_1 + ... + (q_{i-1}^T a_i) q_{i-1} + ||q~_i|| q_i
          = R_1i q_1 + ... + R_ii q_i

  with R_ij = q_i^T a_j for i < j, and R_ii = ||q~_i||
- defining R_ij = 0 for i > j, we have A = QR
- R is upper triangular, with positive diagonal entries
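NumPy's built-in QR computes the same factorization; a sketch (my addition — note that np.linalg.qr may return negative diagonal entries in R, a different sign convention than the positive-diagonal one used here):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))  # columns independent (almost surely)

Q, R = np.linalg.qr(A)  # "reduced" QR: Q is 6x3, R is 3x3

assert np.allclose(Q.T @ Q, np.eye(3))  # orthonormal columns
assert np.allclose(R, np.triu(R))       # R is upper triangular
assert np.allclose(Q @ R, A)            # A = QR
```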
QR factorization

- A = QR is called the QR factorization of A
- the factors satisfy Q^T Q = I, with R upper triangular with positive diagonal entries
- can be computed using the Gram-Schmidt algorithm (or some variations)
- has a huge number of uses, which we'll see soon