How To Understand The Kronecker Product




The Kronecker Product A Product of the Times Charles Van Loan Department of Computer Science Cornell University

The Kronecker Product. B ⊗ C is a block matrix whose (i,j) block is b_ij C. E.g.,

    [ b11  b12 ]         [ b11 C   b12 C ]
    [ b21  b22 ]  ⊗  C = [ b21 C   b22 C ]

Also called the Direct Product or the Tensor Product.

Every b_ij c_kl Shows Up:

    [ b11  b12 ]     [ c11  c12  c13 ]
    [ b21  b22 ]  ⊗  [ c21  c22  c23 ]  =
                     [ c31  c32  c33 ]

    [ b11c11  b11c12  b11c13   b12c11  b12c12  b12c13 ]
    [ b11c21  b11c22  b11c23   b12c21  b12c22  b12c23 ]
    [ b11c31  b11c32  b11c33   b12c31  b12c32  b12c33 ]
    [ b21c11  b21c12  b21c13   b22c11  b22c12  b22c13 ]
    [ b21c21  b21c22  b21c23   b22c21  b22c22  b22c23 ]
    [ b21c31  b21c32  b21c33   b22c31  b22c32  b22c33 ]

Basic Algebraic Properties:

    (B ⊗ C)^T = B^T ⊗ C^T
    (B ⊗ C)^{-1} = B^{-1} ⊗ C^{-1}
    (B ⊗ C)(D ⊗ F) = BD ⊗ CF
    B ⊗ (C ⊗ D) = (B ⊗ C) ⊗ D
    C ⊗ B = (Perfect Shuffle)^T (B ⊗ C)(Perfect Shuffle)

Reshaping KP Computations. Suppose B, C ∈ R^{n×n} and x ∈ R^{n²}.

The operation y = (B ⊗ C)x is O(n⁴) if the Kronecker product is formed explicitly:

    y = kron(B,C)*x

Since (B ⊗ C)vec(X) = vec(C X B^T), the same y can be computed as Y = C X B^T in O(n³):

    y = reshape(C*reshape(x,n,n)*B', n*n, 1)
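The reshaping trick in the slide's MATLAB one-liners rests on the identity (B ⊗ C)vec(X) = vec(C X Bᵀ) with a column-major vec. A minimal NumPy sketch (variable names mine; `order='F'` mimics MATLAB's column-major reshape):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
x = rng.standard_normal(n * n)

# O(n^4) way: form the n^2-by-n^2 Kronecker product explicitly.
y_slow = np.kron(B, C) @ x

# O(n^3) way: y = vec(C X B^T) where x = vec(X), vec taken column-major.
X = x.reshape(n, n, order='F')
y_fast = (C @ X @ B.T).reshape(-1, order='F')

assert np.allclose(y_slow, y_fast)
```

The asymptotic gap is real: the explicit product costs O(n⁴) flops and O(n⁴) memory, while the two matrix multiplies cost O(n³) and never form B ⊗ C.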

Talk Outline
1. The 1800s. Origins: Z
2. The 1900s. Heightened Profile
3. The 2000s. Future

The 1800s: Z

Products and Deltas: δ_ij. Leopold Kronecker (1823–1891)

The Kronecker Delta: Not as Interesting!

    U^T δ_ij V = δ_ij        κ₂(δ_ij) = 1        δ_ij

Acknowledgement. H.V. Henderson, F. Pukelsheim, and S.R. Searle (1983). On the History of the Kronecker Product, Linear and Multilinear Algebra 14, 113–120. Shayle Searle, Professor Emeritus, Cornell University.

Scandal! H.V. Henderson, F. Pukelsheim, and S.R. Searle (1983). On the History of the Kronecker Product, Linear and Multilinear Algebra 14, 113–120. Abstract: History reveals that what is today called the Kronecker product should be called the Zehfuss product. This fact is somewhat appreciated by the modern (numerical) linear algebra community: R.A. Horn and C.R. Johnson (1991). Topics in Matrix Analysis, Cambridge University Press, NY, p. 254. A.N. Langville and W.J. Stewart (2004). The Kronecker product and stochastic automata networks, J. Computational and Applied Mathematics 167, 429–447.

Let's Get to the Bottom of This! History Reveals That What is Today Called The Kronecker Product Should Be Called The Zehfuss Product

Who Was Zehfuss? Born 1832. Obscure professor of mathematics at University of Heidelberg for a while. Then went on to other things. Wrote papers on determinants... G. Zehfuss (1858). Über eine gewisse Determinante, Zeitschrift für Mathematik und Physik 3, 298–301.

Main Result, a.k.a. The Z Theorem. If B ∈ R^{m×m} and C ∈ R^{n×n}, then

    det(B ⊗ C) = det(B)^n det(C)^m

Proof: Noting that I_n ⊗ B and I_m ⊗ C are block diagonal, take determinants in

    B ⊗ C = (B ⊗ I_n)(I_m ⊗ C) = P(I_n ⊗ B)P^T (I_m ⊗ C)

where P is a perfect shuffle permutation.
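The Z theorem is easy to sanity-check numerically; a small NumPy sketch (dimensions chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
B = rng.standard_normal((m, m))   # m-by-m
C = rng.standard_normal((n, n))   # n-by-n

# det(B ⊗ C) = det(B)^n * det(C)^m
lhs = np.linalg.det(np.kron(B, C))
rhs = np.linalg.det(B) ** n * np.linalg.det(C) ** m
assert np.isclose(lhs, rhs)
```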

Zehfuss (1858):

    [ a1A1  a1B1  b1A1  b1B1  c1A1  c1B1  d1A1  d1B1 ]
    [ a1A2  a1B2  b1A2  b1B2  c1A2  c1B2  d1A2  d1B2 ]
    [ a2A1  a2B1  b2A1  b2B1  c2A1  c2B1  d2A1  d2B1 ]
    [ a2A2  a2B2  b2A2  b2B2  c2A2  c2B2  d2A2  d2B2 ]
    [ a3A1  a3B1  b3A1  b3B1  c3A1  c3B1  d3A1  d3B1 ]
    [ a3A2  a3B2  b3A2  b3B2  c3A2  c3B2  d3A2  d3B2 ]
    [ a4A1  a4B1  b4A1  b4B1  c4A1  c4B1  d4A1  d4B1 ]
    [ a4A2  a4B2  b4A2  b4B2  c4A2  c4B2  d4A2  d4B2 ]

Zehfuss (1858), in his notation:

        [ a1  b1  c1  d1 ]
    p = [ a2  b2  c2  d2 ]        and        P = [ A1  B1 ]
        [ a3  b3  c3  d3 ]                       [ A2  B2 ]
        [ a4  b4  c4  d4 ]

The determinant of the combined matrix is p² P⁴ in this case; in general, for p of order M and P of order m, it is p^m P^M.

Hensel (1891). Student in Berlin 1880–1884. Maintains that Kronecker presented the Z-theorem in his lectures. K. Hensel (1891). Über die Darstellung der Determinante eines Systems, welches aus zwei anderen componirt ist, Acta Mathematica 14, 317–319.

The 1900s

Muir (1911). Attributes the Z-theorem to Zehfuss. Calls det(B ⊗ C) the Zehfuss determinant. T. Muir (1911). The Theory of Determinants in the Historical Order of Development, Vols. 1–4, Dover, NY.

Rutherford (1933). On the Condition that Two Zehfuss Matrices are Equal:

    B ⊗ C =? F ⊗ G

If B is m_b×n_b, C is m_c×n_c, F is m_f×n_f, and G is m_g×n_g, then (B ⊗ C)_ij = (F ⊗ G)_ij means

    B(floor(i/m_c), floor(j/n_c)) · C(i mod m_c, j mod n_c)
        = F(floor(i/m_g), floor(j/n_g)) · G(i mod m_g, j mod n_g)
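The floor/mod indexing in the slide is exact with 0-based indices: (B ⊗ C)[i, j] = B[i // m_c, j // n_c] · C[i % m_c, j % n_c]. A quick NumPy check (shapes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(2)
mb, nb, mc, nc = 2, 3, 4, 2
B = rng.standard_normal((mb, nb))
C = rng.standard_normal((mc, nc))
K = np.kron(B, C)   # (mb*mc)-by-(nb*nc)

# Entry (i, j) of B ⊗ C factors as a B entry times a C entry.
for i in range(mb * mc):
    for j in range(nb * nc):
        assert np.isclose(K[i, j], B[i // mc, j // nc] * C[i % mc, j % nc])
```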

Van Loan (1934). On the Condition that Two Square Zehfuss Matrices are Equal:

    B ⊗ C =? F ⊗ G

If

    U_b^T B V_b = diag(σ_i(B))        U_c^T C V_c = diag(σ_i(C))

then

    U_b^T F V_b = α diag(σ_i(F))      U_c^T G V_c = (1/α) diag(σ_i(G))

Heightened Profile Beginning in the 60s. Some reasons:
Regular Grids
Tensoring Low Dimension Ideas
Higher Order Statistics
Fast Transforms
Preconditioners
Numerical Multi-linear Algebra

Regular Grids. (M+1)-by-(N+1) discretization of the Laplacian on a rectangle:

    A = I_M ⊗ T_N + T_M ⊗ I_N

          [  2  -1   0   0   0 ]
          [ -1   2  -1   0   0 ]
    T_5 = [  0  -1   2  -1   0 ]
          [  0   0  -1   2  -1 ]
          [  0   0   0  -1   2 ]
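One payoff of this Kronecker structure: the eigenvalues of A = I_M ⊗ T_N + T_M ⊗ I_N are exactly the pairwise sums of the eigenvalues of T_M and T_N. A NumPy sketch (sizes and helper name mine):

```python
import numpy as np

def T(n):
    # Second-difference matrix: 2 on the diagonal, -1 on the off-diagonals.
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

M, N = 4, 5
A = np.kron(np.eye(M), T(N)) + np.kron(T(M), np.eye(N))

# Eigenvalues of A are all sums lambda_i(T_M) + mu_j(T_N).
lam = np.linalg.eigvalsh(T(M))
mu = np.linalg.eigvalsh(T(N))
expected = np.sort(np.add.outer(lam, mu).ravel())
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), expected)
```

This is why Kronecker-structured discretizations admit fast "tensor" eigensolvers and direct solvers: all the spectral work happens on the small factors.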

Tensoring Low Dimension Ideas:

    ∫_a^b f(x) dx  ≈  Σ_{i=1}^n w_i f(x_i)  =  w^T f(x)

    ∫_{a1}^{b1} ∫_{a2}^{b2} ∫_{a3}^{b3} f(x,y,z) dx dy dz
        ≈  Σ_{i=1}^{n_x} Σ_{j=1}^{n_y} Σ_{k=1}^{n_z} w_i^(x) w_j^(y) w_k^(z) f(x_i, y_j, z_k)
        =  (w^(x) ⊗ w^(y) ⊗ w^(z))^T f(x ⊗ y ⊗ z)

Higher Order Statistics:

    E(xx^T)        E(x ⊗ x)        E(x ⊗ x ⊗ x)

Kronecker powers:  ⊗^k A = A ⊗ A ⊗ ··· ⊗ A  (k times)

Fast Transforms. FFT:

    F_16 P_16 = B_16 (I_2 ⊗ B_8)(I_4 ⊗ B_4)(I_8 ⊗ B_2)

          [ 1  0   1    0  ]
    B_4 = [ 0  1   0   ω_4 ]        ω_n = exp(-2πi/n)
          [ 1  0  -1    0  ]
          [ 0  1   0  -ω_4 ]

Haar Wavelet Transform:

    W_2m = [ W_m ⊗ (1, 1)^T    I_m ⊗ (1, -1)^T ]   if m > 1,        W_1 = [ 1 ]
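The Haar recursion can be checked directly: building W_2m from W_m by the rule above yields mutually orthogonal (unnormalized) columns. A NumPy sketch (function name mine; signs of the second block follow the standard Haar construction):

```python
import numpy as np

def haar(m):
    # W_1 = [1];  W_2m = [ W_m kron (1,1)^T | I_m kron (1,-1)^T ]  (unnormalized)
    if m == 1:
        return np.array([[1.0]])
    h = m // 2
    return np.hstack([np.kron(haar(h), np.array([[1.0], [1.0]])),
                      np.kron(np.eye(h), np.array([[1.0], [-1.0]]))])

W = haar(8)
# Columns are mutually orthogonal, so W^T W is diagonal.
G = W.T @ W
assert np.allclose(G, np.diag(np.diag(G)))
```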

Preconditioners. If A ≈ B ⊗ C, then B ⊗ C has potential as a preconditioner. It captures the essence of A. It is easy to solve (B ⊗ C)z = r. Best Example: A band block Toeplitz with banded Toeplitz blocks. B and C chosen to be band Toeplitz. Nagy, Kamm, Kilmer, Perrone, others.
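The claim that (B ⊗ C)z = r is easy to solve follows from the reshape identity: with z = vec(Z) and r = vec(R), the system is C Z Bᵀ = R, so two n-by-n solves replace one n²-by-n² solve. A NumPy sketch (diagonal shifts added only to keep the random systems well conditioned):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
B = rng.standard_normal((n, n)) + n * np.eye(n)
C = rng.standard_normal((n, n)) + n * np.eye(n)
r = rng.standard_normal(n * n)

# (B kron C) z = r  <=>  C Z B^T = R, where z = vec(Z), r = vec(R) (column-major).
R = r.reshape(n, n, order='F')
Y = np.linalg.solve(C, R)        # Y = C^{-1} R
Z = np.linalg.solve(B, Y.T).T    # Z = Y B^{-T}
z = Z.reshape(-1, order='F')

assert np.allclose(np.kron(B, C) @ z, r)
```

With banded Toeplitz B and C, the two small solves become cheaper still, which is the preconditioning scenario the slide has in mind.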

Tensor Decompositions/Approximation. Given A = A(1:n, 1:n, 1:n, 1:n), find orthogonal

    Q = [ q_1 ... q_n ]    U = [ u_1 ... u_n ]    V = [ v_1 ... v_n ]    W = [ w_1 ... w_n ]

and a core tensor σ so that

    vec(A) ≈ Σ_{i,j,k,l=1}^n σ_{ijkl} w_i ⊗ v_j ⊗ u_k ⊗ q_l

Offspring:
1. The Left Kronecker Product
2. The Hadamard Product
3. The Tracy-Singh Product
4. The Khatri-Rao Product
5. The Generalized Kronecker Product
6. The Symmetric Kronecker Product
7. The Bi-Alternate Product

Left Kronecker Product. Definition:

    B ⊗_left [ c11  c12 ]   [ c11 B   c12 B ]
             [ c21  c22 ] = [ c21 B   c22 B ] = C ⊗ B

Fact: If B ∈ R^{m_b×n_b} and C ∈ R^{m_c×n_c}, then

    B ⊗_left C = Π^T_{m_c, m_b m_c} (B ⊗ C) Π_{n_c, n_b n_c}

where the Π's are perfect shuffle permutations.

The Hadamard Product. Definition:

    [ b11  b12 ]        [ c11  c12 ]   [ b11c11  b12c12 ]
    [ b21  b22 ] ⊗_Had  [ c21  c22 ] = [ b21c21  b22c22 ]
    [ b31  b32 ]        [ c31  c32 ]   [ b31c31  b32c32 ]

In MATLAB: B ⊗_Had C = B.*C

The Hadamard Product. Fact: If Ã = B ⊗ C and B, C ∈ R^{m×n}, then

    B ⊗_Had C = Ã(1:(m+1):m², 1:(n+1):n²)

E.g.,

    [ b11  b12 ]     [ c11  c12 ]
    [ b21  b22 ]  ⊗  [ c21  c22 ]  =
    [ b31  b32 ]     [ c31  c32 ]

    [ b11c11  b11c12   b12c11  b12c12 ]
    [ b11c21  b11c22   b12c21  b12c22 ]
    [ b11c31  b11c32   b12c31  b12c32 ]
    [ b21c11  b21c12   b22c11  b22c12 ]
    [ b21c21  b21c22   b22c21  b22c22 ]
    [ b21c31  b21c32   b22c31  b22c32 ]
    [ b31c11  b31c12   b32c11  b32c12 ]
    [ b31c21  b31c22   b32c21  b32c22 ]
    [ b31c31  b31c32   b32c31  b32c32 ]
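The submatrix fact translates into simple strided indexing in 0-based NumPy, since MATLAB's Ã(1:(m+1):m², 1:(n+1):n²) becomes Ã[::m+1, ::n+1]:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 3, 2
B = rng.standard_normal((m, n))
C = rng.standard_normal((m, n))

A_tilde = np.kron(B, C)          # m^2-by-n^2
# Stride through rows by m+1 and columns by n+1 to pick out B[i,j]*C[i,j].
had = A_tilde[::m + 1, ::n + 1]
assert np.allclose(had, B * C)   # elementwise (Hadamard) product
```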

The Tracy-Singh Product. Definition: with blockings

        [ B11  B12 ]
    B = [ B21  B22 ]        C = [ C11  C12 ]
        [ B31  B32 ]            [ C21  C22 ]

B ⊗_TS C is the block matrix (each entry B_ij C_kl denoting B_ij ⊗ C_kl):

    [ B11C11  B11C12   B12C11  B12C12 ]
    [ B11C21  B11C22   B12C21  B12C22 ]
    [ B21C11  B21C12   B22C11  B22C12 ]
    [ B21C21  B21C22   B22C21  B22C22 ]
    [ B31C11  B31C12   B32C11  B32C12 ]
    [ B31C21  B31C22   B32C21  B32C22 ]

The Khatri-Rao Product. Definition: with conformal blockings

        [ B11  B12 ]        [ C11  C12 ]
    B = [ B21  B22 ]    C = [ C21  C22 ]
        [ B31  B32 ]        [ C31  C32 ]

               [ B11 ⊗ C11   B12 ⊗ C12 ]
    B ⊗_KR C = [ B21 ⊗ C21   B22 ⊗ C22 ]
               [ B31 ⊗ C31   B32 ⊗ C32 ]

The Khatri-Rao Product. Fact: B ⊗_KR C is a submatrix of B ⊗_TS C (each entry B_ij C_kl denoting B_ij ⊗ C_kl):

    [ B11C11  B11C12   B12C11  B12C12 ]
    [ B11C21  B11C22   B12C21  B12C22 ]
    [ B21C11  B21C12   B22C11  B22C12 ]
    [ B21C21  B21C22   B22C21  B22C22 ]
    [ B31C11  B31C12   B32C11  B32C12 ]
    [ B31C21  B31C22   B32C21  B32C22 ]

The blocks B_ij ⊗ C_ij of this array make up B ⊗_KR C.

The Generalized Kronecker Product. Regalia and Mitra (1989), Kronecker Products, Unitary Matrices, and Signal Processing Applications, SIAM Review 31, 586–613.

    { B1 }             [ B1 ⊗_left C(1,:) ]
    { B2 }             [ B2 ⊗_left C(2,:) ]
    { B3 } ⊗_gen C  =  [ B3 ⊗_left C(3,:) ]
    { B4 }             [ B4 ⊗_left C(4,:) ]

The Generalized Kronecker Product:

    { B1 B2 }         { C1 }     { { B1 B2 } ⊗_gen C1 }
    { B3 B4 } ⊗_GEN   { C2 }  =  { { B3 B4 } ⊗_gen C2 }

The Symmetric Kronecker Product. The Kronecker product turns matrix equations into vector equations:

    C X B^T = G   ⟺   (B ⊗ C) vec(X) = vec(G)

The symmetric Kronecker product does the same thing for matrix equations with symmetric solutions:

    (1/2)(C X B^T + B X C^T) = G (symmetric)   ⟺   (B ⊗_sym C) svec(X) = svec(G)

where

         [ x11  x12  x13 ]
    svec([ x12  x22  x23 ]) = [ x11  √2·x12  x22  √2·x13  √2·x23  x33 ]^T
         [ x13  x23  x33 ]

Symmetric Kronecker Product Fact: If

        [ 1  0  0  0  0  0 ]
        [ 0  α  0  0  0  0 ]
        [ 0  0  0  α  0  0 ]
        [ 0  α  0  0  0  0 ]
    P = [ 0  0  1  0  0  0 ]        α = 1/√2
        [ 0  0  0  0  α  0 ]
        [ 0  0  0  α  0  0 ]
        [ 0  0  0  0  α  0 ]
        [ 0  0  0  0  0  1 ]

then vec(X) = P svec(X) and B ⊗_sym C = P^T (B ⊗ C) P.
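The fact vec(X) = P svec(X) pins P down completely: each row of P pulls the corresponding vec entry out of svec, scaling off-diagonal entries by α·√2 = 1. A NumPy sketch for the 3-by-3 case (helper names mine):

```python
import numpy as np

a = 1 / np.sqrt(2)

def svec(X):
    # Stack the upper triangle column by column, scaling off-diagonal
    # entries by sqrt(2) so that svec(X)^T svec(Y) = trace(XY).
    n = X.shape[0]
    out = []
    for j in range(n):
        for i in range(j + 1):
            out.append(X[i, j] if i == j else np.sqrt(2) * X[i, j])
    return np.array(out)

def sidx(i, j):
    # Position of entry (i, j) within svec (column-major upper triangle).
    i, j = min(i, j), max(i, j)
    return j * (j + 1) // 2 + i

# Build P so that (column-major) vec(X) = P @ svec(X) for symmetric 3x3 X.
P = np.zeros((9, 6))
for j in range(3):
    for i in range(3):
        P[j * 3 + i, sidx(i, j)] = 1.0 if i == j else a

X = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, -1.0],
              [0.5, -1.0, 4.0]])
assert np.allclose(X.reshape(-1, order='F'), P @ svec(X))
```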

Bi-Alternate Product:

    B ⊙ C = (1/2)(B ⊗ C + C ⊗ B)

W. Govaerts, Numerical Methods for Bifurcations of Dynamical Equilibria, SIAM.

The 2000s: Three Predictions

Big N Will Mean Big d Will Mean KP:

    A = [ a11  a12 ] ⊗ [ b11  b12 ] ⊗ ··· ⊗ [ z11  z12 ]        N = 2^d
        [ a21  a22 ]   [ b21  b22 ]         [ z21  z22 ]

Think Scalar. Think Block. Think Tensor. If for all 1 ≤ m_i ≤ n we have

    B(m1,m2,m3,m4) = Σ_{i1,i2,i3,i4=1}^n W(i1,m1) Y(i2,m2) X(i3,m3) Z(i4,m4) A(i1,i2,i3,i4)

then B = (W ⊗ Y)^T A (X ⊗ Z).

Data-Sparse Approximate Factorizations:

    A ≈ (B_1 ⊗ C_1)(B_2 ⊗ C_2)(B_3 ⊗ C_3)

Det(Log(A)). If A ≈ (B ⊗ C)(D ⊗ E)(F ⊗ G), then

    log(det(A)) ≈ n_c log(det(B)) + n_b log(det(C))
                + n_e log(det(D)) + n_d log(det(E))
                + n_g log(det(F)) + n_f log(det(G))

See Zehfuss (2058).