# Algebraic Complexity Theory and Matrix Multiplication


*[Handout for the Tutorial]*

François Le Gall
Department of Computer Science, Graduate School of Information Science and Technology, The University of Tokyo

## 1 Introduction

Algebraic complexity theory is the study of computation using algebraic models. One of the main achievements of this field has been the introduction of methods to prove lower bounds on the computational complexity, in algebraic models of computation, of concrete problems. Another major achievement has been the development of powerful techniques to construct fast algorithms for computational problems with an algebraic structure.

This tutorial will give an overview of the main algorithmic applications of algebraic complexity theory, focusing on the construction of bilinear algorithms for computational problems from linear algebra. Our presentation will be systematically illustrated by showing how these ideas from algebraic complexity theory have been used to design asymptotically fast (although not necessarily practical) algorithms for matrix multiplication, as summarized in Table 1. We will show in particular how the techniques described in this tutorial can be applied to construct an algorithm that multiplies two $n \times n$ matrices over a field using $O(n^{2.38})$ arithmetic operations, which is the best known upper bound on the asymptotic complexity of square matrix multiplication and was first obtained by Coppersmith and Winograd [3].

Table 1: History of the main improvements on the exponent of square matrix multiplication.

| Upper bound | Year | Reference | Notes |
|---|---|---|---|
| $\omega \le 3$ | | Trivial algorithm | |
| $\omega < 2.81$ | 1969 | Strassen [11] | |
| $\omega < 2.80$ | 1979 | Pan [6] | Not discussed in this tutorial |
| $\omega < 2.78$ | 1979 | Bini et al. [1] | |
| $\omega < 2.55$ | 1981 | Schönhage [9] | |
| $\omega < 2.53$ | 1981 | Pan [7] | Not discussed in this tutorial |
| $\omega < 2.52$ | 1982 | Romani [8] | Not discussed in this tutorial |
| $\omega < 2.50$ | 1982 | Coppersmith and Winograd [2] | Not discussed in this tutorial |
| $\omega < 2.48$ | 1986 | Strassen [12] | Not discussed in this tutorial |
| $\omega < 2.376$ | 1987 | Coppersmith and Winograd [3] | |
| $\omega < 2.374$ | 2010 | Stothers [10] (see also [4]) | |
| $\omega < 2.3729$ | 2012 | Vassilevska Williams [13] | |
| $\omega < 2.3728639$ | 2014 | Le Gall [5] | |

## 2 Basics of Bilinear Complexity Theory

### 2.1 Algebraic complexity and the exponent of matrix multiplication

The computation model considered in algebraic complexity theory corresponds to algebraic circuits, where each gate represents an elementary algebraic operation over a given field $\mathbb{F}$: addition, subtraction, multiplication or division of two terms, and multiplication of one term by a constant in the field. The algebraic complexity of a problem is the size (i.e., the number of gates) of the smallest algebraic circuit needed to solve the problem.

For instance, for the task of computing the matrix product of two $n \times n$ matrices $A = (a_{ij})_{1\le i,j\le n}$ and $B = (b_{ij})_{1\le i,j\le n}$ with entries in $\mathbb{F}$, we assume that the $2n^2$ entries $a_{ij}$ and $b_{ij}$ are given as input, and we want to compute the values

$$c_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}, \qquad (2.1)$$

corresponding to the entry in the $i$-th row and the $j$-th column of the product $AB$, for all $(i,j) \in \{1,\dots,n\} \times \{1,\dots,n\}$. We will call $C(n)$ the algebraic complexity of this problem. The exponent of matrix multiplication is denoted $\omega$ and defined as

$$\omega = \inf\{\alpha \mid C(n) \le n^{\alpha} \text{ for all large enough } n\}.$$

Obviously, $2 \le \omega \le 3$.

### 2.2 Strassen's algorithm

Consider the matrix product of two $2 \times 2$ matrices. It is easy to see from the definition (Eq. (2.1)) that this product can be computed using 8 multiplications and 4 additions, which implies $C(2) \le 12$. In 1969, Strassen [11] showed how to compute this product using only seven multiplications:

1. Compute the following seven terms:
$$m_1 = a_{11}(b_{12}-b_{22}),\quad m_2 = (a_{11}+a_{12})\,b_{22},\quad m_3 = (a_{21}+a_{22})\,b_{11},\quad m_4 = a_{22}(b_{21}-b_{11}),$$
$$m_5 = (a_{11}+a_{22})(b_{11}+b_{22}),\quad m_6 = (a_{12}-a_{22})(b_{21}+b_{22}),\quad m_7 = (a_{11}-a_{21})(b_{11}+b_{12}).$$
2. Output
$$c_{11} = -m_2+m_4+m_5+m_6,\quad c_{12} = m_1+m_2,\quad c_{21} = m_3+m_4,\quad c_{22} = m_1-m_3+m_5-m_7.$$

While the number of multiplications is decreased to seven, the number of additions increases, which does not lead to any improvement for $C(2)$.
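Strassen's seven products can be transcribed directly. The sketch below (plain Python; the function names are mine) checks the two steps of the algorithm against the definition (Eq. (2.1)) on random inputs.

```python
import random

def strassen_2x2(A, B):
    """One level of Strassen: 7 multiplications instead of 8 (Section 2.2)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # Step 1: the seven products.
    m1 = a11 * (b12 - b22)
    m2 = (a11 + a12) * b22
    m3 = (a21 + a22) * b11
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a22) * (b11 + b22)
    m6 = (a12 - a22) * (b21 + b22)
    m7 = (a11 - a21) * (b11 + b12)
    # Step 2: each c_ij is a linear combination of m1..m7.
    return [[-m2 + m4 + m5 + m6, m1 + m2],
            [m3 + m4, m1 - m3 + m5 - m7]]

def naive_2x2(A, B):
    """Definition (2.1): c_ij = sum_k a_ik * b_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

random.seed(0)
for _ in range(100):
    A = [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    B = [[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
    assert strassen_2x2(A, B) == naive_2x2(A, B)
```

Note that `strassen_2x2` uses 18 additions and subtractions instead of 4, which is why this identity alone does not improve $C(2)$.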
The key point is that Strassen's approach can be used recursively to compute the product of two $2^k \times 2^k$ matrices, for any $k \ge 1$. By analyzing the complexity of this recursion, we obtain $C(2^k) = O(7^k)$, which implies

$$\omega \le \log_2(7) = 2.807\ldots,$$

which was the upper bound obtained in [11].

### 2.3 Bilinear algorithms

A bilinear algorithm for matrix multiplication is an algebraic algorithm that proceeds in two steps. First, $t$ products of the form

$$m_1 = (\text{linear combination of the } a_{ij}\text{'s}) \times (\text{linear combination of the } b_{ij}\text{'s})$$
$$\vdots$$
$$m_t = (\text{linear combination of the } a_{ij}\text{'s}) \times (\text{linear combination of the } b_{ij}\text{'s})$$

are computed. Then, each entry $c_{ij}$ is computed by taking a linear combination of $m_1,\dots,m_t$. The integer $t$ is called the bilinear complexity of this algorithm. Strassen's algorithm is an example of a bilinear algorithm that computes the product of two $2 \times 2$ matrices with bilinear complexity $t = 7$. Strassen's recursive approach can actually be generalized to any bilinear algorithm for matrix multiplication, as stated in the following proposition.

**Proposition 1.** Let $m$ be a positive integer. Suppose that there exists a bilinear algorithm that computes the product of two $m \times m$ matrices with bilinear complexity $t$. Then $\omega \le \log_m(t)$.

## 3 First Techniques

We introduce the concepts of tensor and rank. We then use these concepts to obtain Theorem 1 below, which generalizes Proposition 1.

### 3.1 Tensors

Consider three finite-dimensional vector spaces $U$, $V$ and $W$ over the field $\mathbb{F}$. Take a basis $\{x_1,\dots,x_{\dim U}\}$ of $U$, a basis $\{y_1,\dots,y_{\dim V}\}$ of $V$, and a basis $\{z_1,\dots,z_{\dim W}\}$ of $W$. A tensor over $(U,V,W)$ is an element of $U \otimes V \otimes W$ or, equivalently, a formal sum

$$T = \sum_{u=1}^{\dim U}\sum_{v=1}^{\dim V}\sum_{w=1}^{\dim W} d_{uvw}\; x_u \otimes y_v \otimes z_w$$

with coefficient $d_{uvw} \in \mathbb{F}$ for each $(u,v,w) \in \{1,\dots,\dim U\} \times \{1,\dots,\dim V\} \times \{1,\dots,\dim W\}$.

The tensor corresponding to the multiplication of an $m \times n$ matrix by an $n \times p$ matrix is defined as follows. First take $\dim U = mn$, $\dim V = np$ and $\dim W = mp$. It is convenient to change the notation for the bases and denote by $\{a_{ij}\}_{1\le i\le m,\,1\le j\le n}$ the basis of $U$, by $\{b_{ij}\}_{1\le i\le n,\,1\le j\le p}$ the basis of $V$ and by $\{c_{ij}\}_{1\le i\le m,\,1\le j\le p}$ the basis of $W$. In this case, an arbitrary tensor over $(U,V,W)$ is a formal sum

$$T = \sum_{i,i'=1}^{m}\sum_{k,k'=1}^{n}\sum_{j,j'=1}^{p} d_{ii'jj'kk'}\; a_{ik} \otimes b_{k'j} \otimes c_{i'j'}.$$
The tensor corresponding to the multiplication of an $m \times n$ matrix by an $n \times p$ matrix is the tensor with

$$d_{ii'jj'kk'} = \begin{cases} 1 & \text{if } i = i',\ j = j' \text{ and } k = k', \\ 0 & \text{otherwise.} \end{cases}$$

We summarize this definition as follows.

**Definition 1.** The tensor corresponding to the multiplication of an $m \times n$ matrix by an $n \times p$ matrix is

$$\sum_{i=1}^{m}\sum_{j=1}^{p}\sum_{k=1}^{n} a_{ik} \otimes b_{kj} \otimes c_{ij}. \qquad (3.1)$$

This tensor is denoted $\langle m,n,p\rangle$.

One can define in a natural way the tensor product of two tensors. In particular, for matrix multiplication tensors, we obtain the following identity: for any positive integers $m, m', n, n', p, p'$,

$$\langle m,n,p\rangle \otimes \langle m',n',p'\rangle = \langle mm',nn',pp'\rangle. \qquad (3.2)$$

### 3.2 The rank of a tensor

We now define the rank of a tensor.

**Definition 2.** Let $T$ be a tensor over $(U,V,W)$. The rank of $T$, denoted $R(T)$, is the minimal integer $t$ for which $T$ can be written as

$$T = \sum_{s=1}^{t}\left(\sum_{u=1}^{\dim U}\alpha_{su}x_u\right)\otimes\left(\sum_{v=1}^{\dim V}\beta_{sv}y_v\right)\otimes\left(\sum_{w=1}^{\dim W}\gamma_{sw}z_w\right)$$

for some constants $\alpha_{su}, \beta_{sv}, \gamma_{sw}$ in $\mathbb{F}$.

As an illustration of this definition, consider the rank of the matrix multiplication tensor. Obviously, $R(\langle m,n,p\rangle) \le mnp$, from the definition (Eq. (3.1)). In particular, $R(\langle 2,2,2\rangle) \le 8$. Strassen's algorithm corresponds to the equality

$$\begin{aligned}
\langle 2,2,2\rangle ={}& a_{11} \otimes (b_{12}-b_{22}) \otimes (c_{12}+c_{22}) \\
&+ (a_{11}+a_{12}) \otimes b_{22} \otimes (-c_{11}+c_{12}) \\
&+ (a_{21}+a_{22}) \otimes b_{11} \otimes (c_{21}-c_{22}) \\
&+ a_{22} \otimes (b_{21}-b_{11}) \otimes (c_{11}+c_{21}) \\
&+ (a_{11}+a_{22}) \otimes (b_{11}+b_{22}) \otimes (c_{11}+c_{22}) \\
&+ (a_{12}-a_{22}) \otimes (b_{21}+b_{22}) \otimes c_{11} \\
&+ (a_{11}-a_{21}) \otimes (b_{11}+b_{12}) \otimes (-c_{22}),
\end{aligned}$$

which shows that actually $R(\langle 2,2,2\rangle) \le 7$. More generally, it is easy to see that the rank of a matrix multiplication tensor is exactly the bilinear complexity of the best bilinear algorithm that computes this matrix multiplication. This observation directly implies the following reinterpretation of Proposition 1: if $R(\langle m,m,m\rangle) \le t$ then $\omega \le \log_m(t)$. This argument can be generalized as follows.

**Theorem 1.** Let $m$, $n$, $p$ and $t$ be four positive integers. If $R(\langle m,n,p\rangle) \le t$, then $(mnp)^{\omega/3} \le t$.

This theorem can be proved using the following two properties of the rank. First, for any tensors $T$ and $T'$,

$$R(T \otimes T') \le R(T)\,R(T'). \qquad (3.3)$$

Secondly, for any positive integers $m$, $n$ and $p$,

$$R(\langle m,n,p\rangle) = R(\langle m,p,n\rangle) = R(\langle n,m,p\rangle) = R(\langle n,p,m\rangle) = R(\langle p,m,n\rangle) = R(\langle p,n,m\rangle). \qquad (3.4)$$
Note that this second property has interesting consequences. It implies, for instance, that the bilinear complexity of computing the product of an $n \times n$ matrix by an $n \times n^2$ matrix is the same as the bilinear complexity of computing the product of an $n \times n^2$ matrix by an $n^2 \times n$ matrix.
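Definitions 1 and 2 become concrete once a tensor is stored as a three-dimensional array of its coefficients $d_{uvw}$: Strassen's identity is then a sum of seven rank-one arrays. A numpy sketch (the flattening conventions for the bases are my own):

```python
import numpy as np
from itertools import product

def matmul_tensor(m, n, p):
    """<m,n,p> (Definition 1): entry ((i,k),(k,j),(i,j)) equals 1, the rest 0."""
    T = np.zeros((m * n, n * p, m * p), dtype=int)
    for i, k, j in product(range(m), range(n), range(p)):
        T[i * n + k, k * p + j, i * p + j] = 1
    return T

# Strassen's seven rank-one terms (Section 3.2), as coefficient vectors over
# the bases {a11,a12,a21,a22}, {b11,b12,b21,b22}, {c11,c12,c21,c22}.
alpha = np.array([[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1],
                  [1, 0, 0, 1], [0, 1, 0, -1], [1, 0, -1, 0]])
beta  = np.array([[0, 1, 0, -1], [0, 0, 0, 1], [1, 0, 0, 0], [-1, 0, 1, 0],
                  [1, 0, 0, 1], [0, 0, 1, 1], [1, 1, 0, 0]])
gamma = np.array([[0, 1, 0, 1], [-1, 1, 0, 0], [0, 0, 1, -1], [1, 0, 1, 0],
                  [1, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, -1]])

# Sum of 7 rank-one tensors: S[u,v,w] = sum_s alpha[s,u] * beta[s,v] * gamma[s,w].
S = np.einsum('su,sv,sw->uvw', alpha, beta, gamma)
assert (S == matmul_tensor(2, 2, 2)).all()   # hence R(<2,2,2>) <= 7
```

Taking Kronecker powers of the triple `(alpha, beta, gamma)` realizes Eqs. (3.2) and (3.3) concretely, e.g. a 49-product algorithm for $\langle 4,4,4\rangle$.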

## 4 More Advanced Techniques

We introduce the concept of the border rank of a tensor, and use it to obtain Theorem 2 below, which generalizes Theorem 1. We then present Schönhage's asymptotic sum inequality (Theorem 3), which significantly generalizes Theorem 2.

### 4.1 Approximate bilinear algorithms and border rank

Let $\lambda$ be an indeterminate and $\mathbb{F}[\lambda]$ denote the ring of polynomials in $\lambda$ with coefficients in the field $\mathbb{F}$. We now define the concept of border rank of a tensor (compare with Definition 2).

**Definition 3.** Let $T$ be a tensor over $(U,V,W)$. The border rank of $T$, denoted $\underline{R}(T)$, is the minimal integer $t$ for which there exist an integer $c \ge 0$ and a tensor $T'$ such that

$$\lambda^c\, T = \sum_{s=1}^{t}\left(\sum_{u=1}^{\dim U}\alpha_{su}x_u\right)\otimes\left(\sum_{v=1}^{\dim V}\beta_{sv}y_v\right)\otimes\left(\sum_{w=1}^{\dim W}\gamma_{sw}z_w\right) + \lambda^{c+1}\, T'$$

for some constants $\alpha_{su}, \beta_{sv}, \gamma_{sw}$ in $\mathbb{F}[\lambda]$.

Obviously, $\underline{R}(T) \le R(T)$ for any tensor $T$. Moreover, Eqs. (3.3) and (3.4) still hold when the rank is replaced by the border rank.

Let us study an example. Bini et al. [1] considered the tensor

$$T_{\mathrm{Bini}} = \sum_{\substack{1\le i,j,k\le 2 \\ (i,k)\ne(2,2)}} a_{ik} \otimes b_{kj} \otimes c_{ij}
= a_{11}\otimes b_{11}\otimes c_{11} + a_{12}\otimes b_{21}\otimes c_{11} + a_{11}\otimes b_{12}\otimes c_{12} + a_{12}\otimes b_{22}\otimes c_{12} + a_{21}\otimes b_{11}\otimes c_{21} + a_{21}\otimes b_{12}\otimes c_{22},$$

which corresponds to a matrix product of two $2 \times 2$ matrices where one entry in the first matrix is zero (more precisely, $a_{22} = 0$). It can be shown that $R(T_{\mathrm{Bini}}) = 6$. Bini et al. [1] showed that $\underline{R}(T_{\mathrm{Bini}}) \le 5$ by exhibiting the identity $\lambda T_{\mathrm{Bini}} = T' + \lambda^2 T''$, where

$$\begin{aligned}
T' ={}& (a_{12}+\lambda a_{11}) \otimes (b_{12}+\lambda b_{22}) \otimes c_{12}
+ (a_{21}+\lambda a_{11}) \otimes b_{11} \otimes (c_{11}+\lambda c_{21}) \\
&- a_{12} \otimes b_{12} \otimes (c_{11}+c_{12}+\lambda c_{22})
- a_{21} \otimes (b_{11}+b_{12}+\lambda b_{21}) \otimes c_{11} \\
&+ (a_{12}+a_{21}) \otimes (b_{12}+\lambda b_{21}) \otimes (c_{11}+\lambda c_{22})
\end{aligned}$$

and

$$T'' = -a_{11}\otimes b_{22}\otimes c_{12} - a_{11}\otimes b_{11}\otimes c_{21} - (a_{12}+a_{21})\otimes b_{21}\otimes c_{22}.$$

Remember that the rank of a tensor is related to the complexity of bilinear algorithms computing the tensor. Similarly, the border rank is related to the complexity of approximate bilinear algorithms computing the tensor.
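The identity of Bini et al. can be checked mechanically: both sides are polynomials in $\lambda$ and in the basis variables, so exact agreement on many random integer points is a convincing check (and the identity can also be expanded by hand). A sketch, with the five-term decomposition transcribed verbatim:

```python
import random

def check(lam, a11, a12, a21, b11, b12, b21, b22, c11, c12, c21, c22):
    """Evaluate both sides of lam * T_Bini = T' + lam^2 * T'' at one integer point."""
    T_bini = (a11*b11*c11 + a12*b21*c11 + a11*b12*c12 + a12*b22*c12
              + a21*b11*c21 + a21*b12*c22)
    Tp = ((a12 + lam*a11)*(b12 + lam*b22)*c12
          + (a21 + lam*a11)*b11*(c11 + lam*c21)
          - a12*b12*(c11 + c12 + lam*c22)
          - a21*(b11 + b12 + lam*b21)*c11
          + (a12 + a21)*(b12 + lam*b21)*(c11 + lam*c22))
    Tpp = -(a11*b22*c12 + a11*b11*c21 + (a12 + a21)*b21*c22)
    return lam*T_bini == Tp + lam**2 * Tpp

random.seed(1)
assert all(check(*[random.randint(-10, 10) for _ in range(12)])
           for _ in range(1000))
```

Reading off the coefficient of $\lambda$ on both sides recovers $T_{\mathrm{Bini}}$ exactly, which is what "approximate" computation means here: the five products compute $T_{\mathrm{Bini}}$ up to error terms of higher order in $\lambda$.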
Another contribution of [1] was to show that approximate bilinear algorithms can be converted into usual bilinear algorithms without increasing the complexity too much, as stated in the following proposition.

**Proposition 2.** There exists a constant $a$ such that $R(T) \le a\,\underline{R}(T)$ for any tensor $T$.
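Proposition 2 is useful because the constant $a$ washes out in the limit once combined with tensor powering: a bound $R(T^{\otimes N}) \le a\,t^N$ yields $\omega \le \log_m(a^{1/N} t) \to \log_m t$. A small numeric sketch (the values $a = 10$, $t = 1000$ and $m = 12$ are those of the example of [1] treated in the text):

```python
import math

a, t, m = 10.0, 1000, 12          # R(<12,12,12>^N) <= a * 1000^N via Proposition 2
for N in (1, 2, 5, 10, 100):
    # omega <= log_{m^N}(a * t^N) = (log a + N log t) / (N log m)
    bound = (math.log(a) + N * math.log(t)) / (N * math.log(m))
    print(N, round(bound, 4))
print(round(math.log(t, m), 4))   # limit: log_12(1000) = 2.7799 < 2.78
```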

The constant $a$ in Proposition 2 actually depends on the value $c$ in the definition of the border rank. We will nevertheless ignore this technical point in this tutorial.

It is easy to see, by combining two copies of the tensor $T_{\mathrm{Bini}}$, that $\underline{R}(\langle 3,2,2\rangle) \le 10$. From the border rank versions of Eqs. (3.3) and (3.4), this gives $\underline{R}(\langle 12,12,12\rangle) \le 1000$, and thus $R(\langle 12,12,12\rangle) \le 1000\,a$ from Proposition 2. Unfortunately, this inequality (via Theorem 1) does not give any interesting upper bound on $\omega$ unless $a$ is very close to one, which is not the case (indeed, in the example we are now studying the constant $a$ can be taken as 10).

The trick to bypass this difficulty is to consider the tensor $\langle 12,12,12\rangle^{\otimes N} = \langle 12^N,12^N,12^N\rangle$ for a large integer $N$. We have

$$R(\langle 12^N,12^N,12^N\rangle) \le a\,\underline{R}(\langle 12^N,12^N,12^N\rangle) \le a \cdot 1000^N.$$

Now this inequality, via Theorem 1, gives $\omega \le \log_{12}(a^{1/N} \cdot 1000)$, which implies $\omega \le \log_{12}(1000) < 2.78$ by taking the limit when $N$ goes to infinity. This upper bound was obtained by Bini et al. [1].

The above analysis indicates that, when deriving an upper bound on $\omega$ via Theorem 1, one can use the border rank instead of the rank. Indeed, the rank can be replaced by the border rank in Theorem 1, as we now state.

**Theorem 2.** Let $m$, $n$, $p$ and $t$ be four positive integers. If $\underline{R}(\langle m,n,p\rangle) \le t$, then $(mnp)^{\omega/3} \le t$.

### 4.2 Schönhage's asymptotic sum inequality

Schönhage [9] considered the following tensor:

$$T_{\mathrm{Schon}} = \sum_{i,j=1}^{3} a_i \otimes b_j \otimes c_{ij} + \sum_{k=1}^{4} u_k \otimes v_k \otimes w.$$

Observe that the first part is isomorphic to $\langle 3,1,3\rangle$, and the second part is isomorphic to $\langle 1,4,1\rangle$. Since the first part and the second part do not share variables, the sum is actually direct, so we have

$$T_{\mathrm{Schon}} = \langle 3,1,3\rangle \oplus \langle 1,4,1\rangle.$$

Since $R(\langle 3,1,3\rangle) = 9$ and $R(\langle 1,4,1\rangle) = 4$, we obtain immediately $R(T_{\mathrm{Schon}}) \le 13$. Schönhage showed that $\underline{R}(T_{\mathrm{Schon}}) \le 10$, by exhibiting the identity $\lambda^2 T_{\mathrm{Schon}} = T' + \lambda^3 T''$

for

$$\begin{aligned}
T' ={}& (a_1+\lambda u_1)\otimes(b_1+\lambda v_1)\otimes(w+\lambda^2 c_{11})
+ (a_1+\lambda u_2)\otimes(b_2+\lambda v_2)\otimes(w+\lambda^2 c_{12}) \\
&+ (a_2+\lambda u_3)\otimes(b_1+\lambda v_3)\otimes(w+\lambda^2 c_{21})
+ (a_2+\lambda u_4)\otimes(b_2+\lambda v_4)\otimes(w+\lambda^2 c_{22}) \\
&+ (a_3-\lambda u_1-\lambda u_3)\otimes b_1\otimes(w+\lambda^2 c_{31})
+ (a_3-\lambda u_2-\lambda u_4)\otimes b_2\otimes(w+\lambda^2 c_{32}) \\
&+ a_1\otimes(b_3-\lambda v_1-\lambda v_2)\otimes(w+\lambda^2 c_{13})
+ a_2\otimes(b_3-\lambda v_3-\lambda v_4)\otimes(w+\lambda^2 c_{23}) \\
&+ a_3\otimes b_3\otimes(w+\lambda^2 c_{33})
- (a_1+a_2+a_3)\otimes(b_1+b_2+b_3)\otimes w
\end{aligned}$$

and some tensor $T''$. Schönhage also showed the following strong generalization of Theorem 2, known as the asymptotic sum inequality.

**Theorem 3.** Let $k$ and $t$ be two positive integers, and $m_1,\dots,m_k$, $n_1,\dots,n_k$, $p_1,\dots,p_k$ be $3k$ positive integers. If

$$\underline{R}\left(\bigoplus_{i=1}^{k}\langle m_i,n_i,p_i\rangle\right) \le t,$$

then

$$\sum_{i=1}^{k}(m_i n_i p_i)^{\omega/3} \le t.$$

Applying Theorem 3 to $T_{\mathrm{Schon}}$ gives

$$9^{\omega/3} + 4^{\omega/3} \le 10,$$

which implies $\omega \le 2.60$. Using a variant of this tensor, Schönhage [9] ultimately obtained the upper bound $\omega \le 2.55$, again via Theorem 3.

## 5 The Laser Method

We show how the techniques developed so far, combined with an approach sometimes called the laser method, can be applied to obtain the upper bound $\omega < 2.38$. This upper bound was first obtained by Coppersmith and Winograd [3].

### 5.1 The first construction by Coppersmith and Winograd

We start with the first construction from Coppersmith and Winograd's paper [3]. Let $q$ be a positive integer, and consider three vector spaces $U$, $V$ and $W$ of dimension $q+1$ over the field $\mathbb{F}$. Take a basis $\{x_0,\dots,x_q\}$ of $U$, a basis $\{y_0,\dots,y_q\}$ of $V$, and a basis $\{z_0,\dots,z_q\}$ of $W$. Consider the tensor

$$T_{\mathrm{easy}} = T_{\mathrm{easy}}^{011} + T_{\mathrm{easy}}^{101} + T_{\mathrm{easy}}^{110},$$

where

$$T_{\mathrm{easy}}^{011} = \sum_{i=1}^{q} x_0 \otimes y_i \otimes z_i = \langle 1,1,q\rangle,\qquad
T_{\mathrm{easy}}^{101} = \sum_{i=1}^{q} x_i \otimes y_0 \otimes z_i = \langle q,1,1\rangle,\qquad
T_{\mathrm{easy}}^{110} = \sum_{i=1}^{q} x_i \otimes y_i \otimes z_0 = \langle 1,q,1\rangle.$$

**Remark.** The superscripts in $T_{\mathrm{easy}}^{011}$, $T_{\mathrm{easy}}^{101}$ and $T_{\mathrm{easy}}^{110}$ come from the following decomposition of the three vector spaces $U$, $V$ and $W$:

$$U = U_0 \oplus U_1,\quad \text{where } U_0 = \mathrm{span}\{x_0\} \text{ and } U_1 = \mathrm{span}\{x_1,\dots,x_q\},$$
$$V = V_0 \oplus V_1,\quad \text{where } V_0 = \mathrm{span}\{y_0\} \text{ and } V_1 = \mathrm{span}\{y_1,\dots,y_q\},$$
$$W = W_0 \oplus W_1,\quad \text{where } W_0 = \mathrm{span}\{z_0\} \text{ and } W_1 = \mathrm{span}\{z_1,\dots,z_q\}.$$

Observe that $\lambda^3 T_{\mathrm{easy}} = T' + \lambda^4 T''$, where

$$T' = \sum_{i=1}^{q}\lambda\,(x_0+\lambda x_i)\otimes(y_0+\lambda y_i)\otimes(z_0+\lambda z_i)
- \left(x_0+\lambda^2\sum_{i=1}^{q}x_i\right)\otimes\left(y_0+\lambda^2\sum_{i=1}^{q}y_i\right)\otimes\left(z_0+\lambda^2\sum_{i=1}^{q}z_i\right)
+ (1-q\lambda)\,x_0\otimes y_0\otimes z_0$$

and $T''$ is some tensor. Thus $\underline{R}(T_{\mathrm{easy}}) \le q+2$.

While the tensor $T_{\mathrm{easy}}$ is a sum of three parts ($T_{\mathrm{easy}}^{011}$, $T_{\mathrm{easy}}^{101}$ and $T_{\mathrm{easy}}^{110}$), Theorem 3 cannot be used directly since the sum is not direct (for instance, the variable $y_i$, for any $i \in \{1,\dots,q\}$, is shared by $T_{\mathrm{easy}}^{011}$ and $T_{\mathrm{easy}}^{110}$). Coppersmith and Winograd [3] showed how to overcome this difficulty by considering several copies of $T_{\mathrm{easy}}$, and obtained the following result, which is an illustration of the laser method developed by Strassen [12].

**Theorem 4.** For $N$ large enough, the tensor $T_{\mathrm{easy}}^{\otimes N}$ can be converted into a direct sum of $2^{(H(1/3,\,2/3)-o(1))N}$ terms, each isomorphic to

$$\left[T_{\mathrm{easy}}^{011}\right]^{\otimes N/3} \otimes \left[T_{\mathrm{easy}}^{101}\right]^{\otimes N/3} \otimes \left[T_{\mathrm{easy}}^{110}\right]^{\otimes N/3} = \langle q^{N/3}, q^{N/3}, q^{N/3}\rangle.$$

Here $H(1/3,2/3) = -\tfrac{1}{3}\log_2\tfrac{1}{3} - \tfrac{2}{3}\log_2\tfrac{2}{3}$ denotes the entropy (with logarithms taken to base 2) of the corresponding probability distribution.

Theorem 4 implies, via Theorem 3, that

$$2^{(H(1/3,\,2/3)-o(1))N}\, q^{N\omega/3} \le \underline{R}(T_{\mathrm{easy}}^{\otimes N}) \le (q+2)^N.$$

We thus obtain

$$2^{H(1/3,\,2/3)}\, q^{\omega/3} \le q+2,$$

which gives $\omega < 2.41$ for $q = 8$.

### 5.2 The second construction by Coppersmith and Winograd

We now describe the second construction from Coppersmith and Winograd's paper [3]. Let $q$ be a positive integer, and consider three vector spaces $U$, $V$ and $W$ of dimension $q+2$ over $\mathbb{F}$. Take a basis $\{x_0,\dots,x_{q+1}\}$ of $U$, a basis $\{y_0,\dots,y_{q+1}\}$ of $V$, and a basis $\{z_0,\dots,z_{q+1}\}$ of $W$. Consider the tensor

$$T = T^{011} + T^{101} + T^{110} + T^{002} + T^{020} + T^{200},$$

where

$$T^{011} = \sum_{i=1}^{q} x_0 \otimes y_i \otimes z_i = \langle 1,1,q\rangle,\qquad
T^{101} = \sum_{i=1}^{q} x_i \otimes y_0 \otimes z_i = \langle q,1,1\rangle,\qquad
T^{110} = \sum_{i=1}^{q} x_i \otimes y_i \otimes z_0 = \langle 1,q,1\rangle,$$
$$T^{002} = x_0 \otimes y_0 \otimes z_{q+1} = \langle 1,1,1\rangle,\qquad
T^{020} = x_0 \otimes y_{q+1} \otimes z_0 = \langle 1,1,1\rangle,\qquad
T^{200} = x_{q+1} \otimes y_0 \otimes z_0 = \langle 1,1,1\rangle.$$

**Remark.** The superscripts in $T^{011}$, $T^{101}$, $T^{110}$, $T^{200}$, $T^{020}$ and $T^{002}$ come from the following decomposition of the three vector spaces $U$, $V$ and $W$:

$$U = U_0 \oplus U_1 \oplus U_2,\quad \text{where } U_0 = \mathrm{span}\{x_0\},\ U_1 = \mathrm{span}\{x_1,\dots,x_q\} \text{ and } U_2 = \mathrm{span}\{x_{q+1}\},$$
$$V = V_0 \oplus V_1 \oplus V_2,\quad \text{where } V_0 = \mathrm{span}\{y_0\},\ V_1 = \mathrm{span}\{y_1,\dots,y_q\} \text{ and } V_2 = \mathrm{span}\{y_{q+1}\},$$
$$W = W_0 \oplus W_1 \oplus W_2,\quad \text{where } W_0 = \mathrm{span}\{z_0\},\ W_1 = \mathrm{span}\{z_1,\dots,z_q\} \text{ and } W_2 = \mathrm{span}\{z_{q+1}\}.$$

Observe that $\lambda^3 T = T' + \lambda^4 T''$, where

$$T' = \sum_{i=1}^{q}\lambda\,(x_0+\lambda x_i)\otimes(y_0+\lambda y_i)\otimes(z_0+\lambda z_i)
- \left(x_0+\lambda^2\sum_{i=1}^{q}x_i\right)\otimes\left(y_0+\lambda^2\sum_{i=1}^{q}y_i\right)\otimes\left(z_0+\lambda^2\sum_{i=1}^{q}z_i\right)
+ (1-q\lambda)\,(x_0+\lambda^3 x_{q+1})\otimes(y_0+\lambda^3 y_{q+1})\otimes(z_0+\lambda^3 z_{q+1})$$

and $T''$ is some tensor. Thus $\underline{R}(T) \le q+2$.

While the tensor $T$ is a sum of six parts, Theorem 3 cannot directly be used since the sum is not direct. Again, Coppersmith and Winograd [3] showed that this difficulty can be overcome by considering many copies of $T$, and obtained the following result.

**Theorem 5.** For any $0 \le \alpha \le 1/3$ and for $N$ large enough, the tensor $T^{\otimes N}$ can be converted into a direct sum of $2^{(H(2/3-\alpha,\,2\alpha,\,1/3-\alpha)-o(1))N}$ terms, each isomorphic to

$$\left[T^{011}\right]^{\otimes\alpha N} \otimes \left[T^{101}\right]^{\otimes\alpha N} \otimes \left[T^{110}\right]^{\otimes\alpha N} \otimes \left[T^{002}\right]^{\otimes(1/3-\alpha)N} \otimes \left[T^{020}\right]^{\otimes(1/3-\alpha)N} \otimes \left[T^{200}\right]^{\otimes(1/3-\alpha)N} = \langle q^{\alpha N}, q^{\alpha N}, q^{\alpha N}\rangle.$$

Theorem 5 implies, via Theorem 3, that

$$2^{(H(2/3-\alpha,\,2\alpha,\,1/3-\alpha)-o(1))N}\, q^{\alpha N \omega} \le \underline{R}(T^{\otimes N}) \le (q+2)^N.$$

We thus obtain

$$2^{H(2/3-\alpha,\,2\alpha,\,1/3-\alpha)}\, q^{\alpha\omega} \le q+2,$$

which gives $\omega < 2.39$ for $q = 6$ and an optimized choice of $\alpha$ (numerically, $\alpha \approx 0.317$).

### 5.3 Taking powers of the second construction by Coppersmith and Winograd

Consider the tensor $T^{\otimes 2} = T \otimes T$. We can write

$$T^{\otimes 2} = T^{400} + T^{310} + T^{301} + T^{220} + T^{211} + T^{202} + T^{130} + T^{121} + T^{112} + T^{103} + T^{040} + T^{031} + T^{022} + T^{013} + T^{004},$$

where

$$T^{400} = T^{200}\otimes T^{200},\qquad T^{310} = T^{200}\otimes T^{110} + T^{110}\otimes T^{200},$$
$$T^{220} = T^{200}\otimes T^{020} + T^{020}\otimes T^{200} + T^{110}\otimes T^{110},$$
$$T^{211} = T^{200}\otimes T^{011} + T^{011}\otimes T^{200} + T^{110}\otimes T^{101} + T^{101}\otimes T^{110},$$

and the other 11 terms are obtained by permuting the variables (e.g., $T^{040} = T^{020}\otimes T^{020}$). Coppersmith and Winograd [3] showed how to generalize the approach of Section 5.2 to analyze $T^{\otimes 2}$, and obtained the upper bound $\omega < 2.3755$ by solving an optimization problem in 3 variables (remember that in Section 5.2 the optimization problem had only one variable $\alpha$).

Since $T^{\otimes 2}$ gives better upper bounds on $\omega$ than $T$, a natural question was to consider higher powers of $T$, i.e., to study the tensor $T^{\otimes m}$ for $m \ge 3$. Investigating the third power (i.e., $m = 3$) was indeed explicitly mentioned as an open problem in [3]. More than twenty years later, Stothers showed that, while the third power does not seem to lead to any improvement, the fourth power does give an improvement [10]. The cases $m = 8$, $m = 16$ and $m = 32$ have then been analyzed, giving the upper bounds on $\omega$ summarized in Table 2.
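The numeric bounds quoted in Sections 4.2, 5.1 and 5.2 all come from one-dimensional transcendental conditions, so they are easy to reproduce. A sketch (bisection and grid search; the printed values are approximate):

```python
import math

def H(*p):
    """Base-2 entropy of a probability vector (zero entries contribute 0)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def bisect(f, lo, hi, iters=80):
    """Root of the increasing function f on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return hi

# Theorem 3 applied to T_Schon: 9^(w/3) + 4^(w/3) <= 10.
w_schon = bisect(lambda w: 9**(w/3) + 4**(w/3) - 10, 2, 3)

# First construction (q = 8): 2^H(1/3,2/3) * q^(w/3) <= q + 2.
q = 8
w_easy = 3 * math.log((q + 2) / 2**H(1/3, 2/3), q)

# Second construction (q = 6): minimize over 0 < alpha <= 1/3.
q = 6
w_cw = min(math.log((q + 2) / 2**H(2/3 - a, 2*a, 1/3 - a), q) / a
           for a in (i / 10000 for i in range(1, 3334)))

print(round(w_schon, 2), round(w_easy, 2), round(w_cw, 3))
```

The three values come out near 2.59, 2.40 and 2.387 respectively, matching the bounds $\omega \le 2.60$, $\omega < 2.41$ and $\omega < 2.39$ derived in the text.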

Table 2: Upper bounds on $\omega$ obtained by analyzing the $m$-th power of the construction $T$.

| $m$ | Upper bound | Number of variables in the optimization problem | Reference |
|---|---|---|---|
| 1 | $\omega < 2.3872$ | 1 | [3] |
| 2 | $\omega < 2.3755$ | 3 | [3] |
| 4 | $\omega < 2.3730$ | – | [10, 13] |
| 8 | $\omega < 2.3728642$ | – | [5] (this case was also analyzed in Ref. [13]) |
| 16 | $\omega < 2.3728640$ | – | [5] |
| 32 | $\omega < 2.3728639$ | – | [5] |

## References

[1] Bini, D., Capovani, M., Romani, F., and Lotti, G. $O(n^{2.7799})$ complexity for $n \times n$ approximate matrix multiplication. Information Processing Letters 8, 5 (1979).

[2] Coppersmith, D., and Winograd, S. On the asymptotic complexity of matrix multiplication. SIAM Journal on Computing 11, 3 (1982).

[3] Coppersmith, D., and Winograd, S. Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation 9, 3 (1990).

[4] Davie, A. M., and Stothers, A. J. Improved bound for complexity of matrix multiplication. Proceedings of the Royal Society of Edinburgh 143A (2013).

[5] Le Gall, F. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation (2014).

[6] Pan, V. Y. Field extension and triangular aggregating, uniting and canceling for the acceleration of matrix multiplications. In Proceedings of the 20th Annual Symposium on Foundations of Computer Science (1979).

[7] Pan, V. Y. New combinations of methods for the acceleration of matrix multiplication. Computers & Mathematics with Applications (1981).

[8] Romani, F. Some properties of disjoint sums of tensors related to matrix multiplication. SIAM Journal on Computing 11, 2 (1982).

[9] Schönhage, A. Partial and total matrix multiplication. SIAM Journal on Computing 10, 3 (1981).

[10] Stothers, A. On the Complexity of Matrix Multiplication. PhD thesis, University of Edinburgh, 2010.

[11] Strassen, V. Gaussian elimination is not optimal. Numerische Mathematik 13 (1969).

[12] Strassen, V. The asymptotic spectrum of tensors and the exponent of matrix multiplication. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science (1986).

[13] Vassilevska Williams, V. Multiplying matrices faster than Coppersmith-Winograd. In Proceedings of the 44th Symposium on Theory of Computing (2012).


MAT 2 (Badger, Spring 202) LU Factorization Selected Notes September 2, 202 Abstract: We describe the beautiful LU factorization of a square matrix (or how to write Gaussian elimination in terms of matrix

### 5.3 Determinants and Cramer s Rule

290 5.3 Determinants and Cramer s Rule Unique Solution of a 2 2 System The 2 2 system (1) ax + by = e, cx + dy = f, has a unique solution provided = ad bc is nonzero, in which case the solution is given

### Systems of Linear Equations

Systems of Linear Equations Beifang Chen Systems of linear equations Linear systems A linear equation in variables x, x,, x n is an equation of the form a x + a x + + a n x n = b, where a, a,, a n and

### 2.5 Elementary Row Operations and the Determinant

2.5 Elementary Row Operations and the Determinant Recall: Let A be a 2 2 matrtix : A = a b. The determinant of A, denoted by det(a) c d or A, is the number ad bc. So for example if A = 2 4, det(a) = 2(5)

### DETERMINANTS. b 2. x 2

DETERMINANTS 1 Systems of two equations in two unknowns A system of two equations in two unknowns has the form a 11 x 1 + a 12 x 2 = b 1 a 21 x 1 + a 22 x 2 = b 2 This can be written more concisely in

### Sec 4.1 Vector Spaces and Subspaces

Sec 4. Vector Spaces and Subspaces Motivation Let S be the set of all solutions to the differential equation y + y =. Let T be the set of all 2 3 matrices with real entries. These two sets share many common

### MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix.

MATH 304 Linear Algebra Lecture 18: Rank and nullity of a matrix. Nullspace Let A = (a ij ) be an m n matrix. Definition. The nullspace of the matrix A, denoted N(A), is the set of all n-dimensional column

### Linear Algebra Notes for Marsden and Tromba Vector Calculus

Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of

### INTRODUCTORY LINEAR ALGEBRA WITH APPLICATIONS B. KOLMAN, D. R. HILL

SOLUTIONS OF THEORETICAL EXERCISES selected from INTRODUCTORY LINEAR ALGEBRA WITH APPLICATIONS B. KOLMAN, D. R. HILL Eighth Edition, Prentice Hall, 2005. Dr. Grigore CĂLUGĂREANU Department of Mathematics

### Chapter 3. Distribution Problems. 3.1 The idea of a distribution. 3.1.1 The twenty-fold way

Chapter 3 Distribution Problems 3.1 The idea of a distribution Many of the problems we solved in Chapter 1 may be thought of as problems of distributing objects (such as pieces of fruit or ping-pong balls)

### Solving Linear Diophantine Matrix Equations Using the Smith Normal Form (More or Less)

Solving Linear Diophantine Matrix Equations Using the Smith Normal Form (More or Less) Raymond N. Greenwell 1 and Stanley Kertzner 2 1 Department of Mathematics, Hofstra University, Hempstead, NY 11549

### Solving Systems of Linear Equations; Row Reduction

Harvey Mudd College Math Tutorial: Solving Systems of Linear Equations; Row Reduction Systems of linear equations arise in all sorts of applications in many different fields of study The method reviewed

### (x + a) n = x n + a Z n [x]. Proof. If n is prime then the map

22. A quick primality test Prime numbers are one of the most basic objects in mathematics and one of the most basic questions is to decide which numbers are prime (a clearly related problem is to find

### EC9A0: Pre-sessional Advanced Mathematics Course

University of Warwick, EC9A0: Pre-sessional Advanced Mathematics Course Peter J. Hammond & Pablo F. Beker 1 of 55 EC9A0: Pre-sessional Advanced Mathematics Course Slides 1: Matrix Algebra Peter J. Hammond

### Some Polynomial Theorems. John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom.

Some Polynomial Theorems by John Kennedy Mathematics Department Santa Monica College 1900 Pico Blvd. Santa Monica, CA 90405 rkennedy@ix.netcom.com This paper contains a collection of 31 theorems, lemmas,

### Notes on Factoring. MA 206 Kurt Bryan

The General Approach Notes on Factoring MA 26 Kurt Bryan Suppose I hand you n, a 2 digit integer and tell you that n is composite, with smallest prime factor around 5 digits. Finding a nontrivial factor

### Solving Linear Systems, Continued and The Inverse of a Matrix

, Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing

### 8 Square matrices continued: Determinants

8 Square matrices continued: Determinants 8. Introduction Determinants give us important information about square matrices, and, as we ll soon see, are essential for the computation of eigenvalues. You

### The determinant of a skew-symmetric matrix is a square. This can be seen in small cases by direct calculation: 0 a. 12 a. a 13 a 24 a 14 a 23 a 14

4 Symplectic groups In this and the next two sections, we begin the study of the groups preserving reflexive sesquilinear forms or quadratic forms. We begin with the symplectic groups, associated with

### Symbolic Determinants: Calculating the Degree

Symbolic Determinants: Calculating the Degree Technical Report by Brent M. Dingle Texas A&M University Original: May 4 Updated: July 5 Abstract: There are many methods for calculating the determinant of

### Matrix Differentiation

1 Introduction Matrix Differentiation ( and some other stuff ) Randal J. Barnes Department of Civil Engineering, University of Minnesota Minneapolis, Minnesota, USA Throughout this presentation I have

### Using row reduction to calculate the inverse and the determinant of a square matrix

Using row reduction to calculate the inverse and the determinant of a square matrix Notes for MATH 0290 Honors by Prof. Anna Vainchtein 1 Inverse of a square matrix An n n square matrix A is called invertible

### Introduction to Matrix Algebra

Psychology 7291: Multivariate Statistics (Carey) 8/27/98 Matrix Algebra - 1 Introduction to Matrix Algebra Definitions: A matrix is a collection of numbers ordered by rows and columns. It is customary

### Row Echelon Form and Reduced Row Echelon Form

These notes closely follow the presentation of the material given in David C Lay s textbook Linear Algebra and its Applications (3rd edition) These notes are intended primarily for in-class presentation

### 1 Sets and Set Notation.

LINEAR ALGEBRA MATH 27.6 SPRING 23 (COHEN) LECTURE NOTES Sets and Set Notation. Definition (Naive Definition of a Set). A set is any collection of objects, called the elements of that set. We will most

### Practical Guide to the Simplex Method of Linear Programming

Practical Guide to the Simplex Method of Linear Programming Marcel Oliver Revised: April, 0 The basic steps of the simplex algorithm Step : Write the linear programming problem in standard form Linear

### We seek a factorization of a square matrix A into the product of two matrices which yields an

LU Decompositions We seek a factorization of a square matrix A into the product of two matrices which yields an efficient method for solving the system where A is the coefficient matrix, x is our variable

### Matrix Algebra. Some Basic Matrix Laws. Before reading the text or the following notes glance at the following list of basic matrix algebra laws.

Matrix Algebra A. Doerr Before reading the text or the following notes glance at the following list of basic matrix algebra laws. Some Basic Matrix Laws Assume the orders of the matrices are such that

### Iterative Methods for Solving Linear Systems

Chapter 5 Iterative Methods for Solving Linear Systems 5.1 Convergence of Sequences of Vectors and Matrices In Chapter 2 we have discussed some of the main methods for solving systems of linear equations.

### U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory

### On the complexity of some computational problems in the Turing model

On the complexity of some computational problems in the Turing model Claus Diem November 18, 2013 Abstract Algorithms for concrete problems are usually described and analyzed in some random access machine

### MATH 110 Spring 2015 Homework 6 Solutions

MATH 110 Spring 2015 Homework 6 Solutions Section 2.6 2.6.4 Let α denote the standard basis for V = R 3. Let α = {e 1, e 2, e 3 } denote the dual basis of α for V. We would first like to show that β =

### Cofactor Expansion: Cramer s Rule

Cofactor Expansion: Cramer s Rule MATH 322, Linear Algebra I J. Robert Buchanan Department of Mathematics Spring 2015 Introduction Today we will focus on developing: an efficient method for calculating

### MODULAR ARITHMETIC. a smallest member. It is equivalent to the Principle of Mathematical Induction.

MODULAR ARITHMETIC 1 Working With Integers The usual arithmetic operations of addition, subtraction and multiplication can be performed on integers, and the result is always another integer Division, on

### GROUP ALGEBRAS. ANDREI YAFAEV

GROUP ALGEBRAS. ANDREI YAFAEV We will associate a certain algebra to a finite group and prove that it is semisimple. Then we will apply Wedderburn s theory to its study. Definition 0.1. Let G be a finite

### Lecture 6. Inverse of Matrix

Lecture 6 Inverse of Matrix Recall that any linear system can be written as a matrix equation In one dimension case, ie, A is 1 1, then can be easily solved as A x b Ax b x b A 1 A b A 1 b provided that

### Some Lecture Notes and In-Class Examples for Pre-Calculus:

Some Lecture Notes and In-Class Examples for Pre-Calculus: Section.7 Definition of a Quadratic Inequality A quadratic inequality is any inequality that can be put in one of the forms ax + bx + c < 0 ax

### LEARNING OBJECTIVES FOR THIS CHAPTER

CHAPTER 2 American mathematician Paul Halmos (1916 2006), who in 1942 published the first modern linear algebra book. The title of Halmos s book was the same as the title of this chapter. Finite-Dimensional

### 4.6 Null Space, Column Space, Row Space

NULL SPACE, COLUMN SPACE, ROW SPACE Null Space, Column Space, Row Space In applications of linear algebra, subspaces of R n typically arise in one of two situations: ) as the set of solutions of a linear

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### Generalized Inverse Computation Based on an Orthogonal Decomposition Methodology.

International Conference on Mathematical and Statistical Modeling in Honor of Enrique Castillo. June 28-30, 2006 Generalized Inverse Computation Based on an Orthogonal Decomposition Methodology. Patricia

### So let us begin our quest to find the holy grail of real analysis.

1 Section 5.2 The Complete Ordered Field: Purpose of Section We present an axiomatic description of the real numbers as a complete ordered field. The axioms which describe the arithmetic of the real numbers

### SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison

SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89 by Joseph Collison Copyright 2000 by Joseph Collison All rights reserved Reproduction or translation of any part of this work beyond that permitted by Sections

### = 2 + 1 2 2 = 3 4, Now assume that P (k) is true for some fixed k 2. This means that

Instructions. Answer each of the questions on your own paper, and be sure to show your work so that partial credit can be adequately assessed. Credit will not be given for answers (even correct ones) without

### 1 Eigenvalues and Eigenvectors

Math 20 Chapter 5 Eigenvalues and Eigenvectors Eigenvalues and Eigenvectors. Definition: A scalar λ is called an eigenvalue of the n n matrix A is there is a nontrivial solution x of Ax = λx. Such an x

### Orthogonal Diagonalization of Symmetric Matrices

MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding

### by the matrix A results in a vector which is a reflection of the given

Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that

### Solving Systems of Linear Equations Using Matrices

Solving Systems of Linear Equations Using Matrices What is a Matrix? A matrix is a compact grid or array of numbers. It can be created from a system of equations and used to solve the system of equations.

### MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set.

MATH 304 Linear Algebra Lecture 9: Subspaces of vector spaces (continued). Span. Spanning set. Vector space A vector space is a set V equipped with two operations, addition V V (x,y) x + y V and scalar