Matrix-Chain Multiplication




Matrix-Chain Multiplication

Let A be an n by m matrix and let B be an m by p matrix; then C = AB is an n by p matrix. C = AB can be computed in O(nmp) time using traditional matrix multiplication.

Suppose I want to compute A1 A2 A3 A4. Matrix multiplication is associative, so I can do the multiplications in several different orders.

Example:
A1 is a 10 by 100 matrix
A2 is a 100 by 5 matrix
A3 is a 5 by 50 matrix
A4 is a 50 by 1 matrix
A1 A2 A3 A4 is a 10 by 1 matrix
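The O(nmp) cost of traditional multiplication comes from the triple loop: each of the n·p entries of C is an inner product of length m. A minimal sketch in Python (the function name matmul and the list-of-lists representation are my own choices, not from the slides):

```python
def matmul(A, B):
    """Traditional matrix product: A is n x m, B is m x p.
    Performs exactly n * m * p scalar multiplications, hence O(nmp) time."""
    n, m = len(A), len(A[0])
    assert len(B) == m, "inner dimensions must agree"
    p = len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C
```

For example, matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]) gives [[19, 22], [43, 50]].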

Example

A1 is a 10 by 100 matrix, A2 is a 100 by 5 matrix, A3 is a 5 by 50 matrix, A4 is a 50 by 1 matrix, and A1 A2 A3 A4 is a 10 by 1 matrix.

5 different orderings = 5 different parenthesizations:
(A1 (A2 (A3 A4)))
((A1 A2) (A3 A4))
(((A1 A2) A3) A4)
((A1 (A2 A3)) A4)
(A1 ((A2 A3) A4))

Each parenthesization requires a different number of multiplications.

Notation: let Aij denote the product Ai Ai+1 ... Aj.

Example

A1 is a 10 by 100 matrix, A2 is a 100 by 5 matrix, A3 is a 5 by 50 matrix, A4 is a 50 by 1 matrix, and A1 A2 A3 A4 is a 10 by 1 matrix.

(A1 (A2 (A3 A4))):
A34 = A3 A4, 250 mults, result is 5 by 1
A24 = A2 A34, 500 mults, result is 100 by 1
A14 = A1 A24, 1000 mults, result is 10 by 1
Total is 1750.

((A1 A2) (A3 A4)):
A12 = A1 A2, 5000 mults, result is 10 by 5
A34 = A3 A4, 250 mults, result is 5 by 1
A14 = A12 A34, 50 mults, result is 10 by 1
Total is 5300.

(((A1 A2) A3) A4):
A12 = A1 A2, 5000 mults, result is 10 by 5
A13 = A12 A3, 2500 mults, result is 10 by 50
A14 = A13 A4, 500 mults, result is 10 by 1
Total is 8000.

Example

A1 is a 10 by 100 matrix, A2 is a 100 by 5 matrix, A3 is a 5 by 50 matrix, A4 is a 50 by 1 matrix, and A1 A2 A3 A4 is a 10 by 1 matrix.

((A1 (A2 A3)) A4):
A23 = A2 A3, 25000 mults, result is 100 by 50
A13 = A1 A23, 50000 mults, result is 10 by 50
A14 = A13 A4, 500 mults, result is 10 by 1
Total is 75500.

(A1 ((A2 A3) A4)):
A23 = A2 A3, 25000 mults, result is 100 by 50
A24 = A23 A4, 5000 mults, result is 100 by 1
A14 = A1 A24, 1000 mults, result is 10 by 1
Total is 31000.

Conclusion: the order of operations makes a huge difference. How do we compute the minimum?
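The five totals above can be checked mechanically by enumerating every parenthesization and summing the cost of each binary multiplication. A small sketch, using the dimension-vector convention p = [10, 100, 5, 50, 1] where Ai is a p[i-1] by p[i] matrix (the function name chain_costs is my own):

```python
def chain_costs(p, i, j):
    """Total multiplication counts over every parenthesization of Ai..Aj,
    where Ak is a p[k-1] x p[k] matrix (chain is 1-indexed)."""
    if i == j:
        return [0]  # a single matrix needs no multiplications
    costs = []
    for k in range(i, j):  # split as (Ai..Ak)(Ak+1..Aj)
        for left in chain_costs(p, i, k):
            for right in chain_costs(p, k + 1, j):
                costs.append(left + right + p[i - 1] * p[k] * p[j])
    return costs

print(sorted(chain_costs([10, 100, 5, 50, 1], 1, 4)))
# [1750, 5300, 8000, 31000, 75500] -- the five totals worked out above
```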


One approach: parenthesization

A product of matrices is fully parenthesized if it is either a single matrix, or a product of two fully parenthesized matrix products, surrounded by parentheses.

Each parenthesization defines a set of n-1 matrix multiplications. We just need to pick the parenthesization that corresponds to the best ordering.

How many parenthesizations are there? Let P(n) be the number of ways to parenthesize n matrices. Then

P(n) = 1                                             if n = 1
P(n) = sum over k = 1 to n-1 of P(k) P(n-k)          if n >= 2

This recurrence is related to the Catalan numbers, and solves to P(n) = Omega(4^n / n^(3/2)).

Conclusion: trying all possible parenthesizations is a bad idea.
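The recurrence for P(n) is easy to evaluate directly, and the first few values make the exponential growth concrete (P(n) is the Catalan number C_{n-1}); a sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(n):
    """Number of full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # Split the chain into a left part of k matrices and a right part of n-k.
    return sum(P(k) * P(n - k) for k in range(1, n))

print([P(n) for n in range(1, 8)])  # [1, 1, 2, 5, 14, 42, 132]
```

Note P(4) = 5, matching the five orderings of A1 A2 A3 A4 listed earlier.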

Use dynamic programming

1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution bottom-up.
4. Construct an optimal solution from the computed information.

Structure of an optimal solution: if the outermost parenthesization is ((A1 A2 ... Ai)(Ai+1 ... An)), then the optimal solution consists of solving A1i and Ai+1,n optimally and then combining the solutions.

Proof

Structure of an optimal solution: if the outermost parenthesization is ((A1 A2 ... Ai)(Ai+1 ... An)), then the optimal solution consists of solving A1i and Ai+1,n optimally and then combining the solutions.

Proof: Consider an optimal algorithm that does not solve A1i optimally. Let x be the number of multiplications it does to solve A1i, y the number of multiplications it does to solve Ai+1,n, and z the number of multiplications it does in the final step. The total number of multiplications is therefore x + y + z. But since it does not solve A1i optimally, there is a way to solve A1i using x' < x multiplications. If we used that better method for A1i instead of our current one, we would do x' + y + z < x + y + z multiplications and therefore have a better algorithm, contradicting the assumption that our algorithm is optimal.

Meta-proof (a sketch of the pattern, not itself a complete proof): our problem consists of subproblems. Assume we did not solve the subproblems optimally; then we could just replace those sub-solutions with optimal subproblem solutions and obtain a better overall solution, a contradiction.


Recursive solution

In the enumeration of the P(n) = Omega(4^n / n^(3/2)) parenthesizations, how many unique subproblems are there?

Answer: a subproblem is of the form Aij with 1 <= i <= j <= n, so there are O(n^2) subproblems!

Notation: let Ai be a p_{i-1} by p_i matrix, and let m[i, j] be the minimum cost of computing Aij. If the final multiplication for Aij is Aij = Aik Ak+1,j, then

m[i, j] = m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j.

We don't know the best k a priori, so we take the minimum over all choices:

m[i, j] = 0                                                              if i = j
m[i, j] = min over i <= k < j of { m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j }   if i < j

Direct recursion on this does not work! We must use the fact that there are only O(n^2) different calls. What is the order?
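One way to exploit the O(n^2) distinct calls is to memoize the direct recursion, which also sidesteps the ordering question: each m(i, j) is computed on demand and cached. A top-down sketch of the recurrence (the function name matrix_chain_memo is my own):

```python
from functools import lru_cache

def matrix_chain_memo(p):
    """Top-down (memoized) evaluation of the recurrence;
    Ai is a p[i-1] x p[i] matrix, chain is 1-indexed."""
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:
            return 0  # a single matrix costs nothing
        # Minimize over the final split point k.
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)

print(matrix_chain_memo([10, 100, 5, 50, 1]))  # 1750
```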

The final code

Matrix-Chain-Order(p)
  n ← length[p] − 1
  for i ← 1 to n
      do m[i, i] ← 0
  for l ← 2 to n                     ▷ l is the chain length
      do for i ← 1 to n − l + 1
             do j ← i + l − 1
                m[i, j] ← ∞
                for k ← i to j − 1
                    do q ← m[i, k] + m[k + 1, j] + p_{i−1} p_k p_j
                       if q < m[i, j]
                           then m[i, j] ← q
                                s[i, j] ← k
  return m and s
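A direct translation of the pseudocode into Python, plus a small helper that reads the optimal parenthesization back out of the split table s (the helper parens is my addition, corresponding to step 4 of the dynamic-programming recipe):

```python
def matrix_chain_order(p):
    """Bottom-up DP from the pseudocode; Ai is p[i-1] x p[i], tables 1-indexed."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):              # l is the chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = float("inf")
            for k in range(i, j):          # try every final split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

def parens(s, i, j):
    """Recover the optimal parenthesization from the split table s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parens(s, i, k)}{parens(s, k + 1, j)})"

m, s = matrix_chain_order([10, 100, 5, 50, 1])
print(m[1][4], parens(s, 1, 4))  # 1750 (A1(A2(A3A4)))
```

On the example dimensions this recovers both the minimum cost, 1750, and the ordering (A1(A2(A3A4))) that achieved it in the worked examples.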