
A tutorial on: Iterative methods for Sparse Matrix Problems
Yousef Saad, University of Minnesota, Computer Science and Engineering
CRM Montreal, April 30, 2008

Outline

Part 1
- Sparse matrices and sparsity
- Basic iterative techniques
- Projection methods
- Krylov subspace methods

Part 2
- Preconditioned iterations
- Preconditioning techniques

Part 3
- Parallel implementations
- Multigrid methods

Part 4
- Eigenvalue problems
- Applications

INTRODUCTION TO SPARSE MATRICES

Typical Problem: Physical Model → Nonlinear PDEs → Discretization → Linearization (Newton) → Sequence of Sparse Linear Systems Ax = b

Introduction: Linear System Solvers

Problem considered: linear systems Ax = b.

One can view the problem from somewhat different angles:
- A discretized problem coming from a PDE
- An algebraic system of equations [ignore origin]
- Sometimes a system of equations where A is not explicitly available

[Diagram: a map of solution methods for Ax = b (e.g., Δu = f + boundary conditions). Two main families: Direct sparse solvers and Iterative methods. Among the iterative methods, preconditioned Krylov methods are general purpose, while fast Poisson solvers and multigrid methods are specialized.]

Introduction: Linear System Solvers

Much of the recent work on solvers has focussed on:
(1) Parallel implementation: scalable performance
(2) Improving robustness, developing more general preconditioners

A few observations

- Problems are getting harder for sparse direct methods (more 3-D models, much bigger problems, ...).
- Problems are also getting difficult for iterative methods. Cause: more complex models, further away from the Poisson equation.
- Researchers in iterative methods are borrowing techniques from direct methods: preconditioners.
- The reverse is also happening: direct methods are being adapted for use as preconditioners.

What are sparse matrices?

Common definition: "... matrices that allow special techniques to take advantage of the large number of zero elements and the structure."

A few applications of sparse matrices: structural engineering, reservoir simulation, electrical networks, optimization problems, ...

Goals: much less storage and work than dense computations.

Observation: A^{-1} is usually dense, but L and U in the LU factorization may be reasonably sparse (if a good technique is used).

Nonzero patterns of a few sparse matrices:
- ARC130: unsymmetric matrix from a laser problem (A. R. Curtis, Oct. 1974)
- SHERMAN5: fully implicit black oil simulator, 16 by 23 by 3 grid, 3 unknowns

- PORES3: unsymmetric matrix from PORES
- BP_1000: unsymmetric basis from the LP problem BP

Two types of matrices: structured (e.g. SHERMAN5) and unstructured (e.g. BP_1000).

Main goal of sparse matrix techniques: to perform standard matrix computations economically, i.e., without storing the zeros of the matrix.

Example: adding two square dense matrices of size n requires O(n^2) operations. Adding two sparse matrices A and B requires O(nnz(A) + nnz(B)) operations, where nnz(X) = number of nonzero elements of a matrix X.

For typical finite element / finite difference matrices, the number of nonzero elements is O(n).
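To make the storage and operation-count argument concrete, here is a small sketch (not part of the original slides) using SciPy's sparse CSR format; the matrix sizes and the tridiagonal/bidiagonal test patterns are arbitrary choices for illustration only.

    # Sparse vs. dense storage and addition cost, using SciPy's CSR format.
    import numpy as np
    import scipy.sparse as sp

    n = 10_000
    # A tridiagonal matrix: nnz is O(n), not O(n^2).
    A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
    B = sp.diags([1.0, 1.0], offsets=[0, 1], shape=(n, n), format="csr")

    C = A + B      # touches only the stored nonzeros: O(nnz(A) + nnz(B)) work
    print(A.nnz, B.nnz, C.nnz)   # ~3n, ~2n, ~3n stored entries
    print(n * n)                 # number of entries a dense copy would store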

Graph Representations of Sparse Matrices

Graph theory is a fundamental tool in sparse matrix techniques.

The graph G = (V, E) of an n x n matrix A is defined by:
- Vertices V = {1, 2, ..., n}
- Edges E = {(i, j) | a_{ij} ≠ 0}

The graph is undirected if the matrix has a symmetric structure: a_{ij} ≠ 0 iff a_{ji} ≠ 0.
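As an illustration of this definition (not from the slides), the following sketch builds the adjacency structure of a sparse pattern; the function name adjacency_graph and the 4 x 4 arrowhead example are made up for this note.

    # Adjacency graph of a sparse matrix pattern, as a dict of neighbor lists.
    import scipy.sparse as sp

    def adjacency_graph(A):
        """Return {i: [j, ...]} with an edge (i, j) for every nonzero a_ij, i != j."""
        A = sp.csr_matrix(A)
        rows, cols = A.nonzero()
        adj = {i: [] for i in range(A.shape[0])}
        for i, j in zip(rows, cols):
            if i != j:                  # diagonal entries are usually ignored
                adj[i].append(j)
        return adj

    # 4 x 4 arrowhead pattern: row/column 0 full, plus the diagonal.
    A = sp.csr_matrix([[1, 1, 1, 1],
                       [1, 1, 0, 0],
                       [1, 0, 1, 0],
                       [1, 0, 0, 1]])
    print(adjacency_graph(A))   # vertex 0 is connected to 1, 2, 3 (a star)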


Example: the adjacency graph of a small matrix A (pattern shown on the slide).

Example: For any matrix A, what is the graph of A^2? [Interpret it in terms of paths in the graph of A: structurally, there is an edge (i, j) in the graph of A^2 exactly when there is a path of length two from i to j in the graph of A, barring numerical cancellation.]

Direct versus iterative methods

Background. Two types of methods:
- Direct methods: based on sparse Gaussian elimination, sparse Cholesky, ...
- Iterative methods: compute a sequence of iterates which converge to the solution; preconditioned Krylov methods, ...

Remark: These two classes of methods have always been in competition. 40 years ago, solving a system with n = 10,000 was a challenge. Now you can solve it in < 1 sec. on a laptop.

- Sparse direct methods have made huge gains in efficiency. As a result they are very competitive for 2-D problems.
- 3-D problems lead to more challenging systems [inherent to the underlying graph].
- Problems with many unknowns per grid point are similar to 3-D problems.

Remarks:
- There are no robust black-box iterative solvers; robustness often conflicts with efficiency.
- However, the situation has improved in the last decade.
- The line between direct and iterative solvers is blurring.

Direct Sparse Matrix Techniques

Principle of sparse matrix techniques: store only the nonzero elements of A; try to minimize computations and (perhaps more importantly) storage.

Difficulty in Gaussian elimination: fill-in.

Trivial example (an arrowhead matrix, nonzeros marked by +):

        + + + + + +
        + +
        +   +
    A = +     +
        +       +
        +         +

Eliminating the unknowns in the order 1, 2, ..., N fills in the entire matrix.

Reorder the equations and unknowns in the order N, N-1, ..., 1: A stays sparse during Gaussian elimination, i.e., there is no fill-in.

        +         +
          +       +
            +     +
    A =       +   +
                + +
        + + + + + +

Finding the best ordering to minimize fill-in is NP-complete, so a number of heuristics have been developed. Among the best known:
- Minimum degree ordering (Tinney Scheme 2)
- Nested Dissection ordering
- Approximate Minimum Degree (AMD)
- ...
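To see the effect of the two orderings numerically, here is a hedged sketch (not from the slides) that factors an arrowhead matrix with SuperLU through SciPy, once in the natural order and once reversed; the size n = 200 and the diagonal value 10 are arbitrary choices made only so that no pivoting is needed.

    # Fill-in for an arrowhead matrix: natural ordering vs. reversed ordering.
    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    n = 200
    rows = np.concatenate([np.arange(n), np.zeros(n - 1, dtype=int), np.arange(1, n)])
    cols = np.concatenate([np.arange(n), np.arange(1, n), np.zeros(n - 1, dtype=int)])
    vals = np.concatenate([10.0 * np.ones(n), np.ones(n - 1), np.ones(n - 1)])
    A = sp.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()   # arrowhead

    p = np.arange(n)[::-1]              # reverse ordering N, N-1, ..., 1
    B = A[p, :][:, p].tocsc()           # symmetrically permuted matrix

    for name, M in [("natural", A), ("reversed", B)]:
        lu = splu(M, permc_spec="NATURAL", diag_pivot_thresh=0.0)
        print(name, "nnz(L) + nnz(U) =", lu.L.nnz + lu.U.nnz)
    # The natural ordering fills in almost completely (about n^2 nonzeros);
    # the reversed ordering keeps nnz(L) + nnz(U) at O(n).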

Reorderings and graphs

Let π = {i_1, ..., i_n} be a permutation.

- A_{π,*} = {a_{π(i),j}}_{i,j=1,...,n} = the matrix A with its i-th row replaced by row number π(i).
- A_{*,π} = the matrix A with its j-th column replaced by column π(j).

Define P_π = I_{π,*} (a permutation matrix). Then:
(1) Each row (column) of P_π consists of zeros and exactly one 1.
(2) A_{π,*} = P_π A
(3) P_π P_π^T = I
(4) A_{*,π} = A P_π^T

Consider now the symmetric permutation A' = A_{π,π} = P_π A P_π^T.

Entry (i, j) of A' is exactly the entry in position (π(i), π(j)) of A, i.e., a'_{ij} = a_{π(i),π(j)}. Hence

    (i, j) ∈ E_{A'}   iff   (π(i), π(j)) ∈ E_A

General picture: an edge between i and j in the graph of A' corresponds to an edge between π(i) and π(j) in the graph of A; the graph is simply relabeled.
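A quick NumPy check (not from the slides) of the identity (P_π A P_π^T)_{ij} = a_{π(i),π(j)}; the array names and sizes are arbitrary.

    # A symmetric permutation P A P^T is just a relabeling: A[p, :][:, p].
    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    A = rng.random((n, n))
    p = rng.permutation(n)          # the permutation pi as an index array

    P = np.eye(n)[p, :]             # P_pi = I_{pi,*}: identity with permuted rows
    Aprime = P @ A @ P.T            # symmetric permutation

    assert np.allclose(Aprime, A[p, :][:, p])   # entry (i, j) of A' is a_{p(i), p(j)}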

Example: a 9 x 9 arrow matrix and its adjacency graph (a star: node 1 in the center, connected to nodes 2 through 9).

Graph and matrix after permuting the nodes in reverse order: the center of the star is now the node labeled 9, and the full row and column of the matrix move to the last position.

Cuthill-McKee & reverse Cuthill-McKee

A class of reordering techniques proceeds by levels in the graph. It is related to Breadth First Search (BFS) traversal in graph theory.

The idea of BFS is to visit the nodes by levels; Level 0 is the level of the starting node. Start with a node, visit its neighbors, then the (unmarked) neighbors of its neighbors, etc.

Example: BFS from node A on a small undirected graph with vertices A, B, C, D, E, F, G, H, I, K [graph sketch omitted]:
    Level 0: A
    Level 1: B, C
    Level 2: E, D, H
    Level 3: I, K, F, G

Implementation using levels

Algorithm BFS(G, v) by level sets:
    Initialize S = {v}, seen = 1; Mark v;
    While seen < n Do
        S_new = {};
        For each node u in S Do
            For each unmarked w in adj(u) Do
                Add w to S_new; Mark w; seen++;
            EndDo
        EndDo
        S := S_new
    EndWhile
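A runnable Python version of this level-set BFS (an illustrative sketch, not code from the tutorial; the small example graph is made up):

    def bfs_levels(adj, start):
        """Return the list of level sets of a BFS from `start`.
        `adj` maps each node to an iterable of its neighbors."""
        marked = {start}
        levels = [[start]]
        S = [start]
        while S:
            S_new = []
            for u in S:
                for w in adj[u]:
                    if w not in marked:
                        marked.add(w)
                        S_new.append(w)
            if S_new:
                levels.append(S_new)
            S = S_new
        return levels

    # Small example: a path 0-1-2-3 plus an edge 1-3.
    adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
    print(bfs_levels(adj, 0))   # [[0], [1], [2, 3]]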

A few properties of Breadth-First Search

If G is a connected undirected graph then each vertex will be visited once and each edge will be inspected at least once. Therefore, for a connected undirected graph, the cost of BFS is O(|V| + |E|).

Distance = level number: for each node v we have

    min dist(s, v) = level_number(v) = depth_T(v)

Several reordering algorithms are based on variants of Breadth-First Search.

Cuthill-McKee ordering

The algorithm proceeds by levels. It is the same as BFS except that, within each level, the nodes are ordered by increasing degree.

Example (small graph with nodes A through G; figure omitted):

    Level   Nodes      Deg.       Order
    0       A          2          A
    1       B, C       4, 3       C, B
    2       D, E, F    3, 4, 2    F, D, E
    3       G          2          G

ALGORITHM 1: Cuthill-McKee ordering
0.  Find an initial node v for the traversal
1.  Initialize S = {v}, seen = 0; Mark v;
2.  While seen < n Do
3.      S_new = {};
4.      For each node v in S, going from lowest to highest degree, Do:
5.          π(++seen) = v;
6.          For each unmarked w in adj(v) Do
7.              Add w to S_new;
8.              Mark w;
9.          EndDo
10.     EndDo
11.     S := S_new
12. EndWhile
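A runnable Python sketch of the same algorithm (illustrative, not from the tutorial; the example graph is made up and is not the one shown on the previous slide):

    def cuthill_mckee(adj, start):
        """Return the nodes in their new order (order[k] is the node numbered k+1).
        `adj` maps each node to a list of its neighbors."""
        degree = {u: len(adj[u]) for u in adj}
        marked = {start}
        order = []
        S = [start]
        while S:
            S_new = []
            for v in sorted(S, key=lambda u: degree[u]):   # lowest degree first
                order.append(v)
                for w in adj[v]:
                    if w not in marked:
                        marked.add(w)
                        S_new.append(w)
            S = S_new
        return order

    adj = {
        "A": ["B", "C"],
        "B": ["A", "C", "D", "E"],
        "C": ["A", "B", "F"],
        "D": ["B", "E"],
        "E": ["B", "D", "F"],
        "F": ["C", "E"],
    }
    print(cuthill_mckee(adj, "A"))   # ['A', 'C', 'B', 'F', 'D', 'E']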

Reverse Cuthill-McKee ordering

The Cuthill-McKee ordering has a tendency to create small arrow matrices (going the wrong way). [Spy plots: the original matrix and the matrix under the CM ordering, both with nz = 377.]

Idea: take the reverse ordering. [Spy plot: the same matrix under the RCM ordering, nz = 377.]

This is the Reverse Cuthill-McKee ordering (RCM).
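For practical use, SciPy ships an RCM implementation in scipy.sparse.csgraph; the following usage sketch (not from the slides) permutes a random symmetric pattern, with the size and density chosen arbitrarily.

    # Reduce the bandwidth of a random symmetric sparse pattern with RCM.
    import scipy.sparse as sp
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    A = sp.random(200, 200, density=0.02, format="csr", random_state=0)
    A = (A + A.T).tocsr()                          # make the pattern symmetric

    p = reverse_cuthill_mckee(A, symmetric_mode=True)   # the RCM permutation
    B = A[p, :][:, p]                                    # symmetrically permuted matrix
    # B has the same nonzeros as A, clustered much closer to the diagonal.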

Nested Dissection ordering

The idea is divide and conquer: recursively divide the graph in two using a separator, numbering the separator nodes last. [Figure: a small graph divided in two by a separator.]

Nested dissection for a small mesh. [Figure panels: Original grid, First dissection, Second dissection, Third dissection.]

Nested dissection: cost for a regular mesh

In 2-D consider an n x n problem, N = n^2. In 3-D consider an n x n x n problem, N = n^3.

                     2-D            3-D
    space (fill)     O(N log N)     O(N^{4/3})
    time (flops)     O(N^{3/2})     O(N^2)

There is a significant difference in complexity between 2-D and 3-D.

Ordering techniques for direct methods in practice

- Nested dissection (+ variants) is preferred for parallel processing.
- Good implementations of the minimum degree algorithm work well in practice. Currently AMD and AMF are the best known implementations/variants.
- The best practical reordering algorithms usually combine nested dissection and minimum degree.

BASIC RELAXATION METHODS

Basic Relaxation Schemes

Relaxation schemes are based on the decomposition

    A = D - E - F

where D = diag(A), -E is the strict lower part of A, and -F is its strict upper part.

Gauss-Seidel iteration for solving Ax = b:

    (D - E) x^{(k+1)} = F x^{(k)} + b

Idea: correct the j-th component of the current approximate solution, for j = 1, 2, ..., n, so as to zero the j-th component of the residual.

One can also define a backward Gauss-Seidel iteration:

    (D - F) x^{(k+1)} = E x^{(k)} + b

and a symmetric Gauss-Seidel iteration: a forward sweep followed by a backward sweep.

Over-relaxation is based on the splitting

    ω A = (D - ω E) - (ω F + (1 - ω) D)

which gives successive over-relaxation (SOR):

    (D - ω E) x^{(k+1)} = [ω F + (1 - ω) D] x^{(k)} + ω b
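A small NumPy/SciPy sketch (not from the slides) of SOR sweeps written directly from the formula above; omega = 1 reduces to Gauss-Seidel, and the 3 x 3 test system is arbitrary.

    import numpy as np
    from scipy.linalg import solve_triangular

    def sor(A, b, omega=1.0, x0=None, maxiter=200, tol=1e-10):
        D = np.diag(np.diag(A))
        E = -np.tril(A, k=-1)          # A = D - E - F
        F = -np.triu(A, k=1)
        x = np.zeros_like(b) if x0 is None else x0.copy()
        for _ in range(maxiter):
            rhs = (omega * F + (1.0 - omega) * D) @ x + omega * b
            x = solve_triangular(D - omega * E, rhs, lower=True)
            if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
                break
        return x

    # Example: a small diagonally dominant system.
    A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
    b = np.array([1.0, 2.0, 3.0])
    print(sor(A, b, omega=1.3))        # close to np.linalg.solve(A, b)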

Iteration matrices

Jacobi, Gauss-Seidel, SOR, and SSOR iterations are all of the form x^{(k+1)} = M x^{(k)} + f, with

    M_Jac     = D^{-1}(E + F) = I - D^{-1} A
    M_GS(A)   = (D - E)^{-1} F = I - (D - E)^{-1} A
    M_SOR(A)  = (D - ωE)^{-1} (ωF + (1 - ω)D) = I - (ω^{-1} D - E)^{-1} A
    M_SSOR(A) = I - (2ω^{-1} - 1)(ω^{-1} D - F)^{-1} D (ω^{-1} D - E)^{-1} A
              = I - ω(2 - ω)(D - ωF)^{-1} D (D - ωE)^{-1} A

General convergence result

Consider the iteration x^{(k+1)} = G x^{(k)} + f.

(1) Assume that ρ(G) < 1. Then I - G is non-singular, the iteration has a unique fixed point, and it converges to that fixed point for any f and x^{(0)}.
(2) Conversely, if the iteration converges for any f and x^{(0)}, then ρ(G) < 1.

Example: Richardson's iteration

    x^{(k+1)} = x^{(k)} + α (b - A x^{(k)})

Assume Λ(A) ⊂ R. When does the iteration converge?

- Jacobi and Gauss-Seidel converge for diagonally dominant A.
- SOR converges for 0 < ω < 2 for SPD matrices.
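To check the convergence condition numerically, here is a short sketch (not from the slides) that forms M_Jac and M_GS for a small diagonally dominant matrix and verifies ρ(M) < 1; the test matrix is arbitrary.

    import numpy as np

    A = np.array([[4.0, -1.0, -1.0],
                  [-1.0, 4.0, -1.0],
                  [-1.0, -1.0, 4.0]])
    D = np.diag(np.diag(A))
    E = -np.tril(A, k=-1)              # A = D - E - F
    F = -np.triu(A, k=1)

    M_jac = np.linalg.solve(D, E + F)          # = I - D^{-1} A
    M_gs  = np.linalg.solve(D - E, F)          # = I - (D - E)^{-1} A

    rho = lambda M: max(abs(np.linalg.eigvals(M)))
    print(rho(M_jac), rho(M_gs))   # both < 1, so both iterations converge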

An observation. Introduction to Preconditioning

The iteration x^{(k+1)} = M x^{(k)} + f is attempting to solve (I - M)x = f. Since M is of the form M = I - P^{-1} A, this system can be rewritten as

    P^{-1} A x = P^{-1} b

where, for SSOR,

    P_SSOR = (D - ωE) D^{-1} (D - ωF)

is referred to as the SSOR preconditioning matrix.

In other words: Relaxation Scheme = Preconditioned Fixed-Point Iteration.
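As a sketch (not from the slides) of how P_SSOR^{-1} r can be applied with two triangular solves, following the factored form above; the helper name ssor_apply and the test data are made up for this note.

    import numpy as np
    from scipy.linalg import solve_triangular

    def ssor_apply(A, r, omega=1.0):
        """Return z = P_SSOR^{-1} r with P = (D - wE) D^{-1} (D - wF)."""
        D = np.diag(A)
        L = np.tril(A, k=-1)           # = -E
        U = np.triu(A, k=1)            # = -F
        # (D - wE) y = r  : lower triangular solve
        y = solve_triangular(np.diag(D) + omega * L, r, lower=True)
        # (D - wF) z = D y : upper triangular solve
        z = solve_triangular(np.diag(D) + omega * U, D * y, lower=False)
        return z

    A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
    r = np.array([1.0, 0.0, -1.0])
    print(ssor_apply(A, r, omega=1.2))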