A matrix-free preconditioner for sparse symmetric positive definite systems and least-squares problems

A matrix-free preconditioner for sparse symmetric positive definite systems and least-squares problems
Stefania Bellavia, Dipartimento di Ingegneria Industriale, Università degli Studi di Firenze
Joint work with Jacek Gondzio and Benedetta Morini
Work carried out within the INdAM-GNCS 2012 Project "Metodi e software numerici per il precondizionamento di sistemi lineari nella risoluzione di PDE e di problemi di ottimizzazione"
Algebra Lineare Numerica e sue Applicazioni, Rome, 29-31 Jan 2013
Stefania Bellavia, Università di Firenze 1 / 28

Introduction: The Problem
Consider systems of the form Hx = b, with H ∈ R^{m×m} SPD.
Special interest in the case H = AΘA^T, with A ∈ R^{m×n} sparse and Θ ∈ R^{n×n} diagonal SPD.
Such systems arise in at least two prominent applications in the area of optimization: Newton-like methods for weighted least-squares problems, and interior point methods.

Introduction
We assume that H is too large and/or too difficult to be formed and solved directly.
We solve the system with an iterative Conjugate Gradient (CG)-like approach.
We are interested in preconditioning H with a reliable algorithm that does not require forming the whole matrix H at once (matrix-free).
We are also interested in solving sequences of linear systems arising in optimization methods.

Introduction: Preconditioning H
Incomplete Cholesky (IC) factorizations are matrix-free in the sense that the columns of H can be computed one at a time and then discarded. They are breakdown-free when H is an H-matrix, but IC factorizations relying on drop tolerances to reduce fill-in have unpredictable memory requirements.
Alternative approaches with predictable memory requirements depend on the entries of H [Jones, Plassmann, ACM Trans. Math. Software 1995], [Lin, Moré, SISC 1999]. E.g., let n_k = nnz(tril(H(:,k), -1)) and retain the n_k + p largest elements in the strict lower triangular part of the k-th column of the factor, for some fixed p > 0. This has high storage requirements if H is dense.

Introduction: Preconditioning H
Approximate inverse preconditioners form factorized sparse approximations of H^{-1}.
The Stabilized Approximate Inverse preconditioner (SAINV) of [Benzi, Cullum, Tuma, SISC 2000] is based on a modified Gram-Schmidt process. It is matrix-free, i.e., it employs H multiplicatively and may work entirely with A^T. It preserves sparsity in the factors by dropping small elements. In exact arithmetic it is applicable to any SPD matrix without breakdowns.
The underlying assumption is that most entries of H^{-1} are small in magnitude.

Introduction: Properties of our preconditioner
Limited memory: memory bounded by O(m) rather than O(nnz(H)).
Matrix-free: only the action of H on a vector is needed.
Only a small number k ≪ m of general matrix-vector products is required.
The diagonal of H, or an approximation of it, is needed: in many practical applications we expect to be able to compute or estimate the diagonal of H at low cost.
PARTIAL CHOLESKY + DEFLATED CG

LMP Preconditioner: The preconditioner
Partial Cholesky factorization limited to a small number k of columns of H, plus a diagonal approximation of the Schur complement [Gondzio, COAP 2011].
1. Choose k ≪ m and consider the formal partition of H:
H = [H_{11}, H_{21}^T; H_{21}, H_{22}],  H_{11} ∈ R^{k×k}, H_{21} ∈ R^{(m-k)×k}, H_{22} ∈ R^{(m-k)×(m-k)}.
2. Form the first k columns of H, i.e., H_{11} and H_{21}.

The preconditioner (cont'd)
3. Compute the Cholesky factor [L_{11}; L_{21}] of H limited to [H_{11}; H_{21}]:
compute the LDL^T factorization H_{11} = L_{11} Q_{11} L_{11}^T (discard H_{11});
solve L_{11} Q_{11} L_{21}^T = H_{21}^T for L_{21}, i.e., L_{21} = H_{21} L_{11}^{-T} Q_{11}^{-1} (discard H_{21}).
It follows that
H = [L_{11}, 0; L_{21}, I_{m-k}] [Q_{11}, 0; 0, S] [L_{11}^T, L_{21}^T; 0, I_{m-k}],
where S = H_{22} - H_{21} H_{11}^{-1} H_{21}^T is the Schur complement of H_{11} in H.

The preconditioner (cont'd)
4. Set Q_{22} = diag(S) = diag(H_{22}) - diag(L_{21} Q_{11} L_{21}^T) and
P = [L_{11}, 0; L_{21}, I_{m-k}] [Q_{11}, 0; 0, Q_{22}] [L_{11}^T, L_{21}^T; 0, I_{m-k}] = L Q L^T.
The algorithm for constructing P has some good properties: it cannot break down in exact arithmetic, and it has predictable memory requirements, nnz(L) = O(km).
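The construction above can be sketched in a few lines of dense NumPy (an illustration only, not the authors' Matlab implementation: a memory-bounded variant would keep L_21 sparse and discard each column of H after use; function and variable names are ours):

```python
import numpy as np
from scipy.linalg import solve_triangular

def build_lmp(matvec, diag_h, k):
    """Partial Cholesky preconditioner P = L Q L^T built from the first
    k columns of H plus the diagonal of the Schur complement."""
    m = diag_h.shape[0]
    e = np.eye(m)
    # step 2: first k columns of H, one matrix-vector product each
    Hk = np.column_stack([matvec(e[:, j]) for j in range(k)])
    H11, H21 = Hk[:k, :], Hk[k:, :]
    # step 3: LDL^T of H11 via Cholesky, H11 = C C^T = L11 Q11 L11^T
    C = np.linalg.cholesky(H11)
    d = np.diag(C)
    q11 = d ** 2                                   # Q11 = diag(d)^2
    L11 = C / d                                    # unit lower triangular
    # L21 = H21 L11^{-T} Q11^{-1}
    L21 = solve_triangular(L11, H21.T, lower=True).T / q11
    # step 4: Q22 = diag(H22) - diag(L21 Q11 L21^T)
    q22 = diag_h[k:] - np.sum(L21 ** 2 * q11, axis=1)
    return L11, L21, q11, q22
```

By construction, P agrees with H in its first k columns and on its whole diagonal.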

The preconditioner: Storage and computational cost
The complete diagonal of H is required. If it is not available and H = AΘA^T, then
(H)_{ii} = ||Θ^{1/2} A^T e_i||_2^2,  i = 1, ..., m.
Storage: one (sparse) vector A^T e_i at a time, plus a vector for the diagonal of H.
The first k columns of H are computed and stored: H e_i, i = 1, ..., k. The additional cost of this step is k products of H with a vector. The products H e_i are cheap if H (or A) is sparse, and they are expected to be cheaper than the products Hv required by PCG, where the vectors v involved are typically dense.
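Since (H)_ii is a weighted sum of the squared entries of the i-th row of A, the diagonal can be accumulated one sparse row at a time. A SciPy sketch (CSR storage assumed; the helper name is ours):

```python
import numpy as np
import scipy.sparse as sp

def diag_of_H(A, theta):
    """diag(A Theta A^T) without forming H: (H)_ii = sum_j theta_j * A_ij^2."""
    A = sp.csr_matrix(A)
    d = np.empty(A.shape[0])
    for i in range(A.shape[0]):
        s, e = A.indptr[i], A.indptr[i + 1]   # nonzeros of row i, i.e. of A^T e_i
        d[i] = np.dot(theta[A.indices[s:e]], A.data[s:e] ** 2)
    return d
```

Only one sparse row and the output vector are held in memory at any time.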

The preconditioner: Factorized form of P^{-1}
From P = [L_{11}, 0; L_{21}, I_{m-k}] [Q_{11}, 0; 0, Q_{22}] [L_{11}^T, L_{21}^T; 0, I_{m-k}] it follows that
P^{-1} = [L_{11}^{-T}, -L_{11}^{-T} L_{21}^T; 0, I_{m-k}] [Q_{11}^{-1}, 0; 0, Q_{22}^{-1}] [L_{11}^{-1}, 0; -L_{21} L_{11}^{-1}, I_{m-k}],
i.e., a factorized sparse approximation of H^{-1}.
Letting R^T = [L_{11}, 0; L_{21}, I_{m-k}] [Q_{11}^{1/2}, 0; 0, Q_{22}^{1/2}], we have P = R^T R, and P^{-1}H is similar to the block diagonal matrix [I_k, 0; 0, Q_{22}^{-1} S].
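In this factorized form, applying P^{-1} to a vector takes two backsolves with L_11, one product each with L_21 and L_21^T, and a diagonal scaling. A dense sketch (names are ours; the real code keeps L_11 and L_21 sparse):

```python
import numpy as np
from scipy.linalg import solve_triangular

def apply_pinv(L11, L21, q, r):
    """z = P^{-1} r for P = L Q L^T, L = [[L11, 0], [L21, I]], Q = diag(q):
    solve L y = r, scale by Q^{-1}, then solve L^T z = Q^{-1} y."""
    k = L11.shape[0]
    y1 = solve_triangular(L11, r[:k], lower=True)        # L11 y1 = r1
    y2 = r[k:] - L21 @ y1                                # identity block
    b1, b2 = y1 / q[:k], y2 / q[k:]                      # scale by Q^{-1}
    z2 = b2                                              # identity block of L^T
    z1 = solve_triangular(L11.T, b1 - L21.T @ z2, lower=False)
    return np.concatenate([z1, z2])
```

The cost per application is independent of how P was obtained, so the same routine serves every system in a sequence with a frozen preconditioner.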

The preconditioner: Spectral analysis of P^{-1}H
k eigenvalues of P^{-1}H are equal to 1. The other eigenvalues are eigenvalues of Q_{22}^{-1} S, and
λ(Q_{22}^{-1} S) ≥ λ_min(S)/λ_max(Q_{22}) ≥ λ_min(H)/λ_max(diag(S)),
λ(Q_{22}^{-1} S) ≤ λ_max(S)/λ_min(Q_{22}) ≤ λ_max(H_{22})/λ_min(diag(S)).

The preconditioner: Reordering of H
A greedy heuristic technique acts on the largest eigenvalues of H. Since H is SPD,
λ_max(H) ≤ tr(H) = tr(H_{11}) + tr(H_{22}).
If Q_{22} = I, then P^{-1}H is similar to [I_k, 0; 0, S], and
λ_max(P^{-1}H) ≤ tr([I_k, 0; 0, S]) = k + tr(S).
Permuting the rows and columns of H so that H_{11} contains the k largest elements of diag(H) would imply k + tr(S) ≪ tr(H), and hence a large reduction in λ_max(P^{-1}H) with respect to λ_max(H).
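The greedy permutation itself needs only diag(H) (a sketch; the helper name is ours):

```python
import numpy as np

def greedy_perm(diag_h, k):
    """Symmetric permutation moving the k largest entries of diag(H)
    into the H11 block; the remaining indices keep their natural order."""
    idx = np.argsort(-np.asarray(diag_h), kind="stable")
    return np.concatenate([idx[:k], np.sort(idx[k:])])

# usage: p = greedy_perm(np.diag(H), k); H_perm = H[np.ix_(p, p)]
```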

Deflated CG: Handling small eigenvalues
Applying the greedy technique requires no extra storage. In most cases, the greedy reordering takes care of the largest eigenvalues of H, and κ_2(R^{-T} H R^{-1}) is reduced considerably with respect to κ_2(H). On the other hand, the smallest eigenvalues of H are slightly modified or moved towards the origin.
When the convergence of the CG (or a CG-like) method is hampered by a small number of eigenvalues of P^{-1}H close to zero, the Preconditioned Deflated-CG (or CG-like) algorithm can be useful [Saad, Yeung, Erhel, Guyomarc'h, SISC 2000].

Deflated CG: Preconditioned Deflated-CG
Let the eigenvalues of P^{-1}H be labeled in increasing order: λ_1(P^{-1}H) ≤ ... ≤ λ_m(P^{-1}H).
Ideal case: inject l exact eigenvectors of P^{-1}H associated with λ_1(P^{-1}H), ..., λ_l(P^{-1}H) into the Krylov subspace. Then
||x - x_j||_H ≤ 2 ((√µ - 1)/(√µ + 1))^j ||x - x_0||_H,  µ = λ_m(P^{-1}H)/λ_{l+1}(P^{-1}H).
Therefore, the convergence of the CG method is improved if a few eigenvalues are close to the origin and well separated from the others. If the l eigenvectors of P^{-1}H are only numerically approximated, one can expect µ ≈ λ_m(P^{-1}H)/λ_{l+1}(P^{-1}H).

Deflated CG: Preconditioned Deflated-CG (cont'd)
Apply Deflated-CG to the split-preconditioned system
R^{-T} H R^{-1} y = R^{-T} b,  x = R^{-1} y,
using a few eigenvectors associated with the smallest eigenvalues of R^{-T} H R^{-1}. Symmetric Lanczos processes for sparse symmetric eigenvalue problems require products of R^{-T} H R^{-1} with a vector; each product has the cost of one preconditioned CG iteration.
To amortize the cost of approximating eigenvectors, Preconditioned Deflated-CG is suitable for solving systems with multiple right-hand sides and sequences of slowly varying linear systems.
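A minimal dense sketch of the deflation mechanics in the spirit of [Saad, Yeung, Erhel, Guyomarc'h, SISC 2000]: residuals are kept orthogonal to the deflation basis W and search directions A-orthogonal to it. This illustrates the idea under our own naming and omits the split preconditioning, so it is not the implementation used in the talk:

```python
import numpy as np

def deflated_cg(A, b, W, tol=1e-10, maxit=1000):
    """Deflated CG for SPD A: W^T r stays zero and every search
    direction p satisfies W^T A p = 0."""
    AW = A @ W
    WtAW = W.T @ AW
    def a_proj(v):                                # W (W^T A W)^{-1} W^T A v
        return W @ np.linalg.solve(WtAW, AW.T @ v)
    x = W @ np.linalg.solve(WtAW, W.T @ b)        # start with W^T r0 = 0
    r = b - A @ x
    p = r - a_proj(r)
    rho = r @ r
    for it in range(1, maxit + 1):
        Ap = A @ p
        alpha = rho / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rho_new = r @ r
        if np.sqrt(rho_new) <= tol * np.linalg.norm(b):
            return x, it
        p = r + (rho_new / rho) * p - a_proj(r)   # A-orthogonalize against W
        rho = rho_new
    return x, maxit
```

With W spanning eigenvectors of the smallest eigenvalues, the iteration behaves as if those eigenvalues were removed from the spectrum, which is exactly the effect captured by the bound with µ = λ_m/λ_{l+1}.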

Numerical results: Numerical experiments
We implemented the preconditioner in Matlab, ϵ_m ≈ 2·10^{-16}.
Initial guess for PCG: x_0 = (0, ..., 0)^T. Stopping criterion: ||H x_j - b||_2 ≤ 10^{-6} ||b||_2. A failure is declared after 1000 iterations.
H = AA^T, with 35 matrices A from the University of Florida Sparse Matrix Collection, groups LPnetlib and Meszaros (Linear Programming problems):
1090 ≤ m ≤ 105127,  2.20·10^{-5} ≤ dens(A) ≤ 6.50·10^{-3},  5.51·10^{-5} ≤ dens(H) ≤ 2.51·10^{-1}.

Numerical results: Numerical experiments (cont'd)
Experiments with the SAINV preconditioner: H^{-1} ≈ Z D^{-1} Z^T, where Z is unit upper triangular and D is diagonal. Code from the Sparselab package developed by M. Tuma.
First drop tolerance tested: 10^{-1}. In case of failure, the tolerance is progressively reduced by a factor of 10.

Numerical results: Cost comparison

Table: Cost of the construction and application of LMP and SAINV

LMP, construction:
  m sparse-to-sparse products Θ^{1/2}(A^T e_i)
  k sparse-to-sparse products AΘ(A^T e_i)
  m-k backsolves with L_{11}
  m-k scalar products in R^k
LMP, application:
  2 backsolves with L_{11}
  1 mat-vec product with Q^{-1}
  m-k scalar products in R^k
  k scalar products in R^{m-k}
SAINV, construction:
  m sparse-to-sparse products AΘ(A^T v)
SAINV, application:
  2 mat-vec products with Z
  1 mat-vec product with D^{-1}

Numerical results: Comparison between LMP(50) and LMP(100)
LMP(100) outperforms LMP(50) in terms of PCG iterations.
[Figure: performance profile π_s(τ) of the execution time for LMP(50) and LMP(100).]

Numerical results: Comparison between LMP(50) and SAINV
SAINV solved 21 systems. Performance profiles on the tests successfully solved by all preconditioners.
[Figure: performance profiles π_s(τ) of the CG iterations and of the execution time for LMP(50) and SAINV.]

Numerical results: Preconditioner density
[Figure: density of H and of the factors L and Z; density of the factors L and L^{-1}.]

Numerical results: Experiments with Preconditioned Deflated-CG
A few eigenvectors of R^{-T} H R^{-1} are computed by the Matlab package PROPACK [R.M. Larsen, 1998]. The symmetric Lanczos algorithm with partial reorthogonalization is applied. A loose accuracy for the convergence criterion, 10^{-1}, is fixed, along with a specified maximum dimension, DIM_L, of the Lanczos basis allowed. The number of matrix-vector products is at most DIM_L.
In the Preconditioned Deflated-CG we injected the estimated eigenvectors. If convergence was not achieved, the vectors associated with eigenvalues smaller than a prescribed tolerance are selected.

Numerical results: Solution of a single system

Test name    λ_max(H)  λ_min(H)  λ_max(P^{-1}H)  λ_min(P^{-1}H)  Prec Defl-CG IT_L  Prec CG IT_L
lp d2q06c    1.27e6    6.37e-4   6.48e0          3.39e-5         278                338
lp pilot     1.10e5    1.55e-2   1.22e1          2.58e-4         160                264
lp pilot87   1.01e6    1.52e-2   2.22e1          2.01e-4         250                294
lp stocfor2  1.60e6    1.98e-3   7.71e0          1.17e-6         97                 144
lpi bgindy   8.97e3    4.07e-2   5.55e0          8.29e-3         38                 53
ge           1.89e8    4.90e-5   1.21e1          8.78e-7         41                 58
nl           8.26e4    7.00e-3   7.30e0          1.61e-4         388                441
scrs8-2c     1.85e3    3.49e-5   5.39e1          8.32e-5         102                140

Preconditioner formed with k = 50. Number of small eigenvalues estimated: 5. Maximum dimension of the Lanczos basis: 50.

Numerical results: Sequences of normal equations from least-squares problems
Sequences of normal equations arise in the solution of constrained and unconstrained least-squares problems. If the coefficient matrices vary slowly, a preconditioner-freeze strategy for LMP coupled with Deflated-CGLS can be used.
We solved Nonnegative Linear Least-Squares problems
min_{x ≥ 0} (1/2) ||Bx - d||_2^2,  B full rank,
by the interior Newton-like method of [Bellavia, Macconi, Morini, NLAA 2006]. The trial step at the j-th nonlinear iteration solves
min_{p ∈ R^n} || [B S_j; W_j] p + [B x_j - d; 0] ||_2.

Numerical results: LMP in NNLS
The matrix of the normal equations is H_j = A_j A_j^T, with A_j = (S_j B^T  W_j), j = 0, 1, ..., where S_j and W_j are matrices with entries in (0, 1] and [0, 1], respectively.
We solve the sequence of linear systems with a frozen preconditioner. For a seed matrix, say H_0, we form the LMP preconditioner and compute l approximate eigenvectors associated with the smallest eigenvalues. We reuse the preconditioner and the eigenvectors throughout the nonlinear iterations until the preconditioner deteriorates, i.e., the limit of CGLS iterations is reached. Then the LMP preconditioner and the l eigenvectors are refreshed for the current matrix.

Numerical results: LMP(100), 5 small eigenvalues estimated, Lanczos basis dimension 50

             Prec Defl-CGLS      Prec CGLS           Savings in
Test         IT_NL(R)  IT_L      IT_NL(R)  IT_L      mat-vec products
lp pilot87   27(1)     3639      30(1)     6023      36%
lp ken 11    14        512       19        720       12%
lp ken 13    14        485       19        881       31%
lp ken 18    24        1937      18        2449      14%
lp pds 10    11        607       11        834       15%
lp pds 20    13        1629      13        1877      9%
lp truss     13        512       14        951       34%
deter3       23        1441      28        1910      16%
deter5       13        844       26        1939      51%
deter7       18        1242      21        2050      33%
fxm2-16      33(3)     8686      47(2)     10771     17%
ge           35(3)     8425      34(3)     10021     13%
nl           28(5)     7376      32(6)     10891     30%
scrs8-2c     17        163       *

IT_NL: nonlinear iterations (number of preconditioner refreshes in parentheses); IT_L: cumulative linear iterations.

Numerical results: Final comments
Work in progress: we are using the LMP preconditioner in the solution of linear systems arising in electrostatic and electromagnetic problems, in cooperation with A. Tamburrino and S. Ventre, University of Cassino. The matrix H is SPD and can be decomposed as H = H_far + H_near, where:
- H_near is available and includes the diagonal of H;
- H_far is not available, but the action of H_far on a vector can be computed (approximately).
S. Bellavia, J. Gondzio, B. Morini, A matrix-free preconditioner for sparse symmetric positive definite systems and least-squares problems, SISC, in press.
J. Gondzio, Interior point methods 25 years later, EJOR (2012).