# It's Not A Disease: The Parallel Solver Packages MUMPS, PaStiX & SuperLU

## Transcription

1 It's Not A Disease: The Parallel Solver Packages MUMPS, PaStiX & SuperLU. A. Windisch, PhD Seminar: High Performance Computing II, G. Haase, March 29th, 2012, Graz

2 Outline

1. MUMPS
2. PaStiX
3. SuperLU
4. Summary and Outlook

3 MUMPS MUltifrontal Massively Parallel sparse direct Solver

4 Some historical facts (and others) > 1999

5 Getting MUMPS is easy (it's PD). Link. Debian, Ubuntu: \$ sudo apt-get install libmumps-4.9.2

6 Using MUMPS

Interfaces to MUMPS (FORTRAN90):

1. C
2. MATLAB
3. Octave
4. Scilab

7 MUMPS: Relevant literature

- P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L'Excellent, SIAM Journal on Matrix Analysis and Applications 23 (2001)
- P. R. Amestoy, A. Guermouche, J.-Y. L'Excellent and S. Pralet, Parallel Computing 32 (2006)
- I. S. Duff and J. K. Reid, ACM Transactions on Mathematical Software 9 (1983)
- I. S. Duff, A. M. Erisman and J. K. Reid, Oxford University Press, London (1986)
- J. W. H. Liu, SIAM Review 34 (1992)

8 So, what is MUMPS, and how does it work?

- Solves $Ax = b$
- Direct solver based on the multifrontal approach
- $A$ is a square sparse matrix:
  1. unsymmetric
  2. symmetric positive definite
  3. general symmetric
- Factorization: $A = LU$; for symmetric $A$: $A = LDL^T$
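The symmetric factorization $A = LDL^T$ named on this slide can be illustrated on a small dense matrix. The following is only a sketch under simplifying assumptions: it is dense, it does no pivoting, and the function names `ldlt` and `reconstruct` are made up for illustration. It is not how MUMPS itself is implemented.

```python
from fractions import Fraction

def ldlt(A):
    """Factor a symmetric matrix as A = L D L^T without pivoting.

    Returns (L, d): L is unit lower triangular, d holds the diagonal of D.
    Assumes every pivot d[j] turns out nonzero (no pivoting is attempted).
    """
    n = len(A)
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    d = [Fraction(0)] * n
    for j in range(n):
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / d[j]
    return L, d

def reconstruct(L, d):
    """Multiply L * diag(d) * L^T back together to check the factorization."""
    n = len(d)
    return [[sum(L[i][k] * d[k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A symmetric but indefinite example matrix (values chosen for illustration).
A = [[Fraction(v) for v in row] for row in [[4, 2, -2], [2, -1, 1], [-2, 1, 5]]]
L, d = ldlt(A)
```

A negative entry in `d` shows that the matrix is not positive definite, which is the case where $LDL^T$ is used instead of a Cholesky factorization.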

12-14 Assembly: $A = \sum_l A^{[l]}$, accumulated entry by entry as $a_{ij} \leftarrow a_{ij} + a_{ij}^{[l]}$. A variable is fully summed once every contribution $A^{[l]}$ that touches it has been assembled.
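The assembly rule above can be sketched in a few lines. This is a minimal dense illustration; the element values, the chain of four variables, and the helper names `assemble` and `fully_summed` are assumptions made for the example, not part of the slides.

```python
def assemble(n, elements):
    """Sum element matrices into a dense n x n matrix.

    `elements` is a list of (indices, local) pairs: `indices` are the global
    variable numbers an element touches, `local` is its small dense matrix.
    """
    A = [[0.0] * n for _ in range(n)]
    for indices, local in elements:
        for p, i in enumerate(indices):
            for q, j in enumerate(indices):
                A[i][j] += local[p][q]  # a_ij <- a_ij + a_ij^[l]
    return A

def fully_summed(var, remaining):
    """A variable is fully summed once no unassembled element touches it."""
    return all(var not in indices for indices, _ in remaining)

# Three 2x2 elements on a chain of variables 0-1-2-3 (made-up values).
k = [[1.0, -1.0], [-1.0, 1.0]]
elements = [((0, 1), k), ((1, 2), k), ((2, 3), k)]
A = assemble(4, elements)
```

After only the first element has been assembled, variable 0 is fully summed (no later element touches it) and could already be eliminated, while variable 1 still waits for the second element.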

15-16 Gaussian elimination (GE) step: $a_{ij}^{(k+1)} = a_{ij}^{(k)} - a_{ik}^{(k)} \left[a_{kk}^{(k)}\right]^{-1} a_{kj}^{(k)}$
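One step of this update, applied to a small dense matrix, might look as follows. This is a sketch assuming a nonzero pivot and no pivoting; `eliminate_first` is a hypothetical helper name, and the matrix values are chosen only for illustration.

```python
from fractions import Fraction

def eliminate_first(F):
    """Eliminate the first variable of a dense matrix F.

    Returns the updated trailing block with entries
    a_ij - a_i0 * a_00^{-1} * a_0j for i, j >= 1.
    Assumes the pivot F[0][0] is nonzero.
    """
    n = len(F)
    pivot = F[0][0]
    return [[F[i][j] - F[i][0] * F[0][j] / pivot for j in range(1, n)]
            for i in range(1, n)]

F = [[Fraction(v) for v in row] for row in [[2, 1, 1], [4, 3, 3], [8, 7, 9]]]
S1 = eliminate_first(F)   # remaining 2x2 block after eliminating variable 0
S2 = eliminate_first(S1)  # remaining 1x1 block after eliminating variable 1
```

In a frontal scheme only the fully summed variables of a front are eliminated this way; the remaining block is what gets passed on to later assembly steps.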


19-24 Worked example (the matrix figure with its u and l entries is not recoverable from the transcription). Steps shown: Assemble A, B; Eliminate 4; Assemble C; Permute 1, 8; Eliminate.


45 Frontal: $(\cdots(((A^{[1]} + A^{[2]}) + A^{[3]}) + A^{[4]}) + \cdots)$

Multifrontal: $((A^{[1]} + A^{[2]}) + (A^{[3]} + A^{[4]}) + (A^{[5]} + A^{[6]}) + (A^{[7]} + A^{[8]}))$
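The difference between the two bracketings is the dependency structure: the frontal order is one long chain, while the multifrontal order groups independent partial sums that could be formed in parallel. The sketch below illustrates only that aspect, using a binary pairing tree and made-up 2x2 contributions; it leaves out the eliminations that a real multifrontal method interleaves with assembly. The function names are hypothetical.

```python
from functools import reduce

def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def frontal_sum(mats):
    """((A1 + A2) + A3) + ... : every addition waits for the previous one."""
    return reduce(add, mats)

def multifrontal_sum(mats):
    """Pairwise tree: (A1 + A2), (A3 + A4), ... are mutually independent.

    Returns the total and the number of tree levels (the critical path).
    """
    level = list(mats)
    depth = 0
    while len(level) > 1:
        nxt = [add(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth

# Eight made-up contributions A^[1] .. A^[8].
mats = [[[l, 0], [0, l]] for l in range(1, 9)]
total, depth = multifrontal_sum(mats)
```

For eight contributions the chain needs seven dependent additions, whereas the tree has a critical path of three levels; both give the same sum.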

46 How MUMPS solves a problem (phases: Analysis, Factorization, Solution)

Analysis:

1. Preprocessing
2. Ordering
3. Symbolic factorization

47 How MUMPS solves a problem (phases: Analysis, Factorization, Solution)

Factorization:

1. Elimination tree nodes
2. Numerical factorization: frontal matrices
3. Factor matrices distributed

48 How MUMPS solves a problem (phases: Analysis, Factorization, Solution)

Solution:

1. $LUx = b$, $LDL^T x = b$
2. Forward: $Ly = b$ or $LDy = b$
3. Backward: $Ux = y$ or $L^T x = y$
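The forward and backward steps of the solution phase can be sketched for the unsymmetric case $A = LU$. This is a small dense illustration without pivoting, with made-up data and hypothetical function names; a sparse solver performs the same two triangular solves on distributed factors.

```python
from fractions import Fraction

def lu(A):
    """Doolittle LU without pivoting: A = L U with L unit lower triangular."""
    n = len(A)
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def forward(L, b):
    """Forward substitution: solve L y = b from the first row down."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i])
    return y

def backward(U, y):
    """Backward substitution: solve U x = y from the last row up."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[Fraction(v) for v in row] for row in [[2, 1, 1], [4, 3, 3], [8, 7, 9]]]
b = [Fraction(v) for v in [4, 10, 24]]
L, U = lu(A)
x = backward(U, forward(L, b))
```

Once $L$ and $U$ are available, each additional right-hand side costs only the two triangular solves, which is why the factorization and solution phases are kept separate.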

49 MUMPS: Furthermore...

- Interfaces to PORD, SCOTCH, METIS
- Parallel version requires MPI, BLAS, BLACS and ScaLAPACK
- Error analysis
- Detection of null pivots
- Schur complement

50 PaStiX Parallel Sparse matrix package

51 Getting PaStiX Link

52 PaStiX

Depending on symmetry: $A = LL^T$ or $A = LDL^T$

Steps:

1. Reordering to reduce fill-in
2. Symbolic factorization
3. Distribute matrix blocks to processors
4. Decomposition of $A$
5. Solve system
6. Refine solution (static pivoting)

53

1. Ordering
   - SCOTCH (or METIS)
   - Halo Approximate Minimum Degree
   - Tree represents dependencies
2. Symbolic factorization
   - Structure of the factorized matrix is derived from $A$
   - Cheap step
   - Number of off-diagonal blocks
3. Distribution
   - Partitioning: large blocks are distributed to several processors
   - Processor candidates: local communication
   - Distribution: blocks to nodes
   - Uses the elimination tree
   - Scheduling of communication and computation

Levels of parallelism:

- Coarse: independent parts of the tree
- Medium: block decomposition
- Fine: BLAS3
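Why the ordering step matters can be seen by counting fill-in symbolically, without any numerical values. The sketch below plays the standard "elimination game" on the graph of a symmetric matrix; it is an illustration only and is not the algorithm used by SCOTCH, METIS, or PaStiX. The function name `fill_in` and the arrow-shaped example are assumptions.

```python
def fill_in(adjacency, order):
    """Count fill edges created when eliminating vertices in `order`.

    `adjacency` maps each vertex to the set of its neighbours in the graph
    of a symmetric sparse matrix. Eliminating a vertex connects all of its
    remaining neighbours pairwise; each new edge is one fill entry.
    """
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = 0
    for v in order:
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        for u in nbrs:
            for w in nbrs:
                if u < w and w not in adj[u]:
                    adj[u].add(w)
                    adj[w].add(u)
                    fill += 1
    return fill

# Arrow matrix: variable 0 is coupled to every other variable.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
bad = fill_in(star, [0, 1, 2, 3, 4])   # hub eliminated first
good = fill_in(star, [1, 2, 3, 4, 0])  # hub eliminated last
```

Eliminating the hub first couples all remaining variables and fills the matrix completely, while eliminating it last creates no fill at all. The same symbolic pass is what makes it possible to know the structure of the factors before any numbers are computed.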

54

4. Factorization
   - Calculate $LL^T$ or $LDL^T$
   - Multifrontal vs. supernodal; PaStiX: supernodal (left-looking)
5. Solve
   - Distribution kept
   - Cheap
6. Refinement (optional)
   - GMRES by Y. Saad
   - Iterative refinement
   - Conjugate gradient
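The refinement step compensates for a factorization that is only approximate, for example because pivots were perturbed by static pivoting. A minimal sketch of plain iterative refinement follows, assuming a 2x2 system in which a slightly perturbed matrix `M` stands in for the inexact factors; the names and values are made up, and the GMRES and conjugate gradient variants listed on the slide are not shown.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def solve2(M, r):
    """Solve a 2x2 system by Cramer's rule (stand-in for the factored solve)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]

def refine(A, M, b, steps):
    """Iterative refinement: x <- x + M^{-1} (b - A x), M approximating A."""
    x = solve2(M, b)
    history = []
    for _ in range(steps):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        history.append(max(abs(ri) for ri in r))  # residual before correction
        d = solve2(M, r)
        x = [xi + di for xi, di in zip(x, d)]
    return x, history

A = [[4.0, 1.0], [1.0, 3.0]]
M = [[4.0, 1.0], [1.0, 3.1]]  # perturbed pivot, mimicking static pivoting
b = [1.0, 2.0]
x, history = refine(A, M, b, 10)
```

Each sweep reuses the existing (inexact) factors and only needs one extra matrix-vector product with the original $A$, so refinement is cheap compared with refactorizing.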

55 SuperLU

56 Getting SuperLU Link xiaoye/superlu/ Debian, Ubuntu \$ sudo apt-get install libsuperlu3

57 Three packages

1. Sequential SuperLU
   - Sequential processors
   - One or more layers of memory
2. Multithreaded SuperLU (SuperLU_MT)
   - Shared-memory multiprocessors (SMPs)
   - Can use parallel processors
3. Distributed SuperLU (SuperLU_DIST)
   - Distributed-memory parallel processors
   - MPI
   - Can use hundreds of processors

58 Summary

59

- MUMPS: symbolic factorization, distribution, factorization through the multifrontal method
- PaStiX: symbolic factorization, distribution, factorization through the supernodal method
- SuperLU: to be investigated...

60 Thank You For Your Attention!
