Mathematical Libraries and Application Software on JUROPA and JUQUEEN
|
|
|
- Rudolf Henderson
- 10 years ago
- Views:
Transcription
1 Mitglied der Helmholtz-Gemeinschaft Mathematical Libraries and Application Software on JUROPA and JUQUEEN JSC Training Course May 2014 I.Gutheil
2 Outline General Informations Sequential Libraries Parallel Libraries and Application Systems: Threaded Libraries MPI parallel Libraries Application Software Further Information May 2014 I.Gutheil Folie 2
3 General Informations JUROPA (I) Three major Compiler versions Default: Intel with MKL Module: Intel with MKL Module: Intel with MKL Module: Intel 12.0.[3, 4, or 5] with MKL included Module: Intel 12.1.[0, 1, 2, or 4] with MKL included Module: Intel or , MKL included module unload intel, module unload mkl before loading a new intel (and MKL) Module module switch parastation parastation/mpi2-intel for some new libraries necessary May 2014 I.Gutheil Folie 3
4 General Informations JUROPA (II) Most libraries compiled with and MKL , new versions with 12.0, 12.1, or 13.1 only Module unload and module load must be called in batch scripts before execution, too Starting with intel/ MKL is included with module load intel/12.* For most libraries versions compiled with intel/12.0.* and/or 12.1.* available Latest libraries compiled with intel/ May 2014 I.Gutheil Folie 4
5 General Informations JUROPA (III) module avail lists names of available libraries module help name shows how to use library module load name prepends LD LIBRARY PATH, LIBRARY PATH and INCLUDE and sets NAME ROOT to the correct directory Link sequence important,.o always before the libraries, sometimes double linking necessary May 2014 I.Gutheil Folie 5
6 General Informations JUQUEEN (I) All libraries as modules in /bgsys/local/name module avail lists names of available libraries module help name tells how to use library module load name sets environment variables for -L$(* LIB) and -I$(* INCLUDE) to include in makefile Link sequence important,.o always before the libraries, sometimes double linking necessary May 2014 I.Gutheil Folie 6
7 General Informations (II) First all libraries will be compiled with -O3 -qstrict -g qsimd=noauto Additional version compiled without -g added also some versions with simd See module avail for available versions May 2014 I.Gutheil Folie 7
8 Sequential Libraries and Packages (I) Vendor specific Libraries JUROPA only MKL Intel R Math Kernel Library versions as mentioned in general informations JUQUEEN only ESSL (Engineering and Scientific Subroutine Library) version 5.1 in /bgsys/local/lib May 2014 I.Gutheil Folie 8
9 Sequential Libraries and Packages (II) Public domain Libraries LAPACK (Linear Algebra PACKage) ARPACK (Arnoldi PACKage) planned GSL (Gnu Scientific Library) GMP (Gnu Multiple Precision Arithmetic Library) May 2014 I.Gutheil Folie 9
10 Contents of Intel R MKL 10.* BLAS, Sparse BLAS, CBLAS LAPACK Iterative Sparse Solvers, Trust Region Solver Vector Math Library Vector Statistical Library Fourier Transform Functions Trigonometric Transform Functions May 2014 I.Gutheil Folie 10
11 Contents of Intel R MKL 10.* GMP routines Poisson Library Interface for fftw For more information see http: // Software/SystemDependentLibraries/MKLJuropa.html May 2014 I.Gutheil Folie 11
12 Contents of ESSL Version 5.1 BLAS level 1-3 and additional vector, matrix-vector, and matrix-matrix operations Sparse vector and matrix operations LAPACK computational routines for linear equation systems and eigensystems Banded linear system solvers Linear Least Squares Fast Fourier Transforms May 2014 I.Gutheil Folie 12
13 Contents of ESSL Version 5.1 (II) Numerical Quadrature Random Number Generation Interpolation For further information see IBM Engineering and Scientific Subroutine Library for Linux on POWER V5.1: Guide and Reference http: // Software/SystemDependentLibraries/ESSL_ESSLSMP.html Link to IBM documents Guide and Reference May 2014 I.Gutheil Folie 13
14 Usage of MKL (I) FORTRAN, C, and C++ callable Arrays FORTRAN like, i.e. column-first (except cblas) Three versions, old and two variants of 10.2 starting with intel/ included in intel module Compilation and linking of program name.f calling sequential MKL routines, default version: ifort name.f -o name -lmkl intel lp64 -lmkl intel thread -lmkl core -liomp5 -lpthread or ifort name.f -o name -lmkl intel lp64 -lmkl sequential -lmkl core -liomp5 -lpthread May 2014 I.Gutheil Folie 14
15 Usage of MKL (II) Compilation and linking of program name.f calling sequential MKL routines starting with intel/ module unload mkl module switch intel intel/ ifort name.f -o name -lmkl intel lp64 -lmkl sequential -lmkl core -liomp5 -lpthread Linking of MKL always dynamic, so modules must be switched before execution, too May 2014 I.Gutheil Folie 15
16 Usage of MKL (III) To use CBLAS include mkl.h into source code Compilation and linking of program name.c calling sequential MKL routines [module unload mkl module switch intel intel/12.0.3] icc name.c -o name -lmkl intel lp64 -lmkl sequential -lmkl core -liomp5 -lpthread [-lifcore -lifport] May 2014 I.Gutheil Folie 16
17 Usage of ESSL FORTRAN, C, and C++ callable, Arrays FORTRAN like, i.e. column-first Header file essl.h for C and C++ Installed in /bgsys/local/lib (not as module) May 2014 I.Gutheil Folie 17
18 Usage of ESSL (II) Compilation and linking of program name.f calling ESSL routines mpixlf90 r name.f -L/bgsys/local/lib -lesslbg Compilation and linking of program name.c calling ESSL routines mpixlc r name.c -I/opt/ibmmath/essl/5.1/include -L/bgsys/local/lib -lesslbg -L/opt/ibmcmp/xlf/bg/14.1/lib64 -lxl -lxlopt -lxlf90 r -lxlfmath -lm -lrt May 2014 I.Gutheil Folie 18
19 LAPACK (I) Part of MKL on Juropa, until intel/11.1.* seperate file, starting with intel/ in libmkl core.a Public domain version 3.3 and on JUQUEEN Experimental C-LAPACK, liblapacke.a in version on JUQUEEN Must be used together with ESSL (or ESSLsmp) Some routines already in ESSL Attention, some calling sequences are different! Experimental LAPACK header file available for C-usage of lapack 3.3 on JUQUEEN (may also be tried with 3.4.2) May 2014 I.Gutheil Folie 19
20 LAPACK (II) Compilation and linking of FORTRAN program name.f calling LAPACK routines JUROPA: (see usage of MKL), -lmkl lapack only up to intel/ JUQUEEN: module load lapack/3.4.2[ g][ simd] mpixlf90 r name.f -Wl,-allow-multiple-definition -L/bgsys/local/lib [-lessl[smp]bg] -L$(LAPACK LIB) -llapack -lessl[smp]bg ESSL must be linked after LAPACK to resolve references May 2014 I.Gutheil Folie 20
21 Arpack ARPACK, ARnoldi PACKage, Version 2.1 Iterative solver for sparse eigenvalue problems Reverse communication interface FORTRAN 77 Calls LAPACK and BLAS routines May 2014 I.Gutheil Folie 21
22 GSL GNU Scientific Library Version 1.13, 1.14(default), and 1.15 with intel/ on JUROPA, 1.15 on JUQUEEN Provides a wide range of mathematical routines Not recommended for performance reasons Often used by configure scripts module load gsl[/1.13][/1.15] JUROPA module load gsl/1.15 O3 JUQUEEN May 2014 I.Gutheil Folie 22
23 NAG Libraries NAG Fortran 77 Mark 23 and 24 on JUROPA in /usr/local/lib, Mark 22 on JUQUEEN: as module more than 1600 user-callable routines NAG C Mark 23: (JUROPA only) as module more than 1000 user-callable routines fl90 Release 4, (JUROPA only) in /usr/local/lib 43 new generic user-callable routines May 2014 I.Gutheil Folie 23
24 Parallell Libraries Threaded Parallelism MKL (JUROPA) is multi-threaded or at least thread-save usage as with sequential routines if OMP NUM THREADS not set, 8 threads used always use ifort name.f -o name -lmkl intel lp64 -lmkl intel thread -lmkl core -liomp5 -lpthread ESSLsmp 5.1 (JUQUEEN) Usage: mpixlf90 r name.f -L/bgsys/local/lib -lesslsmpbg FFTW 3.3 (Fastest Fourier Transform of the West) On JUQUEEN threaded and OpenMP version May 2014 I.Gutheil Folie 24
25 Parallell Libraries MPI Parallelism ScaLAPACK (Scalable Linear Algebra PACKage) ELPA (Eigenvalue SoLvers for Petaflop-Applications) Elemental, C++ framework for parallel dense linear algebra FFTW (Fastest Fourier Transform of the West) MUMPS (MUltifrontal Massively Parallel sparse direct Solver) ParMETIS (Parallel Graph Partitioning) hypre (high performance preconditioners) May 2014 I.Gutheil Folie 25
26 MPI Parallelism (II) PARPACK (Parallel ARPACK), Eigensolver planned SPRNG (Scalable Parallel Random Number Generator) SUNDIALS (SUite of Nonlinear and DIfferential/ALgebraic equation Solvers) Parallel Systems, MPI Parallelism PETSc, toolkit for partial differential equations May 2014 I.Gutheil Folie 26
27 ScaLAPACK JUROPA: part of MKL JUQUEEN: ScaLAPACK Release 2.0.2, contains already BLACS FORTRAN, also C-Interface, scalapack.h incomplete LAPACK has to be linked, too, $LAPACK DIR set together with scalapack May 2014 I.Gutheil Folie 27
28 Contents of ScaLAPACK Parallel BLAS 1-3, PBLAS Version 2 Dense linear system solvers Banded linear system solvers Solvers for Linear Least Squares Problem Singular value decomposition Eigenvalues and eigenvectors of dense symmetric/hermitian matrices May 2014 I.Gutheil Folie 28
29 Usage on JUROPA Linking a program name.f calling routines from ScaLAPACK, default version: mpif77 name.f -lmkl scalapack lp64 -lmkl blacs intelmpi lp64 -lmkl lapack -lmkl intel lp64 -lmkl intel thread -lmkl core -liomp5 -lpthread Intel compiler version and later: module unload mkl module switch intel intel/ (for example) mpif77 name.f -lmkl scalapack lp64 -lmkl blacs intelmpi lp64 -lmkl intel lp64 -lmkl intel thread -lmkl core -liomp5 -lpthread May 2014 I.Gutheil Folie 29
30 Usage on JUQUEEN Compilation and linking of a program name.f calling ScaLAPACK routines: module load scalapack/2.0.2[ g][ simd] mpixlf90 r name.f -L$SCALAPACK LIB -lscalapack -L$LAPACK LIB -llapack -L/bgsys/local/lib -lessl[smp]bg May 2014 I.Gutheil Folie 30
31 ELPA Eigenvalue SoLvers for Petaflop-Applications ELPA uses ScaLAPACK, must be linked together with scalapack FORTRAN 95, same data-distribution as ScaLAPACK http: //elpa.rzg.mpg.de/elpa-english?set_language=en JUROPA development version from November 2012, compiled with intel and version from November 2013 compiled with intel , allows hybrid parallelization JUQUEEN MPI and hybrid version from November 2013 May 2014 I.Gutheil Folie 31
32 Elemental C++ framework, two-dimensional data distribution element by element JUROPA version 0.81 compiled with intel/13.1.3, pure MPI version and hybrid version JUQUEEN 0.78-p1, MPI-only May 2014 I.Gutheil Folie 32
33 MUMPS: Multifrontal Massively Parallel sparse direct Solver Solution of linear systems with symmetric positive definite matrices, general symmetric matrices, general unsymmetric matrices Real or Complex Parallel factorization and solve phase, iterative refinement and backward error analysis F90 and MPI Version 4.8.4, 4.9.2, and on JUROPA, version on JUQUEEN May 2014 I.Gutheil Folie 33
34 ParMETIS Parallel Graph Partinioning and Fill-reducing Matrix Ordering developed in Karypis Lab at the University of Minnesota Version 3.1.1, 3.2.0, and on JUROPA, and on JUQUEEN http: //glaros.dtc.umn.edu/gkhome/metis/parmetis/overview Hypre High performance preconditioners Version 2.0.0, 2.6.0b,2.7.0b, and 2.8.0b on JUROPA, 2.8.0b and 2.9.0b, also version with bigint, on JUQUEEN, bigint cannot be used together with essl May 2014 I.Gutheil Folie 34
35 FFTW Version 2.1.5, this old version contains an MPI-parallel version of FFTW on JUROPA and JUQUEEN Version 3.2 and 3.3 on JUROPA, compiled with intel/ Version 3.3 on JUQUEEN May 2014 I.Gutheil Folie 35
36 PARPACK ARPACK Version 2.1 and PARPACK MPI-Version Must be linked with LAPACK and BLAS Reverse communication interface, user has to supply parallel matrix-vector multiplication May 2014 I.Gutheil Folie 36
37 SPRNG The Scalable Parallel Random Number Generators Library for ASCI Monte Carlo Computations Version 2.0: various random number generators in one Library Version 1.0 seperate library for each random number generator Sundials (CVODE) Package for the solution of ordinary differential equations, Version 2.3.0, 2.4.0, and on JUROPA, on JUQUEEN https: //computation.llnl.gov/casc/sundials/main.html May 2014 I.Gutheil Folie 37
38 PETSc Portable, Extensible Toolkit for Scientific Computation Numerical solution of partial differential equations version on JUROPA and on JUQUEEN basic real and complex versions advanced versions with download other packages on JUQUEEN also version with 8-Byte integer Usage at Research Centre Juelich: module avail petsc module help petsc/[whatever version you want] May 2014 I.Gutheil Folie 38
39 Software for Materials Science Simulation Available on Package JUQUEEN JUROPA Workstations ADF yes Amber yes CP2K yes yes CPMD yes yes Dalton yes Gromacs yes yes GPAW planned yes LAMMPS yes yes Molpro yes yes NAMD yes yes NWChem yes Tremolo yes TURBOMOLE yes May 2014 I.Gutheil Folie 39
40 Further informations and JSC-people Support/Software/Software_node.html Mailto I.Gutheil: Parallel mathematical Libraries J.Mextorf: NAG libraries F.Janetzko: Software for materials science B.Körfgen: Physics and Engineering software Software: May 2014 I.Gutheil Folie 40
Mathematical Libraries on JUQUEEN. JSC Training Course
Mitglied der Helmholtz-Gemeinschaft Mathematical Libraries on JUQUEEN JSC Training Course May 10, 2012 Outline General Informations Sequential Libraries, planned Parallel Libraries and Application Systems:
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster
Introduction to Linux and Cluster Basics for the CCR General Computing Cluster Cynthia Cornelius Center for Computational Research University at Buffalo, SUNY 701 Ellicott St Buffalo, NY 14203 Phone: 716-881-8959
JUROPA Linux Cluster An Overview. 19 May 2014 Ulrich Detert
Mitglied der Helmholtz-Gemeinschaft JUROPA Linux Cluster An Overview 19 May 2014 Ulrich Detert JuRoPA JuRoPA Jülich Research on Petaflop Architectures Bull, Sun, ParTec, Intel, Mellanox, Novell, FZJ JUROPA
Public Domain commercial vendor specific
Numerical Libraries Numerical L ibraries Public Domain commercial vendor specific 1 Public Domain Lapack-3 linear equations, eigenproblems BLAS fast linear kernels Linpack linear equations Eispack eigenproblems
Cluster performance, how to get the most out of Abel. Ole W. Saastad, Dr.Scient USIT / UAV / FI April 18 th 2013
Cluster performance, how to get the most out of Abel Ole W. Saastad, Dr.Scient USIT / UAV / FI April 18 th 2013 Introduction Architecture x86-64 and NVIDIA Compilers MPI Interconnect Storage Batch queue
Mitglied der Helmholtz-Gemeinschaft JUQUEEN. Best Practices. Florian Janetzko / Wolfgang Frings. 2. Februar 2014
Mitglied der Helmholtz-Gemeinschaft JUQUEEN Best Practices 2. Februar 2014 Florian Janetzko / Wolfgang Frings Outline Production Environment Module Environment Job Execution Basic Porting Compilers and
Experiences of numerical simulations on a PC cluster Antti Vanne December 11, 2002
xperiences of numerical simulations on a P cluster xperiences of numerical simulations on a P cluster ecember xperiences of numerical simulations on a P cluster Introduction eowulf concept Using commodity
It s Not A Disease: The Parallel Solver Packages MUMPS, PaStiX & SuperLU
It s Not A Disease: The Parallel Solver Packages MUMPS, PaStiX & SuperLU A. Windisch PhD Seminar: High Performance Computing II G. Haase March 29 th, 2012, Graz Outline 1 MUMPS 2 PaStiX 3 SuperLU 4 Summary
Poisson Equation Solver Parallelisation for Particle-in-Cell Model
WDS'14 Proceedings of Contributed Papers Physics, 233 237, 214. ISBN 978-8-7378-276-4 MATFYZPRESS Poisson Equation Solver Parallelisation for Particle-in-Cell Model A. Podolník, 1,2 M. Komm, 1 R. Dejarnac,
CHEOPS Cologne High Efficient Operating Platform for Science Brief Instructions
CHEOPS Cologne High Efficient Operating Platform for Science Brief Instructions (Version: 07.10.2013) Foto: Thomas Josek/JosekDesign Viktor Achter Dr. Stefan Borowski Lech Nieroda Dr. Lars Packschies Volker
HSL and its out-of-core solver
HSL and its out-of-core solver Jennifer A. Scott [email protected] Prague November 2006 p. 1/37 Sparse systems Problem: we wish to solve where A is Ax = b LARGE Informal definition: A is sparse if many
Athena Knowledge Base
Athena Knowledge Base The Athena Visual Studio Knowledge Base contains a number of tips, suggestions and how to s that have been recommended by the users of the software. We will continue to enhance this
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS Revised Edition James Epperson Mathematical Reviews BICENTENNIAL 0, 1 8 0 7 z ewiley wu 2007 r71 BICENTENNIAL WILEY-INTERSCIENCE A John Wiley & Sons, Inc.,
AMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 19: SVD revisited; Software for Linear Algebra Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 9 Outline 1 Computing
APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
Report on Project: Advanced System Monitoring for the Parallel Tools Platform (PTP)
Mitglied der Helmholtz-Gemeinschaft Report on Project: Advanced System Monitoring for the Parallel Tools Platform (PTP) September, 2014 Wolfgang Frings and Carsten Karbach Project progress Server caching
Service Partition Specialized Linux nodes. Compute PE Login PE Network PE System PE I/O PE
2 Service Partition Specialized Linux nodes Compute PE Login PE Network PE System PE I/O PE Microkernel on Compute PEs, full featured Linux on Service PEs. Service PEs specialize by function Software Architecture
The CNMS Computer Cluster
The CNMS Computer Cluster This page describes the CNMS Computational Cluster, how to access it, and how to use it. Introduction (2014) The latest block of the CNMS Cluster (2010) Previous blocks of the
Linux clustering. Morris Law, IT Coordinator, Science Faculty, Hong Kong Baptist University
Linux clustering Morris Law, IT Coordinator, Science Faculty, Hong Kong Baptist University PII 4-node clusters started in 1999 PIII 16 node cluster purchased in 2001. Plan for grid For test base HKBU -
A Crash course to (The) Bighouse
A Crash course to (The) Bighouse Brock Palen [email protected] SVTI Users meeting Sep 20th Outline 1 Resources Configuration Hardware 2 Architecture ccnuma Altix 4700 Brick 3 Software Packaged Software
Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi
Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi ICPP 6 th International Workshop on Parallel Programming Models and Systems Software for High-End Computing October 1, 2013 Lyon, France
1 Bull, 2011 Bull Extreme Computing
1 Bull, 2011 Bull Extreme Computing Table of Contents HPC Overview. Cluster Overview. FLOPS. 2 Bull, 2011 Bull Extreme Computing HPC Overview Ares, Gerardo, HPC Team HPC concepts HPC: High Performance
Cluster Computing at HRI
Cluster Computing at HRI J.S.Bagla Harish-Chandra Research Institute, Chhatnag Road, Jhunsi, Allahabad 211019. E-mail: [email protected] 1 Introduction and some local history High performance computing
Cluster Computing in a College of Criminal Justice
Cluster Computing in a College of Criminal Justice Boris Bondarenko and Douglas E. Salane Mathematics & Computer Science Dept. John Jay College of Criminal Justice The City University of New York 2004
PARALLEL PROGRAMMING
PARALLEL PROGRAMMING TECHNIQUES AND APPLICATIONS USING NETWORKED WORKSTATIONS AND PARALLEL COMPUTERS 2nd Edition BARRY WILKINSON University of North Carolina at Charlotte Western Carolina University MICHAEL
22S:295 Seminar in Applied Statistics High Performance Computing in Statistics
22S:295 Seminar in Applied Statistics High Performance Computing in Statistics Luke Tierney Department of Statistics & Actuarial Science University of Iowa August 30, 2007 Luke Tierney (U. of Iowa) HPC
INTEL PARALLEL STUDIO XE EVALUATION GUIDE
Introduction This guide will illustrate how you use Intel Parallel Studio XE to find the hotspots (areas that are taking a lot of time) in your application and then recompiling those parts to improve overall
MPI Hands-On List of the exercises
MPI Hands-On List of the exercises 1 MPI Hands-On Exercise 1: MPI Environment.... 2 2 MPI Hands-On Exercise 2: Ping-pong...3 3 MPI Hands-On Exercise 3: Collective communications and reductions... 5 4 MPI
DARPA, NSF-NGS/ITR,ACR,CPA,
Spiral Automating Library Development Markus Püschel and the Spiral team (only part shown) With: Srinivas Chellappa Frédéric de Mesmay Franz Franchetti Daniel McFarlin Yevgen Voronenko Electrical and Computer
Multicore Parallel Computing with OpenMP
Multicore Parallel Computing with OpenMP Tan Chee Chiang (SVU/Academic Computing, Computer Centre) 1. OpenMP Programming The death of OpenMP was anticipated when cluster systems rapidly replaced large
Matrix Multiplication
Matrix Multiplication CPS343 Parallel and High Performance Computing Spring 2016 CPS343 (Parallel and HPC) Matrix Multiplication Spring 2016 1 / 32 Outline 1 Matrix operations Importance Dense and sparse
Sourcery Overview & Virtual Machine Installation
Sourcery Overview & Virtual Machine Installation Damian Rouson, Ph.D., P.E. Sourcery, Inc. www.sourceryinstitute.org Sourcery, Inc. About Us Sourcery, Inc., is a software consultancy founded by and for
Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture
White Paper Intel Xeon processor E5 v3 family Intel Xeon Phi coprocessor family Digital Design and Engineering Three Paths to Faster Simulations Using ANSYS Mechanical 16.0 and Intel Architecture Executive
Code Generation Tools for PDEs. Matthew Knepley PETSc Developer Mathematics and Computer Science Division Argonne National Laboratory
Code Generation Tools for PDEs Matthew Knepley PETSc Developer Mathematics and Computer Science Division Argonne National Laboratory Talk Objectives Introduce Code Generation Tools - Installation - Use
The Assessment of Benchmarks Executed on Bare-Metal and Using Para-Virtualisation
The Assessment of Benchmarks Executed on Bare-Metal and Using Para-Virtualisation Mark Baker, Garry Smith and Ahmad Hasaan SSE, University of Reading Paravirtualization A full assessment of paravirtualization
Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms
Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Family-Based Platforms Executive Summary Complex simulations of structural and systems performance, such as car crash simulations,
Algorithmic Research and Software Development for an Industrial Strength Sparse Matrix Library for Parallel Computers
The Boeing Company P.O.Box3707,MC7L-21 Seattle, WA 98124-2207 Final Technical Report February 1999 Document D6-82405 Copyright 1999 The Boeing Company All Rights Reserved Algorithmic Research and Software
Numerical Methods I Eigenvalue Problems
Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 [email protected] 1 Course G63.2010.001 / G22.2420-001, Fall 2010 September 30th, 2010 A. Donev (Courant Institute)
HPC Wales Skills Academy Course Catalogue 2015
HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses
INTEL PARALLEL STUDIO EVALUATION GUIDE. Intel Cilk Plus: A Simple Path to Parallelism
Intel Cilk Plus: A Simple Path to Parallelism Compiler extensions to simplify task and data parallelism Intel Cilk Plus adds simple language extensions to express data and task parallelism to the C and
GPUs for Scientific Computing
GPUs for Scientific Computing p. 1/16 GPUs for Scientific Computing Mike Giles [email protected] Oxford-Man Institute of Quantitative Finance Oxford University Mathematical Institute Oxford e-research
Overview of HPC systems and software available within
Overview of HPC systems and software available within Overview Available HPC Systems Ba Cy-Tera Available Visualization Facilities Software Environments HPC System at Bibliotheca Alexandrina SUN cluster
SOLVING LINEAR SYSTEMS
SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis
A Grid-Aware Web Interface with Advanced Service Trading for Linear Algebra Calculations
A Grid-Aware Web Interface with Advanced Service Trading for Linear Algebra Calculations Hrachya Astsatryan 1, Vladimir Sahakyan 1, Yuri Shoukouryan 1, Michel Daydé 2, Aurelie Hurault 2, Marc Pantel 2,
High Performance Computing Cluster Quick Reference User Guide
High Performance Computing Cluster Quick Reference User Guide Base Operating System: Redhat(TM) / Scientific Linux 5.5 with Alces HPC Software Stack Copyright 2011 Alces Software Ltd All Rights Reserved
Numerical Analysis. Professor Donna Calhoun. Fall 2013 Math 465/565. Office : MG241A Office Hours : Wednesday 10:00-12:00 and 1:00-3:00
Numerical Analysis Professor Donna Calhoun Office : MG241A Office Hours : Wednesday 10:00-12:00 and 1:00-3:00 Fall 2013 Math 465/565 http://math.boisestate.edu/~calhoun/teaching/math565_fall2013 What is
High Performance Matrix Inversion with Several GPUs
High Performance Matrix Inversion on a Multi-core Platform with Several GPUs Pablo Ezzatti 1, Enrique S. Quintana-Ortí 2 and Alfredo Remón 2 1 Centro de Cálculo-Instituto de Computación, Univ. de la República
Mitglied der Helmholtz-Gemeinschaft. System monitoring with LLview and the Parallel Tools Platform
Mitglied der Helmholtz-Gemeinschaft System monitoring with LLview and the Parallel Tools Platform November 25, 2014 Carsten Karbach Content 1 LLview 2 Parallel Tools Platform (PTP) 3 Latest features 4
The Forthcoming Petascale Systems Era Got Tools?
Era Got Tools? Tony Drummond Computational Research Division Lawrence Berkeley National Laboratory Salishan April 21, 2005 Where are the applications? Accelerator Science Astrophysics Biology Chemistry
Parallel Programming for Multi-Core, Distributed Systems, and GPUs Exercises
Parallel Programming for Multi-Core, Distributed Systems, and GPUs Exercises Pierre-Yves Taunay Research Computing and Cyberinfrastructure 224A Computer Building The Pennsylvania State University University
PLGrid Infrastructure Solutions For Computational Chemistry
PLGrid Infrastructure Solutions For Computational Chemistry Mariola Czuchry, Klemens Noga, Mariusz Sterzel ACC Cyfronet AGH 2 nd Polish- Taiwanese Conference From Molecular Modeling to Nano- and Biotechnology,
Debugging with TotalView
Tim Cramer 17.03.2015 IT Center der RWTH Aachen University Why to use a Debugger? If your program goes haywire, you may... ( wand (... buy a magic... read the source code again and again and...... enrich
Building a Top500-class Supercomputing Cluster at LNS-BUAP
Building a Top500-class Supercomputing Cluster at LNS-BUAP Dr. José Luis Ricardo Chávez Dr. Humberto Salazar Ibargüen Dr. Enrique Varela Carlos Laboratorio Nacional de Supercómputo Benemérita Universidad
Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach
Parallel Ray Tracing using MPI: A Dynamic Load-balancing Approach S. M. Ashraful Kadir 1 and Tazrian Khan 2 1 Scientific Computing, Royal Institute of Technology (KTH), Stockholm, Sweden [email protected],
OpenMP & MPI CISC 879. Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware
OpenMP & MPI CISC 879 Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware 1 Lecture Overview Introduction OpenMP MPI Model Language extension: directives-based
64-Bit versus 32-Bit CPUs in Scientific Computing
64-Bit versus 32-Bit CPUs in Scientific Computing Axel Kohlmeyer Lehrstuhl für Theoretische Chemie Ruhr-Universität Bochum March 2004 1/25 Outline 64-Bit and 32-Bit CPU Examples
Part I Courses Syllabus
Part I Courses Syllabus This document provides detailed information about the basic courses of the MHPC first part activities. The list of courses is the following 1.1 Scientific Programming Environment
Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC
Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC Goals of the session Overview of parallel MATLAB Why parallel MATLAB? Multiprocessing in MATLAB Parallel MATLAB using the Parallel Computing
Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008
A tutorial on: Iterative methods for Sparse Matrix Problems Yousef Saad University of Minnesota Computer Science and Engineering CRM Montreal - April 30, 2008 Outline Part 1 Sparse matrices and sparsity
Giac/Xcas, a swiss knife for mathematics
Bernard Parisse Bernard Parisse University of Grenoble I Trophées du Libre 2007 Plan 1 : interface for CAS, dynamic geometry and spreadsheet, audience: scienti c students to research 2 : a C++ library,
CUDA programming on NVIDIA GPUs
p. 1/21 on NVIDIA GPUs Mike Giles [email protected] Oxford University Mathematical Institute Oxford-Man Institute for Quantitative Finance Oxford eresearch Centre p. 2/21 Overview hardware view
YALES2 porting on the Xeon- Phi Early results
YALES2 porting on the Xeon- Phi Early results Othman Bouizi Ghislain Lartigue Innovation and Pathfinding Architecture Group in Europe, Exascale Lab. Paris CRIHAN - Demi-journée calcul intensif, 16 juin
BIG CPU, BIG DATA. Solving the World s Toughest Computational Problems with Parallel Computing. Alan Kaminsky
Solving the World s Toughest Computational Problems with Parallel Computing Alan Kaminsky Solving the World s Toughest Computational Problems with Parallel Computing Alan Kaminsky Department of Computer
Fast Multipole Method for particle interactions: an open source parallel library component
Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,
How High a Degree is High Enough for High Order Finite Elements?
This space is reserved for the Procedia header, do not use it How High a Degree is High Enough for High Order Finite Elements? William F. National Institute of Standards and Technology, Gaithersburg, Maryland,
Introduction to ACENET Accelerating Discovery with Computational Research May, 2015
Introduction to ACENET Accelerating Discovery with Computational Research May, 2015 What is ACENET? What is ACENET? Shared regional resource for... high-performance computing (HPC) remote collaboration
Making the Monte Carlo Approach Even Easier and Faster. By Sergey A. Maidanov and Andrey Naraikin
Making the Monte Carlo Approach Even Easier and Faster By Sergey A. Maidanov and Andrey Naraikin Libraries of random-number generators for general probability distributions can make implementing Monte
Performance and Scalability of the NAS Parallel Benchmarks in Java
Performance and Scalability of the NAS Parallel Benchmarks in Java Michael A. Frumkin, Matthew Schultz, Haoqiang Jin, and Jerry Yan NASA Advanced Supercomputing (NAS) Division NASA Ames Research Center,
Mathematics (MAT) MAT 061 Basic Euclidean Geometry 3 Hours. MAT 051 Pre-Algebra 4 Hours
MAT 051 Pre-Algebra Mathematics (MAT) MAT 051 is designed as a review of the basic operations of arithmetic and an introduction to algebra. The student must earn a grade of C or in order to enroll in MAT
Notes on Cholesky Factorization
Notes on Cholesky Factorization Robert A. van de Geijn Department of Computer Science Institute for Computational Engineering and Sciences The University of Texas at Austin Austin, TX 78712 [email protected]
Programming Languages & Tools
4 Programming Languages & Tools Almost any programming language one is familiar with can be used for computational work (despite the fact that some people believe strongly that their own favorite programming
Best practices for efficient HPC performance with large models
Best practices for efficient HPC performance with large models Dr. Hößl Bernhard, CADFEM (Austria) GmbH PRACE Autumn School 2013 - Industry Oriented HPC Simulations, September 21-27, University of Ljubljana,
Why (and Why Not) to Use Fortran
Why (and Why Not) to Use Fortran p. 1/?? Why (and Why Not) to Use Fortran Instead of C++, Matlab, Python etc. Nick Maclaren University of Cambridge Computing Service [email protected], 01223 334761 June 2012
Hands-on exercise: NPB-OMP / BT
Hands-on exercise: NPB-OMP / BT VI-HPS Team 1 Tutorial exercise objectives Familiarise with usage of VI-HPS tools complementary tools capabilities & interoperability Prepare to apply tools productively
Adaptive Stable Additive Methods for Linear Algebraic Calculations
Adaptive Stable Additive Methods for Linear Algebraic Calculations József Smidla, Péter Tar, István Maros University of Pannonia Veszprém, Hungary 4 th of July 204. / 2 József Smidla, Péter Tar, István
WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math
Textbook Correlation WESTMORELAND COUNTY PUBLIC SCHOOLS 2011 2012 Integrated Instructional Pacing Guide and Checklist Computer Math Following Directions Unit FIRST QUARTER AND SECOND QUARTER Logic Unit
The Asynchronous Dynamic Load-Balancing Library
The Asynchronous Dynamic Load-Balancing Library Rusty Lusk, Steve Pieper, Ralph Butler, Anthony Chan Mathematics and Computer Science Division Nuclear Physics Division Outline The Nuclear Physics problem
The Top Six Advantages of CUDA-Ready Clusters. Ian Lumb Bright Evangelist
The Top Six Advantages of CUDA-Ready Clusters Ian Lumb Bright Evangelist GTC Express Webinar January 21, 2015 We scientists are time-constrained, said Dr. Yamanaka. Our priority is our research, not managing
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications
Documentation Installation of the PDR code
Documentation Installation of the PDR code Franck Le Petit mardi 30 décembre 2008 Requirements Requirements depend on the way the code is run and on the version of the code. To install locally the Meudon
Advanced MPI. Hybrid programming, profiling and debugging of MPI applications. Hristo Iliev RZ. Rechen- und Kommunikationszentrum (RZ)
Advanced MPI Hybrid programming, profiling and debugging of MPI applications Hristo Iliev RZ Rechen- und Kommunikationszentrum (RZ) Agenda Halos (ghost cells) Hybrid programming Profiling of MPI applications
Introduction Installation Comparison. Department of Computer Science, Yazd University. SageMath. A.Rahiminasab. October9, 2015 1 / 17
Department of Computer Science, Yazd University SageMath A.Rahiminasab October9, 2015 1 / 17 2 / 17 SageMath(previously Sage or SAGE) System for Algebra and Geometry Experimentation is mathematical software
HPC enabling of OpenFOAM R for CFD applications
HPC enabling of OpenFOAM R for CFD applications Towards the exascale: OpenFOAM perspective Ivan Spisso 25-27 March 2015, Casalecchio di Reno, BOLOGNA. SuperComputing Applications and Innovation Department,
ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING
ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*
2IP WP8 Materiel Science Activity report March 6, 2013
2IP WP8 Materiel Science Activity report March 6, 2013 Codes involved in this task ABINIT (M.Torrent) Quantum ESPRESSO (F. Affinito) YAMBO + Octopus (F. Nogueira) SIESTA (G. Huhs) EXCITING/ELK (A. Kozhevnikov)
SAGE, the open source CAS to end all CASs?
SAGE, the open source CAS to end all CASs? Thomas Risse Faculty of Electrical and Electronics Engineering and Computer Sciences, Bremen University of Applied Sciences, Germany Abstract SAGE, the 'Software
