Mesh Generation and Load Balancing
|
|
|
- Chastity Greene
- 10 years ago
- Views:
Transcription
1 Mesh Generation and Load Balancing Stan Tomov Innovative Computing Laboratory Computer Science Department The University of Tennessee April 04, 2012 CS /04/2012 Slide 1 / 19
2 Outline Motivation Reliable & efficient PDE simulations for high end computing systems Background PDE simulation concept: approximation is over a mesh Error Analysis Simulation error: related to local mesh size Adaptive Mesh Generation Support parallel refinement/derefinement and element migration Load Balancing Scalability of the computation on modern architectures Data structures Algorithmically motivated: multigrid, domain decomposition, etc. For performance optimization: architecture aware computing Numerical Example Conclusions Slide 2 / 19
3 Motivation PDE simulations have errors stemming from the (... mesh, numerical approximation (related to the The need for Reliable: error to be less than desirable tolerance and Efficient: do not do overkill computation PDE simulations for High end computing systems. Slide 3 / 19
4 Background In general: Error from the discretization is proportional to the mesh size A problem: localized physical phenomena deteriorate the approximation properties of classical PDE approximations How can we find a good mesh, i.e. yielding small and reliable error and efficient computation For example flows near wells; faults; moving fronts, etc. Slide 4 / 19
5 Background Solution: (1) determine (automatically) the regions of singular behaviour, and (2) refine them in a balanced manner Example: Efficiency of locally adapted vs uniform approximation of on an L-shaped domain Slide 5 / 19
6 Background Computational framework of the Adaptive methods: Solve PDE Evaluate the approximation error Is error acceptable no yes done ( refinement Improve the approximation (h/p i.e. a process of continuous feedback from the computation to find a reliable and efficient numerical PDE approximation Slide 6 / 19
7 Error Analysis ( FEM The numerical solution of PDE (e.g. Boundary value problem: Au = f, subject to boundary conditions Get a weak formulation: (Au, ϕ) = (f, ϕ) - multiply by test function ϕ and integrate over the domain a( u, ϕ) = <f, ϕ> for ϕ S φ i Galerkin (FEM) problem: Find u h S h S s.t. a( u h, ϕ h ) = <f, ϕ h > for ϕ h S h The error e u u h The Error problem: a( e, ϕ) = a( u- u h, ϕ) = <A(u- u h, ϕ> = <f-a u h, ϕ> < R h, ϕ> for ϕ S Various error estimators: depend on how we solve the Error problem i Slide 7 / 19
8 Adaptive Mesh Generation good mesh ~ good approximation Huge area of research and software development See Steven Owen s (Sandia) survey Which one to choose? Classification: structured/unstructured, element type, support/or no adaptivity, sequential/parallel, etc. Algorithm requirements: conforming/non-conforming, problem size, etc For HPC on 100s of 1000s of processors: parallel adaptive Software design: framework (application is embedded) or ( compliant toolkits (CCA interface Important algorithmic issues to consider low bookkeeping and storage overhead, easy data transfer between meshes, load balancing Slide 8 / 19
9 Adaptive Mesh Generation Mesh generation techniques Regenerate the mesh locally or globally; Appealing only for steady state problems; produce meshes with particular properties (Delaunay/Voronoi, etc.). Hierarchical refinement (common method of choice in AMR; (* ParaGrid used in keep hierarchy of meshes; good for both steady & transient problems; algorithms to maintain mesh quality. Various hybrid methods h-refinement with various local node movements (r-refinement/mesh smoothing); various patch-grid refinement strategies; techniques for coupling various grids, etc. Slide 9 / 19
10 Adaptive Mesh Generation Hierarchical mesh generation ( bisection Element subdivision (e.g. tetrahedral edge ( 2/3D Hierarchy is usually stored in tree (e.g. quad/octrees in Facilitate coarsening Natural creation of multilevel data structures for multilevel solvers Research on various formats/tricks to reduce storage overhead Exploring the deterministic nature of refinement ( elements (relation of parent-child Slide 10 / 19
11 Load Balancing The need for load balance throughout the adaptive solution process Minimize idle time + interprocessor comm. ~ scalability Partitioning for Load balance, and minimal interface is NP-complete but there are many heuristics discovered, See the survey Schloegel K., Karypis G. and Kumar V.: "Graph Partitioning for High Performance Scientific Simulations", in CRPC Parallel Computing Handbook by Dongarra J., Foster I., Fox G., Kennedy K. and White A. (eds.), Morgan Kaufmann, Slide 11 / 19
12 History of partitioning algorithms ( (Vipin Kumar: Space filling curve ( Party (available in Metis, Jostle, Cartesian nested dissection Multilevel partitioning Slide 12 / 19
13 Dynamic Load Balancing Issues to consider: (? matrices Load balance (what about DD with conditioned subdomain Minimize edge cut ( expensive Minimize data redistribution cost (most Rebuild internal and shared data structures What about balance for multilevel data structures? Two main techniques (ParMETIS supports both): ( neighbors Diffusive ( diffuse load among Global (global repartition + smart remapping to minimize ( cost redistribution Which one to choose? See for example: R. Biswas, S. Das, D. Harvey, L. Oliker, Parallel Dynamic Load Balancing Strategies For Adaptive Irregular Applications Slide 13 / 19
14 Data Structures Adaptive methods run at a fraction of the performance peak of cache-based machines: This is due to irregular memory access patterns because of their dynamic nature and unstructured sparse matrices produced Can be improved but efficient parallel programming is difficult for this class of problems A lot of current work in the field is on data structures Improve memory access patterns for better cache reuse (to be discussed further in Lecture #3 ) Slide 14 / 19
15 Data Structures In particular: for adaptive mesh generation Need distributed tree (quadtree/octree) for efficient derefinement Contiguous storage space + hash table access ( performance ) Use the deterministic nature of the ( storage refinement/coarsening (minimize data Slide 15 / 19
16 Data Structures Algorithmically motivated Multigrid Domain decomposition For performance of parallel matrix-vector product Pre-compute & block inter-processor communication patterns Index ordering (SAW, space filling curves, Cuthill McKee, etc) and structures for sparse matrix storage for register blocking, and cache blocking see the Sparsity & BeBOP projects at Berkeley; Techniques: similar to blocking for dense matrices; ( tuning architecture dependant (need processor-specific Slide 16 / 19
17 When does register/cache blocking work? ( group (see R. Nishtala et.al., 2006; R. Vuduc 2003; Berkeley optimization Discontinuous Galerkin FEM is of great current interest Mesh and nonzero matrix structures for approximation of order 1 and 3 ( applications (pictures from D.Darmofal, 2004; MIT; aerospace Naturally occurring dense blocks: open possibilities for various register and multiple level of cache blocking techniques Tuning: Machine dependant Performance can be surprising ( 2003 (Vuduc, Need for automatic machine-specific search register blocking for sparse 3-diagonal matrix consisting of ( 2 8x8 dense blocks (on Intel Itanium explored by storing them as 8x8 blocks what about r x c block storage? shown is the speedup relative to unblocked 1 x 1 code Slide 17 / 19
18 Numerical Example An example of contaminant flow in porous media: ) 1000,500,500 ( 30 mg/l ) 0,0,0 ( Slide 18 / 19
19 Conclusions Adaptive methods: A computational methodology for reliable and efficient numerical solution of PDE problems Multidisciplinary field CS, math, engineering Need multidisciplinary effort for their successful development Overview of the CS aspects The goal: develop the methodology and build it into an intelligent adaptable reconfigurable system for current and next generation supercomputers Slide 19 / 19
Fast Multipole Method for particle interactions: an open source parallel library component
Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,
Mesh Partitioning and Load Balancing
and Load Balancing Contents: Introduction / Motivation Goals of Load Balancing Structures Tools Slide Flow Chart of a Parallel (Dynamic) Application Partitioning of the initial mesh Computation Iteration
FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG
FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG INSTITUT FÜR INFORMATIK (MATHEMATISCHE MASCHINEN UND DATENVERARBEITUNG) Lehrstuhl für Informatik 10 (Systemsimulation) Massively Parallel Multilevel Finite
Load Balancing Strategies for Parallel SAMR Algorithms
Proposal for a Summer Undergraduate Research Fellowship 2005 Computer science / Applied and Computational Mathematics Load Balancing Strategies for Parallel SAMR Algorithms Randolf Rotta Institut für Informatik,
Partitioning and Dynamic Load Balancing for Petascale Applications
Partitioning and Dynamic Load Balancing for Petascale Applications Karen Devine, Sandia National Laboratories Erik Boman, Sandia National Laboratories Umit Çatalyürek, Ohio State University Lee Ann Riesen,
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications
How High a Degree is High Enough for High Order Finite Elements?
This space is reserved for the Procedia header, do not use it How High a Degree is High Enough for High Order Finite Elements? William F. National Institute of Standards and Technology, Gaithersburg, Maryland,
Characterizing the Performance of Dynamic Distribution and Load-Balancing Techniques for Adaptive Grid Hierarchies
Proceedings of the IASTED International Conference Parallel and Distributed Computing and Systems November 3-6, 1999 in Cambridge Massachusetts, USA Characterizing the Performance of Dynamic Distribution
HPC enabling of OpenFOAM R for CFD applications
HPC enabling of OpenFOAM R for CFD applications Towards the exascale: OpenFOAM perspective Ivan Spisso 25-27 March 2015, Casalecchio di Reno, BOLOGNA. SuperComputing Applications and Innovation Department,
Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations
Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations Umit V. Catalyurek, Erik G. Boman, Karen D. Devine, Doruk Bozdağ, Robert Heaphy, and Lee Ann Riesen Ohio State University Sandia
Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008
A tutorial on: Iterative methods for Sparse Matrix Problems Yousef Saad University of Minnesota Computer Science and Engineering CRM Montreal - April 30, 2008 Outline Part 1 Sparse matrices and sparsity
Distributed Dynamic Load Balancing for Iterative-Stencil Applications
Distributed Dynamic Load Balancing for Iterative-Stencil Applications G. Dethier 1, P. Marchot 2 and P.A. de Marneffe 1 1 EECS Department, University of Liege, Belgium 2 Chemical Engineering Department,
Matrix Multiplication
Matrix Multiplication CPS343 Parallel and High Performance Computing Spring 2016 CPS343 (Parallel and HPC) Matrix Multiplication Spring 2016 1 / 32 Outline 1 Matrix operations Importance Dense and sparse
Lecture 12: Partitioning and Load Balancing
Lecture 12: Partitioning and Load Balancing G63.2011.002/G22.2945.001 November 16, 2010 thanks to Schloegel,Karypis and Kumar survey paper and Zoltan website for many of today s slides and pictures Partitioning
Scientific Computing Programming with Parallel Objects
Scientific Computing Programming with Parallel Objects Esteban Meneses, PhD School of Computing, Costa Rica Institute of Technology Parallel Architectures Galore Personal Computing Embedded Computing Moore
Hash-Storage Techniques for Adaptive Multilevel Solvers and Their Domain Decomposition Parallelization
Contemporary Mathematics Volume 218, 1998 B 0-8218-0988-1-03018-1 Hash-Storage Techniques for Adaptive Multilevel Solvers and Their Domain Decomposition Parallelization Michael Griebel and Gerhard Zumbusch
Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations
Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Roy D. Williams, 1990 Presented by Chris Eldred Outline Summary Finite Element Solver Load Balancing Results Types Conclusions
Distributed communication-aware load balancing with TreeMatch in Charm++
Distributed communication-aware load balancing with TreeMatch in Charm++ The 9th Scheduling for Large Scale Systems Workshop, Lyon, France Emmanuel Jeannot Guillaume Mercier Francois Tessier In collaboration
Designing and Building Applications for Extreme Scale Systems CS598 William Gropp www.cs.illinois.edu/~wgropp
Designing and Building Applications for Extreme Scale Systems CS598 William Gropp www.cs.illinois.edu/~wgropp Welcome! Who am I? William (Bill) Gropp Professor of Computer Science One of the Creators of
New Challenges in Dynamic Load Balancing
New Challenges in Dynamic Load Balancing Karen D. Devine 1, Erik G. Boman, Robert T. Heaphy, Bruce A. Hendrickson Discrete Algorithms and Mathematics Department, Sandia National Laboratories, Albuquerque,
Dynamic Load Balancing in Charm++ Abhinav S Bhatele Parallel Programming Lab, UIUC
Dynamic Load Balancing in Charm++ Abhinav S Bhatele Parallel Programming Lab, UIUC Outline Dynamic Load Balancing framework in Charm++ Measurement Based Load Balancing Examples: Hybrid Load Balancers Topology-aware
FEM Software Automation, with a case study on the Stokes Equations
FEM Automation, with a case study on the Stokes Equations FEM Andy R Terrel Advisors: L R Scott and R C Kirby Numerical from Department of Computer Science University of Chicago March 1, 2006 Masters Presentation
Power-Aware High-Performance Scientific Computing
Power-Aware High-Performance Scientific Computing Padma Raghavan Scalable Computing Laboratory Department of Computer Science Engineering The Pennsylvania State University http://www.cse.psu.edu/~raghavan
A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS
SIAM J. SCI. COMPUT. Vol. 20, No., pp. 359 392 c 998 Society for Industrial and Applied Mathematics A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS GEORGE KARYPIS AND VIPIN
HPC Deployment of OpenFOAM in an Industrial Setting
HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak [email protected] Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment
Chapter 12: Multiprocessor Architectures. Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup
Chapter 12: Multiprocessor Architectures Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup Objective Be familiar with basic multiprocessor architectures and be able to
AN APPROACH FOR SECURE CLOUD COMPUTING FOR FEM SIMULATION
AN APPROACH FOR SECURE CLOUD COMPUTING FOR FEM SIMULATION Jörg Frochte *, Christof Kaufmann, Patrick Bouillon Dept. of Electrical Engineering and Computer Science Bochum University of Applied Science 42579
A Pattern-Based Approach to. Automated Application Performance Analysis
A Pattern-Based Approach to Automated Application Performance Analysis Nikhil Bhatia, Shirley Moore, Felix Wolf, and Jack Dongarra Innovative Computing Laboratory University of Tennessee (bhatia, shirley,
HPC Infrastructure Development in Bulgaria
HPC Infrastructure Development in Bulgaria Svetozar Margenov [email protected] Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev Str. Bl. 25-A,
P013 INTRODUCING A NEW GENERATION OF RESERVOIR SIMULATION SOFTWARE
1 P013 INTRODUCING A NEW GENERATION OF RESERVOIR SIMULATION SOFTWARE JEAN-MARC GRATIEN, JEAN-FRANÇOIS MAGRAS, PHILIPPE QUANDALLE, OLIVIER RICOIS 1&4, av. Bois-Préau. 92852 Rueil Malmaison Cedex. France
Basin simulation for complex geological settings
Énergies renouvelables Production éco-responsable Transports innovants Procédés éco-efficients Ressources durables Basin simulation for complex geological settings Towards a realistic modeling P. Havé*,
CHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION 1.1 MOTIVATION OF RESEARCH Multicore processors have two or more execution cores (processors) implemented on a single chip having their own set of execution and architectural recourses.
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer
Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer Stan Posey, MSc and Bill Loewe, PhD Panasas Inc., Fremont, CA, USA Paul Calleja, PhD University of Cambridge,
Load Balancing and Communication Optimization for Parallel Adaptive Finite Element Methods
Load Balancing and Communication Optimization for Parallel Adaptive Finite Element Methods J. E. Flaherty, R. M. Loy, P. C. Scully, M. S. Shephard, B. K. Szymanski, J. D. Teresco, and L. H. Ziantz Computer
Load Balance Strategies for DEVS Approximated Parallel and Distributed Discrete-Event Simulations
Load Balance Strategies for DEVS Approximated Parallel and Distributed Discrete-Event Simulations Alonso Inostrosa-Psijas, Roberto Solar, Verónica Gil-Costa and Mauricio Marín Universidad de Santiago,
Load Balancing on a Non-dedicated Heterogeneous Network of Workstations
Load Balancing on a Non-dedicated Heterogeneous Network of Workstations Dr. Maurice Eggen Nathan Franklin Department of Computer Science Trinity University San Antonio, Texas 78212 Dr. Roger Eggen Department
ParFUM: A Parallel Framework for Unstructured Meshes. Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008
ParFUM: A Parallel Framework for Unstructured Meshes Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008 What is ParFUM? A framework for writing parallel finite element
Introduction to Parallel Computing. George Karypis Parallel Programming Platforms
Introduction to Parallel Computing George Karypis Parallel Programming Platforms Elements of a Parallel Computer Hardware Multiple Processors Multiple Memories Interconnection Network System Software Parallel
Optimizing Load Balance Using Parallel Migratable Objects
Optimizing Load Balance Using Parallel Migratable Objects Laxmikant V. Kalé, Eric Bohm Parallel Programming Laboratory University of Illinois Urbana-Champaign 2012/9/25 Laxmikant V. Kalé, Eric Bohm (UIUC)
Praktikum Wissenschaftliches Rechnen (Performance-optimized optimized Programming)
Praktikum Wissenschaftliches Rechnen (Performance-optimized optimized Programming) Dynamic Load Balancing Dr. Ralf-Peter Mundani Center for Simulation Technology in Engineering Technische Universität München
Dynamic Load Balancing of SAMR Applications on Distributed Systems y
Dynamic Load Balancing of SAMR Applications on Distributed Systems y Zhiling Lan, Valerie E. Taylor Department of Electrical and Computer Engineering Northwestern University, Evanston, IL 60208 fzlan,
Integrated System Modeling for Handling Big Data in Electric Utility Systems
Integrated System Modeling for Handling Big Data in Electric Utility Systems Stephanie Hamilton Brookhaven National Laboratory Robert Broadwater EDD [email protected] 1 Finding Good Solutions for the Hard
A Refinement-tree Based Partitioning Method for Dynamic Load Balancing with Adaptively Refined Grids
A Refinement-tree Based Partitioning Method for Dynamic Load Balancing with Adaptively Refined Grids William F. Mitchell Mathematical and Computational Sciences Division National nstitute of Standards
walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation
walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation SIAM Parallel Processing for Scientific Computing 2012 February 16, 2012 Florian Schornbaum,
Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms
Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms Amani AlOnazi, David E. Keyes, Alexey Lastovetsky, Vladimir Rychkov Extreme Computing Research Center,
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will
Dynamic Load Balancing for Cluster Computing Jaswinder Pal Singh, CSE @ Technische Universität München. e-mail: [email protected]
Dynamic Load Balancing for Cluster Computing Jaswinder Pal Singh, CSE @ Technische Universität München. e-mail: [email protected] Abstract: In parallel simulations, partitioning and load-balancing algorithms
Adapting scientific computing problems to cloud computing frameworks Ph.D. Thesis. Pelle Jakovits
Adapting scientific computing problems to cloud computing frameworks Ph.D. Thesis Pelle Jakovits Outline Problem statement State of the art Approach Solutions and contributions Current work Conclusions
GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications
GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications Harris Z. Zebrowitz Lockheed Martin Advanced Technology Laboratories 1 Federal Street Camden, NJ 08102
Proposal of Dynamic Load Balancing Algorithm in Grid System
www.ijcsi.org 186 Proposal of Dynamic Load Balancing Algorithm in Grid System Sherihan Abu Elenin Faculty of Computers and Information Mansoura University, Egypt Abstract This paper proposed dynamic load
Feedback guided load balancing in a distributed memory environment
Feedback guided load balancing in a distributed memory environment Constantinos Christofi August 18, 2011 MSc in High Performance Computing The University of Edinburgh Year of Presentation: 2011 Abstract
Uintah Framework. Justin Luitjens, Qingyu Meng, John Schmidt, Martin Berzins, Todd Harman, Chuch Wight, Steven Parker, et al
Uintah Framework Justin Luitjens, Qingyu Meng, John Schmidt, Martin Berzins, Todd Harman, Chuch Wight, Steven Parker, et al Uintah Parallel Computing Framework Uintah - far-sighted design by Steve Parker
Large-Scale Reservoir Simulation and Big Data Visualization
Large-Scale Reservoir Simulation and Big Data Visualization Dr. Zhangxing John Chen NSERC/Alberta Innovates Energy Environment Solutions/Foundation CMG Chair Alberta Innovates Technology Future (icore)
Load Balancing Strategies for Multi-Block Overset Grid Applications NAS-03-007
Load Balancing Strategies for Multi-Block Overset Grid Applications NAS-03-007 M. Jahed Djomehri Computer Sciences Corporation, NASA Ames Research Center, Moffett Field, CA 94035 Rupak Biswas NAS Division,
Best practices for efficient HPC performance with large models
Best practices for efficient HPC performance with large models Dr. Hößl Bernhard, CADFEM (Austria) GmbH PRACE Autumn School 2013 - Industry Oriented HPC Simulations, September 21-27, University of Ljubljana,
Partitioning and Load Balancing for Emerging Parallel Applications and Architectures
Chapter Partitioning and Load Balancing for Emerging Parallel Applications and Architectures Karen D. Devine, Erik G. Boman, and George Karypis. Introduction An important component of parallel scientific
Towards real-time image processing with Hierarchical Hybrid Grids
Towards real-time image processing with Hierarchical Hybrid Grids International Doctorate Program - Summer School Björn Gmeiner Joint work with: Harald Köstler, Ulrich Rüde August, 2011 Contents The HHG
Multiphase Flow - Appendices
Discovery Laboratory Multiphase Flow - Appendices 1. Creating a Mesh 1.1. What is a geometry? The geometry used in a CFD simulation defines the problem domain and boundaries; it is the area (2D) or volume
High Performance Computing. Course Notes 2007-2008. HPC Fundamentals
High Performance Computing Course Notes 2007-2008 2008 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it s a moving target. Later 1980s, a supercomputer performs
Performance Metrics and Scalability Analysis. Performance Metrics and Scalability Analysis
Performance Metrics and Scalability Analysis 1 Performance Metrics and Scalability Analysis Lecture Outline Following Topics will be discussed Requirements in performance and cost Performance metrics Work
CS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 Real-Time Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
Parallel Computing. Benson Muite. [email protected] http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage
Parallel Computing Benson Muite [email protected] http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework
A scalable multilevel algorithm for graph clustering and community structure detection
A scalable multilevel algorithm for graph clustering and community structure detection Hristo N. Djidjev 1 Los Alamos National Laboratory, Los Alamos, NM 87545 Abstract. One of the most useful measures
COM 444 Cloud Computing
COM 444 Cloud Computing Lec 3: Virtual Machines and Virtualization of Clusters and Datacenters Prof. Dr. Halûk Gümüşkaya [email protected] [email protected] http://www.gumuskaya.com Virtual
High Performance Computing in CST STUDIO SUITE
High Performance Computing in CST STUDIO SUITE Felix Wolfheimer GPU Computing Performance Speedup 18 16 14 12 10 8 6 4 2 0 Promo offer for EUC participants: 25% discount for K40 cards Speedup of Solver
Hierarchically Parallel FE Software for Assembly Structures : FrontISTR - Parallel Performance Evaluation and Its Industrial Applications
CO-DESIGN 2012, October 23-25, 2012 Peing University, Beijing Hierarchically Parallel FE Software for Assembly Structures : FrontISTR - Parallel Performance Evaluation and Its Industrial Applications Hiroshi
ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING
ACCELERATING COMMERCIAL LINEAR DYNAMIC AND Vladimir Belsky Director of Solver Development* Luis Crivelli Director of Solver Development* Matt Dunbar Chief Architect* Mikhail Belyi Development Group Manager*
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti 2, Nidhi Rajak 3
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti, Nidhi Rajak 1 Department of Computer Science & Applications, Dr.H.S.Gour Central University, Sagar, India, [email protected]
Dynamic Mapping and Load Balancing on Scalable Interconnection Networks Alan Heirich, California Institute of Technology Center for Advanced Computing Research The problems of mapping and load balancing
Big Data Optimization at SAS
Big Data Optimization at SAS Imre Pólik et al. SAS Institute Cary, NC, USA Edinburgh, 2013 Outline 1 Optimization at SAS 2 Big Data Optimization at SAS The SAS HPA architecture Support vector machines
Fire Simulations in Civil Engineering
Contact: Lukas Arnold l.arnold@fz- juelich.de Nb: I had to remove slides containing input from our industrial partners. Sorry. Fire Simulations in Civil Engineering 07.05.2013 Lukas Arnold Fire Simulations
Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications
Portable Parallel Programming for the Dynamic Load Balancing of Unstructured Grid Applications Rupak Biswas MRJ Technology Solutions NASA Ames Research Center Moffett Field, CA 94035, USA rbis was@ nas.nasa.gov
Guide to Partitioning Unstructured Meshes for Parallel Computing
Guide to Partitioning Unstructured Meshes for Parallel Computing Phil Ridley Numerical Algorithms Group Ltd, Wilkinson House, Jordan Hill Road, Oxford, OX2 8DR, UK, email: [email protected] April 17,
Partitioning and Divide and Conquer Strategies
and Divide and Conquer Strategies Lecture 4 and Strategies Strategies Data partitioning aka domain decomposition Functional decomposition Lecture 4 and Strategies Quiz 4.1 For nuclear reactor simulation,
Data Mining Project Report. Document Clustering. Meryem Uzun-Per
Data Mining Project Report Document Clustering Meryem Uzun-Per 504112506 Table of Content Table of Content... 2 1. Project Definition... 3 2. Literature Survey... 3 3. Methods... 4 3.1. K-means algorithm...
Load Balancing Techniques
Load Balancing Techniques 1 Lecture Outline Following Topics will be discussed Static Load Balancing Dynamic Load Balancing Mapping for load balancing Minimizing Interaction 2 1 Load Balancing Techniques
Petascale Software Challenges. William Gropp www.cs.illinois.edu/~wgropp
Petascale Software Challenges William Gropp www.cs.illinois.edu/~wgropp Petascale Software Challenges Why should you care? What are they? Which are different from non-petascale? What has changed since
