Generating Virtual Worlds with Supercomputer Simulations


Generating Virtual Worlds with Supercomputer Simulations
U. Rüde (LSS Erlangen, ruede@cs.fau.de), joint work with many collaborators and students
Lehrstuhl für Informatik 10 (Systemsimulation), Universität Erlangen-Nürnberg
www10.informatik.uni-erlangen.de
December 19, 2007

Overview: Computers as tools for scientists: what is Computational Science? Examples of simulation for science and engineering. Flow simulations. Biomedical applications. Conclusions.

Motivation

How much is a PetaFlops?
10^6 = 1 MegaFlops: Intel 486, 33 MHz PC (~1989)
10^9 = 1 GigaFlops: Intel Pentium III, 1 GHz (~2000). If every person on earth did one operation every 6 seconds, all humans together would have about 1 GigaFlops of performance (less than a current laptop from Aldi).
10^12 = 1 TeraFlops: HLRB-I, 1344 processors (~2000), 2 TFlops; its successor HLRB-II: 63 TFlops.
10^15 = 1 PetaFlops: more than 250,000 processor cores? (~2008?). If every person on earth ran a 486 PC, we would all together have an aggregate performance of about 6 PetaFlops.
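A quick back-of-envelope check of the two population-scale comparisons above; a minimal sketch, assuming a 2007 world population of about 6.6 billion:

    # Back-of-envelope check of the human/computer comparisons.
    world_population = 6.6e9             # assumed 2007 world population

    # One operation every 6 seconds per person:
    human_flops = world_population / 6.0
    print(f"All humans together: {human_flops:.1e} Flops")    # ~1.1e9, i.e. ~1 GigaFlops

    # Every person running a 1-MegaFlops 486 PC:
    aggregate_486 = world_population * 1e6
    print(f"Everyone with a 486: {aggregate_486:.1e} Flops")  # ~6.6e15, i.e. ~6.6 PetaFlops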

The Two (make that Three) Principles of Science: Theory (mathematical models, differential equations, Newton); Experiments (observation and prototypes, the empirical sciences); and now Computational Science (simulation and optimization, a quantitative virtual reality).

SIAM's Definition of CSE (http://www.siam.org/cse/report.htm): CSE is a broad multidisciplinary area that encompasses applications in science/engineering, applied mathematics, numerical analysis, and computer science. Computer models and computer simulations have become an important part of the research repertoire, supplementing (and in some cases replacing) experimentation. Going from application area to computational results requires domain expertise, mathematical modeling, numerical analysis, algorithm development, software implementation, program execution, analysis, validation, and visualization of results. CSE involves all of this.

SIAM's Definition of CSE (2), especially: what it is NOT. CSE makes use of the techniques of applied mathematics and computer science for the development of problem-solving methodologies and robust tools which will be the building blocks for solutions to scientific and engineering problems of ever-increasing complexity. It differs from mathematics or computer science in that analysis and methodologies are directed specifically at the solution of problem classes from science and engineering, and will generally require detailed knowledge of, or substantial collaboration with, those disciplines. The computing and mathematical techniques used may be more domain specific, and the computer science and mathematics skills needed will be broader. CSE is more than a scientist or engineer using a canned code to generate and visualize results (skipping all of the intermediate steps).

Fluid Flow Simulation: metal foams, nano technology, fancy physics. In collaboration with: Lehrstuhl Werkstoffkunde und Technologie der Metalle, Erlangen (R.F. Singer, C. Körner); Lehrstuhl für Bauinformatik, TU München (E. Rank); Institut für Computeranwendungen im Bauingenieurwesen, TU Braunschweig (M. Krafczyk); Lehrstuhl für Feststoff- und Grenzflächenverfahrenstechnik, Erlangen (W. Peukert, H.-J. Schmid); Computer Graphics Lab, ETH Zürich (M. Pauly).

First Test: Breaking Dam [video]

Falling Drop [video]

Falling Meteor [video]

The Interface between Liquid and Gas: compute only the fluid, with special free-surface conditions on the interface.

Why so compute intensive? Millions to billions of cells (e.g., 1000x1000x1000), thousands to millions of time steps, and hundreds of operations in each cell and time step. The curse of dimensionality!
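Multiplying these three factors shows the scale; a minimal sketch with illustrative (assumed) numbers:

    # Illustrative cost estimate: cells x time steps x operations per cell.
    cells        = 1000**3    # 10^9 cells
    time_steps   = 1e5        # assumed number of time steps
    ops_per_cell = 200        # assumed operations per cell and time step

    total_ops = cells * time_steps * ops_per_cell        # 2e16 operations
    sustained = 1e12                                     # 1 TFlops sustained
    print(f"{total_ops:.1e} ops -> {total_ops / sustained / 3600:.1f} hours at 1 TFlops")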

Visualization: ray tracing with refraction, reflection, and caustics. About 15 minutes per frame, i.e., roughly one day for 4 seconds of animation (about 100 frames at 25 frames per second), which is about the same compute time as the flow simulation itself.

Process Simulation of Foam Production: coalescence, collapse, drying, solidification, etc. are poorly understood. Simulation serves as a tool to understand and control the process.

Rising Bubbles [video]

Simultaneously Rising Bubbles [video]

Experimental Verification [video]. Simulation and experiment: Diploma thesis of N. Thürey.

Foaming Simulation [videos]

Fancy Physics [video]

Moving Nano Particles in a Liquid [video]. K. Iglberger (Master thesis), C. Feichtinger (Diploma thesis).

Bio-medical and Bio-chemical Simulation: blood flow in an aneurysm, HIV protease, bio-electrical fields.

Pulsating Blood Flow in an Aneurysm (patient data set) [video]. Master thesis of Jan Götz. Collaboration with Neuroradiology (Prof. Dörfler, Dr. Richter), image processing, simulation, and fluid mechanics (Prof. Durst).

Bio-Electromagnetic Fields: source localisation. Collaboration with: Chr. Johnson (Univ. of Utah), C. Popa (Ovidius Univ. Constanta), Bart Vanrumste (Univ. of Canterbury, New Zealand), G. Greiner, F. Fahlbusch (Erlangen), C. Wolters (Münster). Erlangen neurosurgeons at work; view through the operation microscope.

Simulation, or better to do experiments? Source localisation by open-brain measurements versus operation planning with a virtual head model (Chr. Johnson, Utah).

Molecular Dynamics Simulation of HIV Protease [video]. H. Sticht (Institut für Biochemie).

International Master (and PhD) Programme Computational Engineering. What is this about? It is not computer science, it is not mathematics, and it is not a conventional engineering field; it is an interdisciplinary combination of all three, the foundation of future science. Master program in Erlangen; honours option (elite program) jointly with TU Munich. Information: http://www9.informatik.uni-erlangen.de/ce/

Acknowledgements. Collaborators in Erlangen: WTM, LSE, LSTM, LGDV, RRZE, Neurozentrum, Radiologie, etc.; especially for foams: C. Körner (WTM). International: Utah, Technion, Constanta, Ghent, Boulder, München, Zürich, ... Dissertation projects: U. Fabricius (AMG methods and software engineering for parallelization), C. Freundl (parallel expression templates for PDE solvers), K. Iglberger (rigid body dynamics), J. Götz (LBM, blood flow), T. Gradl (parallel multigrid), ... and 8 more; 25 Diplom/Master theses and Studien/Bachelor theses. Especially for performance analysis and optimization of the LBM: J. Wilke, K. Iglberger, S. Donath, B. Gmeiner, ... and 23 more. Funding: KONWIHR, DFG, NATO, BMBF, Elitenetzwerk Bayern; Bavarian Graduate School in Computational Engineering (with TUM, since 2004); special international PhD program "Identifikation, Optimierung und Steuerung für technische Anwendungen" (with Bayreuth and Würzburg) since Jan. 2006.

Thank you for your interest! Questions?

Part II-a: Towards Scalable FE Software. Scalable Algorithms: Multigrid

What is Multigrid? It has nothing to do with grid computing. It is a general multi-scale methodology (actually the original one) with many different applications, developed from the 1970s onward. It is useful, e.g., for solving elliptic PDEs, i.e., large sparse systems of equations, iteratively, with a convergence rate independent of the problem size and asymptotically optimal complexity -> algorithmic scalability! It can solve, e.g., a 2D Poisson problem in ~30 operations per grid point, and it parallelizes efficiently, if one knows how to do it. It is the best (maybe the only?) basis for fully scalable FE solvers.

Multigrid: V-Cycle. Goal: solve A_h u_h = f_h using a hierarchy of grids. On each level: relax (smooth), compute the residual, restrict it to the coarser grid, solve there by recursion, interpolate the correction back, and correct the fine-grid approximation; see the sketch below.
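A minimal V-cycle sketch for the 1D Poisson problem -u'' = f in Python/NumPy, with weighted Jacobi smoothing, full-weighting restriction, and linear interpolation; this illustrates the recursive structure only, not the HHG implementation:

    import numpy as np

    def relax(u, f, h, sweeps=2, omega=2/3):
        """Weighted Jacobi smoothing for -u'' = f with zero Dirichlet BCs."""
        for _ in range(sweeps):
            u[1:-1] += omega * 0.5 * (u[:-2] + u[2:] + h*h*f[1:-1] - 2*u[1:-1])
        return u

    def residual(u, f, h):
        r = np.zeros_like(u)
        r[1:-1] = f[1:-1] - (2*u[1:-1] - u[:-2] - u[2:]) / (h*h)
        return r

    def restrict(r):
        """Full weighting onto the next coarser grid (grid sizes 2^k + 1)."""
        return np.concatenate(([0.0],
                               0.25*r[1:-2:2] + 0.5*r[2:-1:2] + 0.25*r[3::2],
                               [0.0]))

    def prolong(e):
        """Linear interpolation onto the next finer grid."""
        fine = np.zeros(2*len(e) - 1)
        fine[::2] = e
        fine[1::2] = 0.5 * (e[:-1] + e[1:])
        return fine

    def v_cycle(u, f, h):
        if len(u) <= 3:                          # coarsest grid: solve directly
            u[1] = 0.5 * (h*h*f[1] + u[0] + u[2])
            return u
        u = relax(u, f, h)                       # pre-smoothing
        e = v_cycle(np.zeros((len(u)+1)//2), restrict(residual(u, f, h)), 2*h)
        u += prolong(e)                          # coarse-grid correction
        return relax(u, f, h)                    # post-smoothing

    # Usage: a few V-cycles reduce the error to the discretization level.
    n = 256; h = 1.0/n
    x = np.linspace(0.0, 1.0, n+1)
    f = np.pi**2 * np.sin(np.pi*x)               # exact solution: sin(pi x)
    u = np.zeros(n+1)
    for _ in range(10):
        u = v_cycle(u, f, h)
    print(np.max(np.abs(u - np.sin(np.pi*x))))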

Part II-b: Towards Scalable FE Software. Scalable Architecture: Hierarchical Hybrid Grids

Hierarchical Hybrid Grids (HHG). An unstructured input grid resolves the geometry of the problem domain; patch-wise regular refinement generates nested grid hierarchies that are naturally suitable for geometric multigrid algorithms. New: modify the storage formats and the operations on the grid to exploit the regular substructures. Does an unstructured grid with 100 000 000 000 elements make sense? HHG: ultimate parallel FE performance!

HHG refinement example: input grid

HHG refinement example: refinement level one

HHG refinement example: refinement level two

HHG refinement example: structured interior

HHG refinement example: edge interior

Common HHG Misconceptions. HHG is not just another block-structured grid approach: it is more flexible (it admits unstructured, hybrid input grids). HHG is not just another unstructured geometric multigrid package: it achieves better performance, since a fully unstructured treatment of the regular regions would not deliver that performance.

Parallel HHG Framework: Design Goals. To realize good parallel scalability: minimize latency by reducing the number of messages that must be sent; optimize for high-bandwidth interconnects by using large messages; avoid local copying into MPI buffers.

HHG for Parallelization: use the regular HHG patches to partition the domain.

HHG Parallel Update Algorithm:

    for each vertex do
        apply operation to vertex
    end for
    update vertex primary dependencies

    for each edge do
        copy from vertex interior
        apply operation to edge
        copy to vertex halo
    end for
    update edge primary dependencies

    for each element do
        copy from edge/vertex interiors
        apply operation to element
        copy to edge/vertex halos
    end for
    update secondary dependencies

Part II-c: Towards Scalable FE Software. Performance Results

Single-processor HHG performance on Itanium, for the relaxation of a tetrahedral finite element mesh. [figure]

HHG: Parallel Scalability

    #Procs   #DOFs (x10^6)   #Els (x10^6)   #Input Els    GFLOP/s    Time [s]
        64            2,144         12,884         6144     100/75          68
       128            4,288         25,769        12288    200/147          69
       256            8,577         51,539        24576    409/270          76
       512           17,167        103,079        49152    762/545          75
      1024           17,167        103,079        49152  1,456/964          43

Parallel scalability of a Poisson problem discretized by tetrahedral finite elements, on an SGI Altix (Itanium-2, 1.6 GHz). B. Bergen, F. Hülsemann, U. Rüde: Is 1.7 x 10^10 unknowns the largest finite element system that can be solved today?, SuperComputing, Nov. 2005. See also: ISC Award 2006 for Application Scalability.

Part III-a: Free Surface Flow Simulation. The Lattice Boltzmann Method

Free surface flow: Breaking Dam [video]

The Lattice Boltzmann Method (2): a weakly compressible approximation of the Navier-Stokes equations. Easy to implement; applicable for small Mach numbers (< 0.1); easy to adapt, e.g., to complicated or time-varying geometries, free surfaces, and additional physical and chemical effects.

The Lattice Boltzmann Method (3): a real-valued representation of particle distributions with discrete velocities and positions. The algorithm proceeds in two steps: stream and collide.

Fluid Cell Treatment. The algorithm proceeds in two steps: Stream: advect the fluid elements (copy the distribution functions, DFs, to the neighboring cells). Collide: compute the collisions of the fluid molecules. (Shown as a sequence of animation slides.)

The Collide Step accounts for the collisions of the particles during their movement: the equilibrium velocities and the velocities obtained from streaming are weighted against each other depending on the fluid viscosity.

Stream/Collide: LBM in Equations. The equation images on this slide did not survive the transcription; the standard BGK stream/collide update they show is

$f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\, t + \Delta t) = f_i(\mathbf{x}, t) - \frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{eq}(\rho, \mathbf{u}) \right]$

with $\rho = \sum_i f_i$ and $\rho\mathbf{u} = \sum_i \mathbf{e}_i f_i$, and the equilibrium DF

$f_i^{eq} = w_i\, \rho \left[ 1 + \frac{3\, \mathbf{e}_i \cdot \mathbf{u}}{c^2} + \frac{9\, (\mathbf{e}_i \cdot \mathbf{u})^2}{2 c^4} - \frac{3\, \mathbf{u}^2}{2 c^2} \right]$
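A minimal D2Q9 stream-and-collide sketch of exactly these equations in Python/NumPy (periodic boundaries, lattice units); an illustrative sketch, not the group's production code:

    import numpy as np

    # D2Q9 lattice: discrete velocities e_i and weights w_i
    e = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
    w = np.array([4/9] + [1/9]*4 + [1/36]*4)
    tau = 0.8                                   # relaxation time (sets the viscosity)

    def equilibrium(rho, u):
        """Equilibrium DFs f_i^eq(rho, u) in lattice units (c_s^2 = 1/3)."""
        eu = np.einsum('id,xyd->xyi', e, u)     # e_i . u for every cell
        uu = np.einsum('xyd,xyd->xy', u, u)     # u . u for every cell
        return w * rho[..., None] * (1 + 3*eu + 4.5*eu**2 - 1.5*uu[..., None])

    def stream_collide(f):
        rho = f.sum(axis=-1)                                   # density
        u = np.einsum('xyi,id->xyd', f, e) / rho[..., None]    # velocity
        f -= (f - equilibrium(rho, u)) / tau                   # collide (BGK)
        for i, (ex, ey) in enumerate(e):                       # stream: shift DFs
            f[:, :, i] = np.roll(np.roll(f[:, :, i], ex, axis=0), ey, axis=1)
        return f

    # Usage: start from rest with a small density perturbation.
    nx, ny = 64, 64
    rho0 = np.ones((nx, ny)); rho0[nx//2, ny//2] += 0.01
    f = equilibrium(rho0, np.zeros((nx, ny, 2)))
    for _ in range(100):
        f = stream_collide(f)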

Stability & Turbulence Modelling. Smagorinsky subgrid model, similar to the approach used in Navier-Stokes solvers: model the subgrid-scale vortices by locally changing the viscosity. Implementation for the LBM: the Reynolds stress tensor is computed for each cell, and only the collision operator changes. About 20% slowdown, but a significant net gain due to the decreased resolution requirements.
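A sketch of how the collision operator changes; the per-cell effective relaxation time below follows one common LBM Smagorinsky formulation (Hou et al. 1996). The constant C is an assumption and the exact coefficients vary between papers, so check them against the literature before use:

    import numpy as np

    C = 0.1                        # assumed Smagorinsky constant

    def effective_tau(f, feq, tau0, e):
        """Per-cell relaxation time with Smagorinsky eddy viscosity (sketch)."""
        fneq = f - feq
        # Non-equilibrium momentum flux: Pi_ab = sum_i e_ia e_ib (f_i - f_i^eq)
        Pi = np.einsum('xyi,ia,ib->xyab', fneq, e.astype(float), e.astype(float))
        Q = np.sqrt(np.einsum('xyab,xyab->xy', Pi, Pi))       # stress norm
        nu0 = (tau0 - 0.5) / 3.0                              # molecular viscosity
        # Strain-rate magnitude from Q (Hou et al. 1996 formulation):
        S = (np.sqrt(nu0**2 + 18.0 * C**2 * Q) - nu0) / (6.0 * C**2)
        return 3.0 * (nu0 + C**2 * S) + 0.5                   # tau = 3 nu_total + 1/2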

Falling Drop with Turbulence Model [video]

Falling Drop with Turbulence Model (slower) [video]

Part III-b: Free Surface Flow Simulation. Volume of Fluids

Free Surfaces with the LBM. Metal foams contain huge gas volumes, so we simulate and track only the fluid motion and compute boundary conditions at the free surface. Three cell types: empty/gas, fluid, interface.

Boundary Conditions. Problem: at interface cells, the distribution functions that would stream in from the gas side are missing after the stream step! They are reconstructed such that the macroscopic boundary conditions are satisfied. Körner et al.: Lattice Boltzmann Model for Free Surface Flow, Journal of Computational Physics.
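The reconstruction in Körner et al. replaces each missing DF by an equilibrium-based expression (quoted here as I recall it from the cited paper, so treat the exact form as to-be-checked): for a link direction $i$ pointing from the gas into the interface cell, with opposite direction $\bar{i}$, gas density $\rho_G$ set by the bubble pressure, and interface-cell velocity $\mathbf{u}$,

$f_{\bar{i}}(\mathbf{x}, t + \Delta t) = f_i^{eq}(\rho_G, \mathbf{u}) + f_{\bar{i}}^{eq}(\rho_G, \mathbf{u}) - f_i(\mathbf{x}, t)$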

Free Surface Simulations, Algorithmic Overview (see the sketch after this list): before the stream step, compute the mass exchange across cell boundaries for the interface cells; calculate the bubble volumes and pressures; compute the surface curvature for the surface tension; change the topology if interface cells become full or empty, keeping the layer of interface cells closed.
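A structural sketch of one free-surface time step following this overview; every helper name here is hypothetical, introduced only to mirror the steps above:

    def free_surface_step(grid):
        """One free-surface LBM time step (structure only, hypothetical API)."""
        for cell in grid.interface_cells():
            exchange_mass(cell)                  # mass flux across cell boundaries
        update_bubble_volumes_and_pressures(grid)
        compute_surface_curvature(grid)          # input for the surface tension
        stream(grid)                             # standard LBM stream step
        collide(grid)                            # standard LBM collide step
        for cell in grid.interface_cells():
            if cell.filled() or cell.emptied():  # topology change:
                convert_cell(cell)               # keep the interface layer closed
                redistribute_mass(cell)          # push excess mass to neighbors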

Free Surface Cell Conversions: an emptied interface cell becomes a gas cell; a filled interface cell becomes a fluid cell. Guarantee a closed layer of interface cells and redistribute the excess mass in the neighborhood.

Curvature Calculation (version I). Alternative approaches: integrate the normals over the surface (weighted triangles), or level set methods (track the surface as an implicit function).

Surface Tension (version 2). Marching-cubes surface triangulation; compute a curvature for each triangle as $\kappa = \frac{1}{2} \frac{\delta A}{\delta V}$, where $\delta A$ is the change of the triangle area when its vertices are displaced along the vertex normals $\mathbf{n}_1, \mathbf{n}_2, \mathbf{n}_3$, and $\delta V$ is the volume swept out by the triangle (the detailed expression on the slide is garbled in the transcription). Associate with each LBM cell the average curvature of its triangles. Complicated, but it beats level sets for our applications (mass conservation).

Part III-c: Free Surface Flow Simulation. Application: Metal Foam

Towards Simulating Metal Foams: bubble growth, coalescence, collapse, drainage, rheology, etc. are still poorly understood. Simulation is a tool to better understand, control, and optimize the process.

Rising Bubbles [video]

More Rising Bubbles [video]

Simulation Verification by Experiment [video]. Simulation and experiment: Diploma thesis of N. Thürey.

Foaming Simulation 1 [video]

Numerical Experiment: Single Rising Bubble

Part III-d: Free Surface Flow Simulation. Parallel Performance

Parallelization of the Standard LBM Code: scalability on the Hitachi SR8000-F1. Largest simulation: 1.08 x 10^9 cells, 370 GByte of memory. Communication costs matter because of the large data volume (64 MByte); efficiency ~75%. Dissertation T. Pohl (2006).

Parallelization of the Free-Surface LBM Code, compared with the standard LBM:

    Standard LBM                 Free-surface LBM
    1 sweep through the grid     5 sweeps through the grid
    1 row of ghost nodes         4 rows of ghost nodes

The extra sweeps handle: cell type changes, the closed boundary layer for bubbles, the initialization of modified cells, and the mass balance correction.

Performance on the SR 8000: standard LBM code vs. free-surface LBM code. Performance is lousy on a single node! Conditionals: 2.9 for the standard LBM vs. 51 for the free-surface LBM. Pentium 4: almost no degradation (~10%); SR 8000: enormous degradation (its pseudo-vector pipelining needs predictable jumps).

Parallel Performance on the LSS Cluster (Fujitsu-Siemens). [figure]

Part III-e: Free Surface Flow Simulation. Visualization and Animation

Adaptive Grids: Performance [video]. Speed-up: a factor of 2-4 for larger resolutions; insignificant overhead for small resolutions.

Example: Coupled Simulations [video]

Physically Based Animation: special effects, e.g., for computer-generated movies. A realistic appearance is necessary, but only where it is absolutely necessary -> control the fluid (or other) simulations. Examples of fluid simulations in movies: Harry Potter 4 (ship scene), Ice Age 2 (throughout), Poseidon.

Simulations with Fluid Control [video]

Part IV: Outlook

Acknowledgements. Collaborators in Erlangen: WTM, LSE, LSTM, LGDV, RRZE, Neurozentrum, Radiologie, etc.; especially for foams: C. Körner (WTM). International: Utah, Technion, Constanta, Ghent, Boulder, München, Zürich, ... Dissertation projects: U. Fabricius (AMG methods and software engineering for parallelization), C. Freundl (parallel expression templates for PDE solvers), J. Härtlein (expression templates for FE applications), N. Thürey (LBM, free surfaces), T. Pohl (parallel LBM), ... and 6 more; 19 Diplom/Master theses and Studien/Bachelor theses. Especially for performance analysis and optimization of the LBM: J. Wilke, K. Iglberger, S. Donath, ... and 23 more. Funding: KONWIHR, DFG, NATO, BMBF, Elitenetzwerk Bayern; Bavarian Graduate School in Computational Engineering (with TUM, since 2004); special international PhD program "Identifikation, Optimierung und Steuerung für technische Anwendungen" (with Bayreuth and Würzburg) since Jan. 2006.

The talk is over. Please wake up!