Explicit Early-Z Culling for Efficient Fluid Flow Simulation and Rendering

Similar documents
Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions

CUBE-MAP DATA STRUCTURE FOR INTERACTIVE GLOBAL ILLUMINATION COMPUTATION IN DYNAMIC DIFFUSE ENVIRONMENTS

Interactive Level-Set Deformation On the GPU

Graphics. Computer Animation 고려대학교 컴퓨터 그래픽스 연구실. kucg.korea.ac.kr 1

GPU Shading and Rendering: Introduction & Graphics Hardware

Fluid Dynamics and the Navier-Stokes Equation

Recent Advances and Future Trends in Graphics Hardware. Michael Doggett Architect November 23, 2005

An Extremely Inexpensive Multisampling Scheme

GPU Architecture. Michael Doggett ATI

Image Processing and Computer Graphics. Rendering Pipeline. Matthias Teschner. Computer Science Department University of Freiburg

Writing Applications for the GPU Using the RapidMind Development Platform

Interactive Level-Set Segmentation on the GPU

OpenGL Performance Tuning

OpenEXR Image Viewing Software

Hardware-Aware Analysis and Optimization of Stable Fluids

Silverlight for Windows Embedded Graphics and Rendering Pipeline 1

Lecture Notes, CEng 477

Visualizing Data: Scalable Interactivity

Using Photorealistic RenderMan for High-Quality Direct Volume Rendering

3D Interactive Information Visualization: Guidelines from experience and analysis of applications

How To Make A Texture Map Work Better On A Computer Graphics Card (Or Mac)

Multiphase Flow - Appendices

Visualization and Feature Extraction, FLOW Spring School 2016 Prof. Dr. Tino Weinkauf. Flow Visualization. Image-Based Methods (integration-based)

Blender Notes. Introduction to Digital Modelling and Animation in Design Blender Tutorial - week 9 The Game Engine

Computer Graphics Hardware An Overview

Fast Fluid Dynamics Simulation on the GPU

Differential Balance Equations (DBE)

Dimensional analysis is a method for reducing the number and complexity of experimental variables that affect a given physical phenomena.

How To Teach Computer Graphics

Shader Model 3.0. Ashu Rege. NVIDIA Developer Technology Group

Hardware design for ray tracing

Advanced Graphics and Animations for ios Apps

Computer Applications in Textile Engineering. Computer Applications in Textile Engineering

Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ.

TESLA Report

Volume visualization I Elvins

Interactive simulation of an ash cloud of the volcano Grímsvötn

Optimizing Unity Games for Mobile Platforms. Angelo Theodorou Software Engineer Unite 2013, 28 th -30 th August

DATA VISUALIZATION OF THE GRAPHICS PIPELINE: TRACKING STATE WITH THE STATEVIEWER

Advanced CFD Methods 1

An Interactive Visualization Tool for the Analysis of Multi-Objective Embedded Systems Design Space Exploration

Computer Graphics CS 543 Lecture 12 (Part 1) Curves. Prof Emmanuel Agu. Computer Science Dept. Worcester Polytechnic Institute (WPI)

Course Overview. CSCI 480 Computer Graphics Lecture 1. Administrative Issues Modeling Animation Rendering OpenGL Programming [Angel Ch.

NVPRO-PIPELINE A RESEARCH RENDERING PIPELINE MARKUS TAVENRATH MATAVENRATH@NVIDIA.COM SENIOR DEVELOPER TECHNOLOGY ENGINEER, NVIDIA

Flame On: Real-Time Fire Simulation for Video Games. Simon Green, NVIDIA Christopher Horvath, Pixar

Making natural looking Volumetric Clouds In Blender 2.48a

Dynamic Resolution Rendering

Example Chapter 08-Number 09: This example demonstrates some simple uses of common canned effects found in popular photo editors to stylize photos.

GPU Point List Generation through Histogram Pyramids

Visualization of 2D Domains

Data Visualization Using Hardware Accelerated Spline Interpolation

GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications

Cork Education and Training Board. Programme Module for. 3 Dimensional Computer Graphics. Leading to. Level 5 FETAC

Computers in Film Making

Radeon HD 2900 and Geometry Generation. Michael Doggett

Introduction to COMSOL. The Navier-Stokes Equations

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms

TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW

Introduction GPU Hardware GPU Computing Today GPU Computing Example Outlook Summary. GPU Computing. Numerical Simulation - from Models to Software

Comp 410/510. Computer Graphics Spring Introduction to Graphics Systems

ART 269 3D Animation Fundamental Animation Principles and Procedures in Cinema 4D

INTRODUCTION TO RENDERING TECHNIQUES

L20: GPU Architecture and Models

Optimizing AAA Games for Mobile Platforms

Animation. Persistence of vision: Visual closure:

Image Synthesis. Transparency. computer graphics & visualization

FREE FALL. Introduction. Reference Young and Freedman, University Physics, 12 th Edition: Chapter 2, section 2.5

GPU(Graphics Processing Unit) with a Focus on Nvidia GeForce 6 Series. By: Binesh Tuladhar Clay Smith

Computer Graphics Global Illumination (2): Monte-Carlo Ray Tracing and Photon Mapping. Lecture 15 Taku Komura

2.5 Physically-based Animation

How To Run A Factory I/O On A Microsoft Gpu 2.5 (Sdk) On A Computer Or Microsoft Powerbook 2.3 (Powerpoint) On An Android Computer Or Macbook 2 (Powerstation) On

CSE 564: Visualization. GPU Programming (First Steps) GPU Generations. Klaus Mueller. Computer Science Department Stony Brook University

Interactive Visualization of Magnetic Fields

Color Balancing Techniques

Differential Relations for Fluid Flow. Acceleration field of a fluid. The differential equation of mass conservation

Accelerating Wavelet-Based Video Coding on Graphics Hardware

Introductory FLUENT Training

Parallelism and Cloud Computing

mouse (or the option key on Macintosh) and move the mouse. You should see that you are able to zoom into and out of the scene.

Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004

Geometric Constraints

Implementation of Canny Edge Detector of color images on CELL/B.E. Architecture.

Compiling the Kohonen Feature Map Into Computer Graphics Hardware

Real-Time Fluid Dynamics for Games. Jos Stam Research Alias wavefront

Multiresolution 3D Rendering on Mobile Devices

Voronoi Treemaps in D3

Real-Time Fluid Dynamics for Games

C O M P U C O M P T U T E R G R A E R G R P A H I C P S Computer Animation Guoying Zhao 1 / 66 /

Graphical displays are generally of two types: vector displays and raster displays. Vector displays

GPGPU accelerated Computational Fluid Dynamics

Eu-NORSEWInD - Assessment of Viability of Open Source CFD Code for the Wind Industry

Investigation of Color Aliasing of High Spatial Frequencies and Edges for Bayer-Pattern Sensors and Foveon X3 Direct Image Sensors

Paper Pulp Dewatering

1 The basic equations of fluid dynamics

COMPARISON OF SOLUTION ALGORITHM FOR FLOW AROUND A SQUARE CYLINDER

Data Parallel Computing on Graphics Hardware. Ian Buck Stanford University

Digitization of Old Maps Using Deskan Express 5.0

A NEW METHOD OF STORAGE AND VISUALIZATION FOR MASSIVE POINT CLOUD DATASET

The Evolution of Computer Graphics. SVP, Content & Technology, NVIDIA

Transcription:

Explicit Early-Z Culling for Efficient Fluid Flow Simulation and Rendering ATI Research Technical Report August 2, 24 Pedro V. Sander Natalya Tatarchuk Jason L. Mitchell ATI Research, Inc. 62 Forest Street Marlborough, MA 1752, USA ATI Research, Inc. 62 Forest Street Marlborough, MA 1752, USA ATI Research, Inc. 62 Forest Street Marlborough, MA 1752, USA (a) Brute force (b) Early-z Figure 1: Comparison of fluid flow simulations with and without our early-z acceleration techniques. Both simulations use a 512 512 grid (cropped for the figure) and render at 53fps. Abstract We present an efficient algorithm for simulation and rendering of fluid flow on graphics hardware. Our algorithm takes advantage of explicit early-z culling to reduce the amount of computation during both the simulation and rendering steps. Our approach is straightforward to implement and speeds up our simulations in some cases by a factor of three. 1. Introduction In recent years, graphics processors have been applied to broader areas of computation, such as simulation of natural phenomena. Due to their highly parallel nature and increasingly general computational models, GPUs are well matched with the demands of fluid flow simulation. In this paper we present acceleration techniques for simulation and rendering of fluid flow. We have implemented a fluid simulation on the GPU based on the solution of Navier-Stokes equations which uses explicit early-z culling as a means of avoiding certain unnecessary computations. More specifically, we present a culling technique for the projection step of the fluid flow simulation and a culling technique for efficiently rendering fluid flow to the screen. Fluid flow simulation naturally maps to current graphics hardware by performing the different steps of the simulation using full screen quadrilaterals and doing all the work in pixel shaders (see Harris [23b] for a thorough description). Prior to execution of a pixel shader, the graphics hardware performs a check of the interpolated z value against the z value in the z buffer. This occurs for any pixels which are actually going to use the primitive s interpolated z (rather than compute z in the pixel shader itself). This additional check provides not only an added efficiency win when using long, costly pixel shaders, but also provides a form of pixel-level flow control in specific situations. The z buffer can be thought of as containing condition codes governing the execution of expensive pixel shaders. Inserting inexpensive rendering passes whose only job is to appropriately set the condition codes for subsequent expensive rendering passes can increase performance significantly. This approach is known as early-z culling in real-time rendering. Page 1 of 6

Our techniques are based on the observation that fluid flow is often concentrated in sub-regions of the simulation grid. The early-z optimizations that we employ significantly reduce the amount of computation on regions that have little to no fluid density or pressure, saving computational resources for regions with higher flow concentration, or for rendering other objects in the scene. This paper is structured as follows. Section 2 describes previous work on fluid simulation and how it relates to our approach. Section 3 outlines our algorithm. Sections 4 and 5 describe the two acceleration techniques that we employ. In Section 6, we present results of our algorithm. In Section 7, we conclude and present directions for future work. 2. Previous work Early methods for fluid simulation were based on explicit integration schemes. These methods do not produce a stable simulation unless the simulation timestep is very small. Stam [1999] introduced an unconditionally stable model for fluid simulation. Stam s approach uses a semi- Lagrangian integration method and a projection step to ensure incompressibility [Chorin 1967]. This solver allows for much higher timesteps, resulting in faster, real-time stable simulations. Many recent papers on fluid simulation for different natural phenomena, such as smoke [Fedkiw et al. 21], fire [Nguyen et al. 22], and clouds [Harris et al. 23], are partly based on this solver. We also use this solver as a basis for our acceleration methods for both 2D and 3D simulation. For details on Stam s stable Navier- Stokes solver, we refer the reader to the paper [Stam 1999]. For a thorough description of fluid simulation on graphics hardware, we recommend Harris [23b]. Recently presented optimization techniques simulate fluid flow more efficiently by using the graphics hardware. Harris et al. [23] use a red-black Gauss-Seidel relaxation method on their fluid simulation as a vectorized optimization technique. They also achieve faster rendering rates by amortizing their simulation over several frames. In their 3D solver, they propose using a flat 3D texture that stores all slices of a 3D volume in order to improve the efficiency of their simulation. Rasmussen et al. [23] describe an interpolation method to create high-resolution 3D fields from a small number of 2D fields, significantly reducing computation and memory requirements. Several methods have been proposed to approximate light scattering and other visual phenomena in order to achieve realistic results in less time (e.g., Fedkiw et al. [21] and Harris et al. [23]). To our knowledge, the optimization methods to date do not dynamically adapt based on the fact that fluid is often localized, or at least, more highly concentrated in certain areas of the uniform simulation grid. This paper presents an acceleration method that takes advantage of this observation to dynamically reduce computation on regions that have little to no fluid density or pressure. Our method is simple to implement and can be used in conjunction with any of the methods outlined above. 3. Algorithm overview Next, we outline the steps of our algorithm, which simulates incompressible fluid flow. We perform our simulations with no viscosity, so the diffusion step is omitted. All of the steps of our fluid simulation algorithm are implemented on the GPU using HLSL pixel shaders and are executed by rendering full-screen quadrilaterals to renderable textures. For additional details on how such a flow algorithm is implemented on the GPU, please refer to Harris [23b]. First we will describe how the early z approach optimizes the 2D fluid flow simulation, and later describe how to extend this approach to 3D fluid flow. The first two passes insert flow into the density and velocity buffers based on mouse input: InsertVelocity() InsertDensity() Next, both the density and velocity buffers are advected based on the content of the velocity buffer: AdvectVelocity() AdvectDensity() Finally the projection step is computed and the velocity buffer is updated in order to remain mass conserving. The brute-force algorithm for computing pressure is as follows: ComputeDivergence() for ( int i = ; i < n; i++ ) UpdatePressure() SubtractGradient() The UpdatePressure() pass is the bottleneck of the algorithm, as it has to be executed approximately 3 times in order to yield visually pleasing results. Our algorithm adds a new pass to prime the z buffer, and employs early-z culling during the pressure computation step: ComputeDivergence() SetZBufferUsingPressureFromPreviousIteration() for(int i = ; i < n; i++) UpdatePressureWithEarlyZCulling() SubtractGradient() The details of the algorithm can be found in Section 4. After each step of the simulation, we render the density buffer to the screen in a single pass: RenderDensityToScreen() Due to current hardware limitations, bilinear interpolation of 32-bit floating point buffers must be performed in the pixel shader. This can be very costly if the screen resolution is significantly higher than the simulation grid resolution. In order to reduce the rendering cost, we again Page 2 of 6

employ early-z culling to skip this computation on cells that have little or no density: SetZBufferUsingDensity() RenderDensityToScreenWithEarlyZCulling() Additional details of this particular optimization can be found in Section 5. (a) Pressure buffer for simulation culling (b) Culled pixels during rendering tagged in red Figure 2: Visualization of early-z culling. 4. Projection optimization In this section, we describe an optimization that is performed during the projection step, the most expensive step of the simulation. This step is performed as a series of rendering passes to solve a linear system using a relaxation method. This optimization could also be considered for the diffusion step when simulating highly viscous fluid. When approximating the solution to this linear system, the higher the number of iterations (rendering passes), the more accurate the result. Instead of performing the same number of passes on all cells, we perform more passes on regions where the pressure is higher and fewer passes on regions with little or no pressure. This is accomplished by performing an additional inexpensive rendering pass that sets the z value of each cell in the simulation based on the maximum value of that cell and a four of its nearby cells from the pressure buffer of the previous iteration of the simulation. We simply set the z value for a particular cell x to be depth = saturate(αp + β) where P is the maximum pressure among the neighbors of x, and α and β are constants. We achieved best results with α = 2. and β =.1. The β value ensures that, even where the pressure is very small, at least some pressure computation passes will be performed. We take into account the pressure of the neighbors of x, because each relaxation step computes the new pressure for a given cell as a function of their neighbors. Thus, cells with high pressure may significantly increase the pressure of cells around it. In our experiments we looked at neighbors that were 2 cells away in each of the four directions. After priming the z buffer, we perform the pressure computation passes. In order to reduce this computation, we set the depth compare state to less than or equal and linearly increase the z value on each of the projection passes. On the first pass, the z value is set to 1/N, where N is the total number of pressure passes. On the second pass it is 2/N, and so on. Therefore, on the first pass, all cells are processed (because of our β value), and on subsequent passes, the number of cells that are processed gradually decreases. Figure 2a shows the pressure buffer which is used to set the z buffer that culls the projection computation. Darker values indicate regions of lower pressure, where fewer iterations need to be performed. The passes that enforce boundary conditions on the pressure computation are not culled. However, since they only affect the pixels on the grid boundaries, it does not hinder the performance of the heavily fill-bound simulation. Note that this optimization is an approximation and does not necessarily yield physically correct results. However, the visual improvement of using this method is evident, and performing 5 pressure computation passes with this culling technique yields more realistic results than performing 1 pressure computation passes with the brute force algorithm in the same amount of time. (a) 3D view (b) Density buffer (c) Pressure buffer Figure 3: Visualization of 3D flow and the density and pressure buffers. 4.1. Extension to 3D fluid flow We also extended the above optimization to 3D fluid flow simulation. Harris [23] introduces the idea of simulating 3D flow using a tiled 2D texture (Figure 3ab). This allows each step of the simulation to be performed with one single pass for all the slices, without having to switch render targets. The downside is that the texture coordinate computation is a bit more expensive and proper care must be taken at the boundaries of the slices to correctly account for boundary conditions. However, using this technique is more efficient than having to constantly switch render targets. Our optimization naturally extends to 3D flow. As in 2D flow, pressure computation is performed by doing multiple passes to update the pressure buffer (Figure 3c). Similarly, we perform one pass to prime the z buffer based on the pressure of the previous simulation iteration, and then, when performing the pressure computation passes, we perform early-z culling the same way we did with 2D flow. Page 3 of 6

Passes to enforce boundary conditions are rendered as several thin quadrilaterals that tile the texture square. As in the 2D simulation, during these passes, none of the cells are culled. 5. Rendering optimization The next optimization is performed at rendering time. When rendering the 32-bit floating point density buffer to the screen, bilinear interpolation must be performed in the pixel shader. This computation can be expensive, especially if the dimensions of render target are significantly higher than those of the fluid simulation. In order to avoid applying the bilinear interpolation shader to regions of the screen that have very small density, prior to rendering the contents of the density buffer, we perform an inexpensive rendering pass that simply sets the z buffer value to the density of that particular pixel using nearest as the texture lookup filter. When rendering, we set the depth compare state to less than or equal, and set the z value to a small constant ε in the vertex shader (e.g.,.1). Thus, all pixels whose densities are smaller than ε are culled. In Figure 2b, all pixels that were too dark to be visible were culled and are tagged in red for visualization purposes. Figure 1b shows the final rendering without the tagged pixels. Note that this optimization does not directly affect the simulation, since it is only used for rendering. Since less computation time is used for rendering, this optimization does indirectly affect the simulation, because more computation time can be used by the simulation to improve its quality. As shown in the results section, this optimization is particularly useful for applications that zoom in to regions of the flow, or render the result to high resolution buffers. While the projection optimization works well with 3D flow, unfortunately our rendering optimization does not yield an improvement for 3D flow. The cost of an additional pass for each slice, and of switching shaders multiple times is very high. 6. Results In this section, we present some results of applying the optimizations described in the earlier sections. Figure 1 compares a brute-force simulation and a simulation with early-z culling. Both examples simulate and render the 512 512 simulation grid at 53 frames per second. Since the brute-force approach performs the same number of pressure computations passes on all cells, it only manages 1 pressure computation passes. Our early-z method performs somewhere between 5 and 5 pressure computation passes, depending on the value in Figure 2a. Since our optimization allows for a high number of passes on areas of high pressure, it yields a more realistic result. Our method also renders the flow to a 512 512 window using the optimization outlined in Section 5. However, for this example the rendering optimization does not yield a significant improvement, as will be made clear below. Figure 4 graphs the performance of our 2D fluid simulation with and without each of our optimizations. The frame-rate is on the y-axis, while the resolution is on the x-axis (we have data points for 128 128, 256 256, 512 512, and 124 124 simulation grid resolutions). In each case, the screen resolution for the rendered output is the same as the simulation grid resolution. Five curves are plotted. One using the brute-force method, two using the simulation optimization, and two using both the simulation and rendering optimizations. When the optimizations are used, the frame rate is variable, so we use two curves for measuring performance, one without any flow, and one with the entire grid filled with flow. The actual frame rate will be somewhere between these two curves, depending on how much density and pressure is present. As evidenced by the three lowest curves on the graph, the penalty incurred by having the extra pass to set the z buffer is extremely small. On the other hand, if significant portions of the screen have no flow, the savings can be significant with no visual loss in quality. At 128 128, the savings due to the simulation optimization can be up to a factor of two with no visual loss in simulation quality. At 124 124, the savings can be up to a factor of three. The improvement due to the rendering optimization was not significant in this experiment due to the fact that the resolution of the output window is the same as the grid resolution. Since the bulk of the computation happens during the projection step of the simulation, the resolution of the output window would have to be larger than that of the simulation grid for the rendering optimization to yield significant improvement. Figure 5 graphs the same experiment, but with the output window resolution fixed at 124 124. In this case, if the simulation grid has significantly lower resolution, the savings due to the rendering optimization are significant nearly a factor of two for the 128 128 simulation grid. Figures 7 and 8 show different examples in which the pressure buffer is not cleared from one pass to the next (causing extremely swirling-like flow). In Figure 7, the small number of passes in the pressure buffer coupled with a slow frame rate results in a velocity field that is not mass conserving. In contrast, when using our culling techniques, the result is significantly more stable. 3D Flow. Figure 9 shows results applied to a 128 128 16 3D fluid flow simulation. Note that, although the improvement is not as significant as in the 2D simulations, the result using our techniques is more realistic. Blockers. These culling techniques are also very suitable for fluid flow simulations with blockers. Since no computation needs to be performed on most cells that are blocked (approximately half of the cells in Figure 6), these methods can further reduce computational costs. Note that blocked cells that have neighbors that are not blocked cannot be culled and need to be processed in order to yield the proper effect when fluid collides with the blocker. Page 4 of 6

35 Early-z projection+rendering Early-z projection Brute force Early-z projection (fill) Early-z projection+rendering (fill) 3 25 For future work, it would be interesting to further investigate methods to take advantage of the locality of fluid in the simulation. We are currently investigating a method with adaptive grid sample locations. A hierarchical culling approach could also yield significant savings. 2 FPS The main limitation of this approach is that it does not yield a significant improvement to simulations that have a large amount of fluid over the entire simulation grid. But even in the worst case, our simulations will not significantly impair the quality or rendering speed in such settings, as evidenced by Figures 4 and 5. 15 1 5 2 4 6 8 1 12 Resolution (# of cells) Figure 4: Timings of the different culling methods (screen resolution set to simulation resolution) 3 Early-z projection+rendering Early-z projection Brute force Early-z projection (fill) Early-z projection+rendering (fill) 25 FPS 2 15 1 Figure 6: Pink flow colliding with blocker in Van Gogh s Starry Night. 5 2 4 6 8 1 12 Resolution (# of cells) Figure 5: Timings of the different culling methods (124 124 screen resolution) 7. Summary and future work We have presented optimization techniques that take advantage of early-z culling to efficiently simulate and render fluid flow. The methods presented in this paper are straightforward to implement and yield a significant improvement in rendering speed for the same quality, or conversely, an improvement in quality for a given frame rate. Our results demonstrate that we obtain simulations that look more physically accurate than brute-force simulations at a given rendering speed. Our method excels in scenes where flow is concentrated on specific regions of the grid, such as scenes with blockers. References Chorin, A. 1967. A Numerical Method for Solving Incompressible Viscous Flow Problems. Journal of Computational Physics 2, pages 12 26. Fedkiw, R., Stam, J., and Jensen, H. 21. Visual Simulation of Smoke. In Proceedings of SIGGRAPH 21, pages 15 22. Harris, M. J., Baxter, W. V., Scheuermann, T., and Lastra, A. 23. Simulation of cloud dynamics on graphics hardware. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware, pages 92 11. Harris, M. J. 23b. Real-time cloud simulation and rendering. Ph.D. dissertation. University of North Carolina at Chapel Hill,, 23. Nguyen, D., Fedkiw, R., and Jensen, H. 22. Physically Based Modeling and Animation of Fire. In Proceedings of SIGGRAPH 2, pages 736 744. Rasmussen N., Nguyen, D. Q., Geiger, W., and Fedkiw, R. 23. Smoke simulation for large scale phenomena. In Proceedings of SIGGRAPH 23, pages 73-715. Stam, J. 1999. Stable fluids. In Proceedings of SIGGRAPH 1999, pages 121 128. Page 5 of 6

(a) Brute force (3 projection passes) (b) Early-z (up to 15 projection passes) Figure 7: Side by side of a 124 124 2D simulation. Both simulations render at 25fps. The small number of passes on the brute-force example causes the velocity field not to be mass conserving. (a) Brute force (5fps) (b) Early-z (16fps) Figure 8: Side by side of a 124 124 2D simulation. Both simulations have 4 projection passes. The lower frame rate on the brute-force example causes artifacts when flow is inserted at a constant, faster rate (e.g., interactive mouse input). (a) Brute force (3 projection passes) (b) Early-z (up to 3 projection passes) Figure 9: 128 128 16 3D flow simulation. Both examples render at 25fps. Page 6 of 6