Parallel 3D Image Segmentation of Large Data Sets on a GPU Cluster

Aaron Hagan and Ye Zhao
Kent State University

Abstract. In this paper, we propose an inherently parallel scheme for 3D image segmentation of large volume data on a GPU cluster. This method originates from an extended Lattice Boltzmann Model (LBM), and provides a new numerical solution for the level set equation. As a local, explicit and parallel scheme, our method offers several favorable features: (1) it is very easy to implement, with the core program requiring only a few lines of code; (2) curvatures are computed implicitly; (3) the smoothness of the segmentation results can be controlled flexibly; (4) it is strongly amenable to parallel computing, especially on low-cost, powerful graphics hardware (GPU). The parallel computational scheme is well suited for cluster computing, leading to a good solution for segmenting very large data sets.

1 Introduction

Large scale 3D images are becoming very popular in many scientific domains, including medical imaging, biology, and industry. These images are often susceptible to noise during their acquisition. Image segmentation is a post-processing technique that can show clearer results for analysis and registration. This topic has been widely studied in both 2D and 3D image processing and has been explored with a variety of techniques including (but not limited to) region growing, contour evolution, and image thresholding. The method we propose for performing image segmentation is an inherently parallel scheme based on the level set equation. The level set equation is solved with an extended lattice Boltzmann model (LBM), which provides an alternative numerical solution for the equation.
This method has several advantages: it is easy and straightforward to implement, implicitly includes the computation of curvatures, has a parameter that controls the smoothness of the results, and, finally, is parallel, which allows it to be mapped to low-cost graphics hardware in a single GPU or GPU cluster environment.

The level set method uses a partial differential equation (PDE) to model and track how fronts evolve in a discrete domain by maintaining and updating a distance field to the fronts. Previous methods based on the level set formulation discretize the PDE with finite difference operators, which leads to complex numerical computations. The state-of-the-art narrow-band method applies an adaptive strategy in which the level set computation is only performed on a narrow region around the propagating contour. To expedite the narrow band on graphics hardware, Lefohn et al. [1,2] proposed a successful GPU implementation with narrow band packing and virtual memory management that arranges CPU-GPU data communication.

G. Bebis et al. (Eds.): ISVC 2009, Part II, LNCS 5876, pp , © Springer-Verlag Berlin Heidelberg 2009

The proposed method uses a GPU cluster environment to perform the segmentation of large datasets. With simple local operations, this method is a tool that is easily implemented on distributed machines, with minimal data management and communication through the network. Furthermore, it has the ability to handle curvature flows within its explicit computation, leading to a controllable noise reduction effect in the segmentation results.

In detail, previous methods apply the narrow band with the corresponding priority data structure to adaptively propagate fronts to the target regions. A re-initialization of the narrow band is required to maintain a valid distance field. After this step, the new narrow band is packed and reloaded to the GPU. Our method is different in that we do not use a narrow band approach; therefore, the distance field is always valid in the whole domain and no reload is needed. As a result, there is no CPU-GPU crosstalk during segmentation, and the data structures are easy to manage as they reside completely in the graphics hardware.

Abandoning the adaptive strategies of this solver may appear unconventional at first, as memory consumption increases and computations are performed globally. There is, however, less management of the data (in terms of narrow band computing), and future work could develop an adaptive method for this solver. More importantly, this strategy is motivated by the rapidly increasing computational power of GPUs (i.e., speed and memory size). For example, current graphics cards are equipped with up to 4 GB of memory on a single unit.
Cluster systems that contain many GPUs with large memory capacity are becoming available and are being used in many scientific applications. We anticipate that this trend will continue in the future. A further benefit comes from this method's easy implementation, with under 100 lines of CPU and GPU code. A knowledgeable graduate student can implement the program in a short period of time.

In summary, our approach shows that large volumetric data sets can be segmented in parallel on multiple GPUs with fast performance and satisfying results. To the best of our knowledge, this approach is the first to solve level set segmentation of large 3D images on a GPU cluster.

2 Related Work

The level set equation has been used in a wide variety of image processing operations such as noise removal, object detection, and modeling equations of motion [3]. The level set equation can be used to perform segmentation by creating an initial contour surface in the target image and having it evolve to regions of interest, normally defined by target intensity values or gradients that attract the curve [4]. For GPU acceleration, a handful of works [1,5,6] have successfully applied the level set equation to image segmentation by solving it on the GPU.

The LBM method has been used in natural phenomena modeling in computer graphics and visualization [7]. LBM-based diffusion has been used in image processing [8], where an anisotropic 2D image denoising is implemented on the CPU. Recently, Zhao [9] showed how the LBM scheme can be derived to solve volume smoothing, image fairing, and image editing applications. Tölke [10] described the parallel nature of LBM and how it can be mapped to the GPU through the CUDA library for modeling computational fluid dynamics. GPU cluster computing is a rapidly developing research area that has been adopted in many scientific computing tasks. Fan et al. [11] used the classic LBM for fluid modeling with their Zippy programming model for GPU clusters.

3 Introduction to LBM

LBM [12] originates from the cellular automata scheme, which models fictitious particles on a discrete grid where each point of the grid contains a particular lattice structure with links to its neighbors. Lattice structures are identified by labels such as D3Q19 and D3Q7, which give the dimension and the number of links connecting a lattice point to its neighbors. The fictitious particles moving along the links, and their averaging behaviors, were initially used to simulate traditional fluid dynamics. The numerical computing process, derived from microscopic statistical physics, recovers the Navier-Stokes equations governing flow behaviors. The independent variables in the LBM equation are the particle distribution functions of each link from a grid point to one of its neighbors. The particle distribution functions model the probability of a packet of particles streaming across one lattice link to its corresponding neighbor. Between two consecutive steps of the streaming computation, the function is modified by a local relaxation that models inter-particle collisions.
We refer interested readers to a complete physical description [12], and to its usage in visual simulations [7].

LBM Computation. The first step in performing an LBM simulation is to discretize the simulation domain into a grid, and to generate the lattice structure for each grid point. Each grid point has a set of links to its neighbors. During each step of the simulation, collision and streaming computations are performed, which are mathematically described as:

collision: $$f_i(\mathbf{x}, t^*) = f_i(\mathbf{x}, t) - \frac{1}{\tau}\left(f_i(\mathbf{x}, t) - f_i^{eq}(\mathbf{x}, t)\right), \tag{1}$$

streaming: $$f_i(\mathbf{x} + \mathbf{e}_i, t + 1) = f_i(\mathbf{x}, t^*). \tag{2}$$

The local equilibrium particle distribution, $f_i^{eq}$, models collisions as a statistical redistribution of momentum. At a given time step $t$, each particle distribution function, $f_i$, along one link vector $\mathbf{e}_i$ at a lattice point, $\mathbf{x}$, is updated by a relaxation process with respect to $f_i^{eq}$. The collision process is controlled by a relaxation parameter $\tau$, which sets the rate at which the distribution approaches the equilibrium state. After collision, the post-collision result is propagated to $\mathbf{x} + \mathbf{e}_i$, the neighboring lattice point along link $i$. This provides the distribution function value at time step $t + 1$. $f_i^{eq}$ can be defined by the Bhatnagar, Gross, Krook (BGK) model as

$$f_i^{eq}(\rho, \mathbf{u}) = \rho\left(A_i + B_i(\mathbf{e}_i \cdot \mathbf{u}) + C_i(\mathbf{e}_i \cdot \mathbf{u})^2 + D_i \mathbf{u}^2\right), \tag{3}$$

where $A_i$ to $D_i$ are constant coefficients chosen via the geometry of the lattice links and $\rho$ is the fluid density, computed as the accumulation of particle distributions:

$$\rho = \sum_i f_i. \tag{4}$$

The LBM can be easily extended to incorporate additional micro-physics, such as an external force $\mathbf{F}$. This force affects the local particle distribution functions as follows:

$$f_i \leftarrow f_i + \frac{2\tau - 1}{2\tau} B_i (\mathbf{F} \cdot \mathbf{e}_i). \tag{5}$$

By applying Chapman-Enskog analysis [13], the Navier-Stokes equations can be recovered from the equilibrium equation as:

$$\nabla \cdot \mathbf{u} = 0, \tag{6}$$

$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = \nu \Delta \mathbf{u} + \mathbf{F}. \tag{7}$$

Here $\nabla$ denotes the gradient operator $\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$ and $\Delta$ is the Laplacian, $\Delta = \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$.

Extended LBM. Though initially designed for fluid dynamics, the LBM method can be modified to model typical diffusion computations. Equation 3 can be simplified to:

$$f_i^{eq}(\rho) = A_i \rho, \tag{8}$$

which erases the momentum terms and in effect removes the nonlinear advection term of the Navier-Stokes equation, which is not needed for solving diffusion equations. As shown in [9], the parabolic diffusion equation can be recovered by the Chapman-Enskog expansion:

$$\frac{\partial \rho}{\partial t} = \gamma \nabla^2 \rho, \tag{9}$$

where $\gamma$ is a diffusion coefficient defined for a D3Q7 lattice by the relaxation parameter $\tau$ as:

$$\gamma = \frac{1}{6}(2\tau - 1). \tag{10}$$

In this case, we can also include the external force in the same way as in Equation 5. Thus, the modified LBM computation can recover the following equation:

$$\frac{\partial \rho}{\partial t} = \gamma \nabla^2 \rho + F. \tag{11}$$

Using this equation to compute a distance field (replacing $\rho$ by $\phi$), we can recover the level set equation, where $F$ is used to accommodate the speed function and the first term relates to the curvature flow effects.
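The collision and streaming steps of the extended LBM (Equations 1, 2, 4, 8 and 10) can be illustrated with a minimal NumPy sketch. It is not the authors' GPU implementation. Several details are assumptions made for illustration: the names `lbm_step`, `initial_distributions` and `diffusion_coefficient` are hypothetical; the weights $A_i$ are not listed in the text, so the sketch uses zero on the rest link and 1/6 on each axis link, a choice for which the standard BGK diffusion relation yields Equation 10; the scalar source $F$ of Equation 11 is simply spread over the links by weight; and boundaries are periodic.

```python
import numpy as np

# D3Q7 link vectors e_i: one rest link plus the six axis-aligned neighbours.
LINKS = [(0, 0, 0),
         (1, 0, 0), (-1, 0, 0),
         (0, 1, 0), (0, -1, 0),
         (0, 0, 1), (0, 0, -1)]

# Assumed weights A_i (not given in the text): zero on the rest link and 1/6 on
# each axis link. They sum to 1 and are consistent with gamma = (2*tau - 1) / 6.
WEIGHTS = np.array([0.0] + [1.0 / 6.0] * 6).reshape(7, 1, 1, 1)


def diffusion_coefficient(tau):
    """Equation 10: diffusion coefficient of the D3Q7 lattice."""
    return (2.0 * tau - 1.0) / 6.0


def initial_distributions(rho):
    """Start every link at equilibrium for a scalar field rho (Equation 8)."""
    return WEIGHTS * rho


def lbm_step(f, tau, source=None):
    """Advance distributions f of shape (7, nx, ny, nz) by one time step."""
    rho = f.sum(axis=0)              # Equation 4: accumulate over the links
    f_eq = WEIGHTS * rho             # Equation 8: simplified equilibrium
    f_post = f - (f - f_eq) / tau    # Equation 1: BGK collision
    if source is not None:
        # Assumed handling of the scalar source F: distribute it by weight.
        f_post = f_post + WEIGHTS * source
    f_next = np.empty_like(f_post)
    for i, link in enumerate(LINKS):
        # Equation 2: streaming along link i. np.roll wraps around the grid,
        # so this sketch uses periodic boundaries.
        f_next[i] = np.roll(f_post[i], shift=link, axis=(0, 1, 2))
    return f_next
```

A typical use would initialise `f = initial_distributions(phi0)` from a starting field and call `lbm_step` repeatedly, reading the current field back as `f.sum(axis=0)`. Because every operation touches only a grid point and its six neighbours, the same structure maps naturally onto one GPU thread per grid point.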

4 Solving Level Set Equation

A distance field defines how far each point in a domain is from an existing surface, where the distance is signed to distinguish between the inside and the outside of the surface. For a surface $S$, the distance field is a function $\phi : \mathbb{R}^3 \to \mathbb{R}$, defined for $p \in \mathbb{R}^3$ as:

$$\phi(p) = \mathrm{sgn}(p) \cdot \min\{\|p - q\| : q \in S\}. \tag{12}$$

The surface can be considered as the set of points with a zero distance value. In image segmentation, the zero level set starts from an arbitrary starting shape and evolves according to the following level set equation:

$$\frac{\partial \phi}{\partial t} = |\nabla \phi| \left[\alpha D(x) + \gamma \nabla \cdot \frac{\nabla \phi}{|\nabla \phi|}\right], \tag{13}$$

where $\phi$ is the distance and $D(x)$ is the speed function that acts as a driving force to move the evolving level set to target regions, with a user-controlled parameter $\alpha$ (we use 0.01 in the examples). The second term represents curvature flow (smoothing); $\gamma$ determines the level of curvature-based smoothness in the results. For a regular distance field, $|\nabla \phi| = 1$, which reduces the last term to $\gamma \nabla^2 \phi$. Note that $|\nabla \phi| = 1$ in our framework at all steps, since we do not use an adaptive approach and the distance field is valid in the whole domain. Equation 13 is then simply a variant of Equation 11. It shows that the modified LBM computation leads to a new solution to the level set equation, enabling us to use the simple, explicit, parallel computational process for volume segmentation. In this way, our method also has the potential to be applied to other level set based applications.

In our implementation, we apply a simple D3Q7 lattice, which uses less memory and improves performance compared with a traditional fluid solver using a D3Q19 lattice. This is possible because the level set solution does not need to solve the nonlinear advection term of the Navier-Stokes equations, and the D3Q7 lattice provides enough accuracy.

Driving Speed Function. Speed functions are designed to make the evolving front of the zero level set propagate to certain target regions.
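The two inputs that the level set solver of this section needs, an initial signed distance field (Equation 12) and a threshold-based speed function (Equation 14), can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the starting shape is taken to be a sphere, the sign convention (negative inside, positive outside) is an assumption since the text does not fix one, and the speed function is written in the form $D(I) = \epsilon - |I - T|$.

```python
import numpy as np


def sphere_signed_distance(shape, center, radius):
    """Signed distance to a sphere on a regular grid (Equation 12).

    Assumed convention: negative inside the sphere, positive outside.
    """
    axes = [np.arange(n, dtype=float) for n in shape]
    grids = np.meshgrid(*axes, indexing="ij")
    # Euclidean distance of every grid point to the sphere centre.
    dist = np.sqrt(sum((g - c) ** 2 for g, c in zip(grids, center)))
    return dist - radius


def speed(intensity, target, eps):
    """Speed D(I) = eps - |I - T| (Equation 14).

    Positive where the voxel value lies within eps of the target isovalue,
    negative elsewhere, and zero exactly at the edges of that range.
    """
    return eps - np.abs(intensity - target)
```

For example, `sphere_signed_distance((9, 9, 9), (4, 4, 4), 2.0)` is zero on grid points exactly two voxels from the centre, and `speed(volume, 32.0, 5.0)` would be positive only for voxel values between 27 and 37.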
We use a popular approach [3,14] where the speed function is defined by the difference between a target isovalue and a density value at each grid position:

$$D(I) = \epsilon - |I - T|, \tag{14}$$

where $I$ is the voxel/pixel value at the grid position, and $T$ represents the target density isovalue that the front should evolve to. As the front moves closer to the target region, the speed converges to zero. The speed term also allows the front to propagate in either direction, based on the sign of the function. The propagating front will expand if $I$ falls within the range from $T - \epsilon$ to $T + \epsilon$; otherwise it will contract. The function $D(I)$ is easily applied in our LBM computation as a body force $F$ in Equation 11. $D(I)$ can also be derived from the gradient defining the object boundary or from other user-specified rules.

Level Set Curvature Computation. As mentioned earlier, the LBM scheme inherently contains properties that model curvature during the collision and streaming process. The benefit of the LBM method is that the curvature does not need to be computed explicitly, because it is hidden in the microscopic LBM collision procedure. In the LBM-solved diffusion Equation 9, we substitute the fluid density $\rho$ by the distance value $\phi$. Then, by applying $|\nabla \phi| = 1$ for a distance field, we get:

$$\frac{\partial \phi}{\partial t} = \gamma \left(\nabla \cdot \frac{\nabla \phi}{|\nabla \phi|}\right) |\nabla \phi| = \gamma \kappa |\nabla \phi|, \tag{15}$$

where $\kappa$ represents the mean curvature:

$$\kappa = \nabla \cdot \left(\frac{\nabla \phi}{|\nabla \phi|}\right). \tag{16}$$

In summary, the modified LBM implicitly provides the curvature-based smoothing effects, in contrast to upwinding difference methods that need to explicitly compute curvature components.

Fig. 1. (a) Data decomposition to cluster nodes for LBM computations. (b) Ghost layers used to transfer boundary data between neighboring nodes in the network.

5 Cluster Computing

Our method can easily be implemented on a single GPU, but to handle large data sets, we extend the algorithm to multiple GPUs organized in a cluster environment. Our cluster runs Linux and consists of seventeen nodes, each having a dual-core or quad-core AMD Opteron processor and an NVIDIA 8800 GTX graphics card with 768 MB of memory. The 3D volume data set is divided into 16 blocks and sent to the 16 worker nodes, which are organized as 4 × 4. One master node is used for managing the initial data division, collecting and assembling results, and visualization. The master node assembles the separate results with correct coordinate transformation and indexing, and uses a Marching Cubes method to render the segmented features of the distance field. Figure 1(a) shows this cluster configuration and data distribution. The data distribution is accelerated by using OpenMP to distribute the data in parallel.

Table 1. Performance report: average per-step time (in seconds) for the LBM computation and the ghost layer communication, the memory size (in MB) per node, and the total ghost layer data size on the GPU cluster with a 4 × 4 configuration.
Columns: Model; Size; LBM Speed Per Step; Ghost Layer Transfer Per Step; Total Speed Per Step; GPU Mem. Size Per Node; Ghost Layer Data Size.
Models: CT Head, MRI Head, Abdomen, Colon, Aneurism, Porche, Bonsai.

Between consecutive LBM steps, the worker nodes must share the LBM data (i.e., the $f_i$ values) residing on the boundaries between each pair of neighboring blocks. We apply a ghost layer method to handle this problem. Each data block contains an extra layer of data, the ghost layer, to communicate with each of its neighbors, as shown in Fig. 1(b). For example, a node A performs computation on data layers $A_1$ to $A_n$. After each step, the data in $A_1$ is transferred to the ghost layer $B_{n+1}$ of its neighbor node B. Meanwhile, B's layer $B_n$ is transferred to the ghost layer $A_0$ of A. In the next step, A uses $A_0$ to perform the streaming operation, and B uses $B_{n+1}$ likewise. The data transfer only involves the boundary layers, a very small amount of data compared with the total data size. With the InfiniBand network on our cluster, data can be transferred at a speed on the order of gigabits per second.

6 Results and Performance

We ran several volumetric data sets with various data sizes on our GPU cluster, where the computation was completely GPU-based. A 3D Aneurism dataset is used in Fig. 2 to show the results.
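As an aside on the ghost layer exchange described in Section 5, the index bookkeeping between two neighboring blocks can be sketched in a single process. The sketch makes several assumptions for illustration: the function names are hypothetical, the volume is split only along axis 0 into two blocks, and plain array assignment stands in for the GPU readback, network transfer, and GPU write that a real cluster would perform.

```python
import numpy as np


def split_with_ghosts(volume):
    """Split a volume in two along axis 0 and pad each half with ghost layers.

    Each returned block has n + 2 layers: index 0 and index n + 1 are ghost
    layers, and indices 1..n hold the data the node actually owns.
    """
    n = volume.shape[0] // 2
    pad = np.zeros((1,) + volume.shape[1:], dtype=volume.dtype)
    block_b = np.concatenate([pad, volume[:n], pad])   # lower half, node B
    block_a = np.concatenate([pad, volume[n:], pad])   # upper half, node A
    return block_a, block_b


def exchange_ghost_layers(block_a, block_b):
    """Copy boundary layers into the neighbour's ghost layer, in place."""
    block_b[-1] = block_a[1]    # A_1 -> B_{n+1}
    block_a[0] = block_b[-2]    # B_n -> A_0


# Usage: each layer of this toy volume stores its own global layer index.
volume = np.arange(8.0).reshape(8, 1, 1) * np.ones((8, 2, 2))
block_a, block_b = split_with_ghosts(volume)
exchange_ghost_layers(block_a, block_b)
```

After the exchange, each block sees the neighbor values it needs for the next streaming step, while only one layer per shared face has moved, which is why the transferred data stays small relative to the block size.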
The image sequence demonstrates the level set propagation at different steps. A Marching Cubes method is used to generate a triangle mesh of the zero level set. The total number of steps needed for the level set to reach the final result is determined by the position and shape of the initial level set (we use a simple sphere in our examples). Fig. 4 shows another example using a CT abdomen data set, and Fig. 3 uses a bonsai volume.

Table 1 outlines the performance results for several datasets. The average time per step is composed of two parts: LBM computation and ghost layer handling. We also report the GPU memory consumption on each node, and the total size of all ghost layers, which determines the network traffic. The table shows that our method achieves very good performance when segmenting very large data. For the largest data set, Bonsai, it averages 6.81 seconds per step. The segmentation of the Bonsai completes in 45 steps, leading to a total processing time of around 306 seconds. The segmentation usually takes tens of steps in total for large data. With an average per-step time of a few seconds, the whole process can generally be accomplished in tens to hundreds of seconds, depending on the data set and the initial distance field.

Fig. 2. Results of segmenting an Aneurism dataset with a target iso-value of 32. Data size is γ = 1.5. (a) Level set propagates after 3 steps; (b) After 10 steps; (c) After 25 steps; (d) After 50 steps.

Fig. 3. Results of segmenting a volumetric Bonsai data set with a target iso-value of 20. γ = Data size is (a) Initial level set as a sphere; (b) After 25 steps; (c) After 45 steps; (d) Direct rendering of isosurface with density values of 20.

Fig. 4. Results of segmenting a 3D CT abdomen dataset with a target iso-value of 62. γ = Data size is (a) Segmentation result (distance field) after 25 steps; (b) Direct rendering of isosurface with density values of 62.

In detail, the LBM level set computation is very fast even for a large data set. It takes 1.57 seconds for the Bonsai data, which runs on one volume per node due to our data division scheme. The ghost layer handling is slower; it includes (1) data readback from the GPUs, (2) network transfer, and (3) data write to the GPUs. For the Bonsai data, it costs 5.24 seconds. The total ghost layer data size (on all the nodes) reaches 101.1 MB. Although this data size does not pose a challenge for the InfiniBand network, the GPU readback, a known bottleneck of GPU computing, may consume more time than the LBM computation. We plan to improve performance with further optimization of the ghost layer processing on faster GPUs and a new cluster configuration.

7 Conclusion

Common segmentation techniques such as isovalue thresholding are not adequate to handle complex 3D images generated by medical or other scanning devices. It proves necessary to implement advanced techniques that can give clearer segmentation results by solving level set equations. Popular level set approaches on a single GPU are not easily extended to the large volume data sets that are prevalent in practical applications. We have proposed an inherently parallel method to solve the segmentation problem flexibly and efficiently on single and multiple GPUs. Based on an extended LBM method, our method serves as a good segmentation tool with easy implementation, implicit curvature handling, and thus controllable smoothness of the segmented data.
With its parallel scheme, only minimal data processing is required to implement the method on a GPU cluster, compared with the previous single-GPU approaches. We have reported good performance for multiple data sets on the cluster. In summary, our scheme provides a viable solution for large-scale 3D image segmentation by adopting distributed computing technology. It has great potential to be applied in various applications. In the future, we will work on combining parallel visualization techniques with the segmentation, to further augment the ability of this method.

Acknowledgement

This work is partially supported by NSF grant IIS and Kent State Research Council. The cluster is funded by the daytaOhio Wright Center of Innovation by the Ohio Department of Development. Please find the color version of this paper at the author's homepage.

References

1. Lefohn, A., Cates, J., Whitaker, R.: Interactive, GPU-based level sets for 3D brain tumor segmentation. In: Medical Image Computing and Computer Assisted Intervention, MICCAI, pp (2003)
2. Cates, J.E., Lefohn, A.E., Whitaker, R.T.: GIST: An interactive, GPU-based level-set segmentation tool for 3D medical images. Medical Image Analysis 10, (2004)
3. Sethian, J.: Level set methods and fast marching methods: Evolving interfaces in computational geometry, fluid mechanics, computer vision, and materials science (1999)
4. Malladi, R., Sethian, J.A., Vemuri, B.C.: Shape modeling with front propagation: A level set approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 17, (1995)
5. Klar, O.: Interactive GPU based segmentation of large medical volume data with level sets. Diploma Thesis, VRVis and University Koblenz-Landau (2006)
6. Rumpf, M., Strzodka, R.: Level set segmentation in graphics hardware. In: Proceedings of IEEE International Conference on Image Processing (ICIP 2001), vol. 3, pp (2001)
7. Zhao, Y., Kaufman, A., Mueller, K., Thuerey, N., Rüde, U., Iglberger, K.: Interactive lattice-based flow simulation and visualization. In: Tutorial, IEEE Visualization Conference (2008)
8. Jawerth, B., Lin, P., Sinzinger, E.: Lattice Boltzmann models for anisotropic diffusion of images. Journal of Mathematical Imaging and Vision 11, (1999)
9. Zhao, Y.: Lattice Boltzmann based PDE solver on the GPU. Visual Computer, (2008)
10. Tölke, J.: Implementation of a lattice Boltzmann kernel using the compute unified device architecture developed by NVIDIA. Computing and Visualization in Science (2008)
11. Fan, Z., Qiu, F., Kaufman, A.E.: Zippy: A framework for computation and visualization on a GPU cluster. Computer Graphics Forum 27(2), (2008)
12. Succi, S.: Numerical Mathematics and Scientific Computation. In: The Lattice Boltzmann Equation for Fluid Dynamics and Beyond. Oxford University Press, Oxford (2001)
13. He, X., Luo, L.: Lattice Boltzmann model for the incompressible Navier-Stokes equation. Journal of Statistical Physics 88(3/4), (1997)
14. Lefohn, A.E., Kniss, J.M., Hansen, C.D., Whitaker, R.T.: A streaming narrow-band algorithm: Interactive computation and visualization of level sets. IEEE Transactions on Visualization and Computer Graphics 10(4), (2004)

Interactive Level-Set Segmentation on the GPU

Interactive Level-Set Segmentation on the GPU Interactive Level-Set Segmentation on the GPU Problem Statement Goal Interactive system for deformable surface manipulation Level-sets Challenges Deformation is slow Deformation is hard to control Solution

More information

Interactive Level-Set Deformation On the GPU

Interactive Level-Set Deformation On the GPU Interactive Level-Set Deformation On the GPU Institute for Data Analysis and Visualization University of California, Davis Problem Statement Goal Interactive system for deformable surface manipulation

More information

LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR

LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR LBM BASED FLOW SIMULATION USING GPU COMPUTING PROCESSOR Frédéric Kuznik, frederic.kuznik@insa lyon.fr 1 Framework Introduction Hardware architecture CUDA overview Implementation details A simple case:

More information

Level Set Evolution Without Re-initialization: A New Variational Formulation

Level Set Evolution Without Re-initialization: A New Variational Formulation Level Set Evolution Without Re-initialization: A New Variational Formulation Chunming Li 1, Chenyang Xu 2, Changfeng Gui 3, and Martin D. Fox 1 1 Department of Electrical and 2 Department of Imaging 3

More information

ultra fast SOM using CUDA

ultra fast SOM using CUDA ultra fast SOM using CUDA SOM (Self-Organizing Map) is one of the most popular artificial neural network algorithms in the unsupervised learning category. Sijo Mathew Preetha Joy Sibi Rajendra Manoj A

More information

ME6130 An introduction to CFD 1-1

ME6130 An introduction to CFD 1-1 ME6130 An introduction to CFD 1-1 What is CFD? Computational fluid dynamics (CFD) is the science of predicting fluid flow, heat and mass transfer, chemical reactions, and related phenomena by solving numerically

More information

HPC enabling of OpenFOAM R for CFD applications

HPC enabling of OpenFOAM R for CFD applications HPC enabling of OpenFOAM R for CFD applications Towards the exascale: OpenFOAM perspective Ivan Spisso 25-27 March 2015, Casalecchio di Reno, BOLOGNA. SuperComputing Applications and Innovation Department,

More information

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions

Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions Module 4: Beyond Static Scalar Fields Dynamic Volume Computation and Visualization on the GPU Visualization and Computer Graphics Group University of California, Davis Overview Motivation and applications

More information

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute andek@vtc.vt.

Medical Image Processing on the GPU. Past, Present and Future. Anders Eklund, PhD Virginia Tech Carilion Research Institute andek@vtc.vt. Medical Image Processing on the GPU Past, Present and Future Anders Eklund, PhD Virginia Tech Carilion Research Institute andek@vtc.vt.edu Outline Motivation why do we need GPUs? Past - how was GPU programming

More information

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms Björn Rocker Hamburg, June 17th 2010 Engineering Mathematics and Computing Lab (EMCL) KIT University of the State

More information

Stream Processing on GPUs Using Distributed Multimedia Middleware

Stream Processing on GPUs Using Distributed Multimedia Middleware Stream Processing on GPUs Using Distributed Multimedia Middleware Michael Repplinger 1,2, and Philipp Slusallek 1,2 1 Computer Graphics Lab, Saarland University, Saarbrücken, Germany 2 German Research

More information

Parallel Large-Scale Visualization

Parallel Large-Scale Visualization Parallel Large-Scale Visualization Aaron Birkland Cornell Center for Advanced Computing Data Analysis on Ranger January 2012 Parallel Visualization Why? Performance Processing may be too slow on one CPU

More information

Level Set Framework, Signed Distance Function, and Various Tools

Level Set Framework, Signed Distance Function, and Various Tools Level Set Framework Geometry and Calculus Tools Level Set Framework,, and Various Tools Spencer Department of Mathematics Brigham Young University Image Processing Seminar (Week 3), 2010 Level Set Framework

More information

CFD SIMULATION OF SDHW STORAGE TANK WITH AND WITHOUT HEATER

CFD SIMULATION OF SDHW STORAGE TANK WITH AND WITHOUT HEATER International Journal of Advancements in Research & Technology, Volume 1, Issue2, July-2012 1 CFD SIMULATION OF SDHW STORAGE TANK WITH AND WITHOUT HEATER ABSTRACT (1) Mr. Mainak Bhaumik M.E. (Thermal Engg.)

More information

PyFR: Bringing Next Generation Computational Fluid Dynamics to GPU Platforms

PyFR: Bringing Next Generation Computational Fluid Dynamics to GPU Platforms PyFR: Bringing Next Generation Computational Fluid Dynamics to GPU Platforms P. E. Vincent! Department of Aeronautics Imperial College London! 25 th March 2014 Overview Motivation Flux Reconstruction Many-Core

More information

TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW

TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW Rajesh Khatri 1, 1 M.Tech Scholar, Department of Mechanical Engineering, S.A.T.I., vidisha

More information
