Efficient Shallow Water Simulations on GPUs


1 Efficient Shallow Water Simulations on GPUs
SIAM Conference on Mathematical & Computational Issues in the Geosciences, Long Beach, California, USA
André R. Brodtkorb, Ph.D., SINTEF ICT, Department of Applied Mathematics, Norway

2 Talk Outline
- Minisymposium
- Introduction and Motivation
- The Shallow Water Equations
- Graphics Processing Units
- Shallow Water Simulations on GPUs
- Implementing and adapting numerical schemes
- Accuracy of GPUs
- Verification and validation
- Performance
- Summary

3 Acknowledgements
Martin L. Sætra, Knut-Andreas Lie, Trond R. Hagen, Jostein R. Natvig, Mustafa Altinakar, Yan Ding, Jaswant Singh

4 The Shallow Water Equations
- First described by de Saint-Venant ( )
- Gravity-induced fluid motion
- 2D free surface
- Governing flow is horizontal
- Conservation of mass and momentum
- Not only for water:
  - Simplification of atmospheric flow
  - Avalanches
  - ...
Water image from / Ian Britton

5 Target Application Areas
- Tsunamis
  - 2011: Japan (5321+)
  - 2004: Indian Ocean ( )
- Floods
  - 2010: Pakistan (2000+)
  - 1931: China floods ( )
- Storm Surges
  - 2005: Hurricane Katrina (1836)
  - 1530: Netherlands ( )
- Dam breaks
  - 1975: Banqiao Dam ( )
  - 1959: Malpasset (423)
Images from wikipedia.org

6 Why GPUs?
Proposition: A GPU is faster than a CPU
- We can get higher quality results in the same timeframe
In preparation for events:
- Evaluate more scenarios
- Creation of inundation maps
- Creation of Emergency Action Plans
In response to ongoing events:
- Simulate possible scenarios in real-time
- Determine who to evacuate based on simulation, not guesswork
Inundation map from Los Angeles County Tsunami Inundation Maps

7 Do we need more speed?
- Many existing dam break inundation maps are based on 1D simulations
  - Approximate valleys using 1D cross sections
  - Results depend heavily on the skills of the individual engineer
  - Assumptions only hold for valleys
- Many dams and levees even lack emergency action plans!
  - In US: dams without plans miles of levee systems
- Simulation using GPUs enables high quality 2D simulations
See also M. Altinakar, P. Rhodes, Faster-than-Real-Time Operational Flood Simulation using GPGPU Programming

8 2011 Japan Tsunami
- Tsunami warnings must be issued in minutes
  - Huge computational domains
  - Rapid wave propagation
  - Uncertainties regarding the tsunami cause
- Warnings must be accurate
  - Wrongful warnings are dangerous!
- GPUs can be used to increase the quality of warnings
Images from US Navy (top), NASA (left), NOAA (right)

9 The Graphics Processing Unit (GPU)
                     CPU    GPU
Cores                4      16
Float ops / clock
Frequency (MHz)
GigaFLOPS
Memory (GiB)
Charts: Performance; Memory Bandwidth

10 GPU Programming: From Abuse to Industrial Use
Timeline: ~2000 Graphics APIs; ~2005 Various Abstractions; ~2010 Dedicated C-based languages
Technologies shown: OpenCL, DirectX, DirectCompute, BrookGPU, AMD Brook+, AMD CTM / CAL, NVIDIA CUDA

11 Shallow Water Simulations on GPUs

12 The Shallow Water Equations (SWE)
Equation terms: vector of conserved variables, flux functions, bed slope source term, bed friction source term
Numerical simulation of the SWE:
- Hyperbolic partial differential equation
  - Enables explicit schemes
- Solutions form discontinuities / shocks
  - Require high accuracy in smooth parts without oscillations near discontinuities
- Solutions include dry areas
  - Negative water depths ruin simulations
- Requirements to accuracy
  - Order of spatial/temporal discretization
  - Floating point rounding errors
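The equation itself did not survive the transcription. For orientation, the standard 2D conservative form with the four terms named on the slide can be written as below. The notation is an assumption: h is water depth, hu and hv are discharges, B is the bed elevation, g is gravity, and the friction term is shown in a Chézy-type form with coefficient C_z, which may differ from the friction law on the original slide.

```latex
\begin{equation}
\underbrace{\begin{bmatrix} h \\ hu \\ hv \end{bmatrix}_t}_{\text{conserved variables}}
+ \underbrace{\begin{bmatrix} hu \\ hu^2 + \tfrac{1}{2} g h^2 \\ huv \end{bmatrix}_x
+ \begin{bmatrix} hv \\ huv \\ hv^2 + \tfrac{1}{2} g h^2 \end{bmatrix}_y}_{\text{flux functions}}
= \underbrace{\begin{bmatrix} 0 \\ -g h B_x \\ -g h B_y \end{bmatrix}}_{\text{bed slope}}
+ \underbrace{\begin{bmatrix} 0 \\ -g u \sqrt{u^2 + v^2} / C_z^2 \\ -g v \sqrt{u^2 + v^2} / C_z^2 \end{bmatrix}}_{\text{bed friction (assumed form)}}
\end{equation}
```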

13 The Finite Volume Scheme of Choice*
Scheme of choice: A. Kurganov and G. Petrova, A Second-Order Well-Balanced Positivity Preserving Central-Upwind Scheme for the Saint-Venant System, Communications in Mathematical Sciences, 5 (2007)
- Second order accurate fluxes
- Total Variation Diminishing
- Well-balanced (captures lake-at-rest)
- Good (but not perfect) match with GPU execution model
* With all possible disclaimers

14 Kurganov-Petrova Spatial Discretization
Equation terms: vector of conserved variables, flux functions, bed slope source term, bed friction source term
Figure panels: Continuous variables; Discrete variables; Slope reconstruction; Flux calculation; Evaluate integration points; Dry states fix
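The slope reconstruction and flux calculation steps can be sketched in a few lines. This is a simplified 1D, flat-bed illustration and not the talk's GPU code: the function names, the limiter parameter theta = 1.3, and the plain dry-state guard are assumptions. The full Kurganov-Petrova scheme also reconstructs the water surface elevation and desingularizes velocities near dry cells, which is omitted here.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def minmod(a, b, c):
    """Return the argument of smallest magnitude if all share a sign, else 0."""
    if a > 0 and b > 0 and c > 0:
        return min(a, b, c)
    if a < 0 and b < 0 and c < 0:
        return max(a, b, c)
    return 0.0


def limited_slope(q_left, q_center, q_right, theta=1.3):
    """Generalized minmod slope (times the cell width) for one cell."""
    return minmod(theta * (q_center - q_left),
                  0.5 * (q_right - q_left),
                  theta * (q_right - q_center))


def physical_flux(h, hu):
    """1D shallow water flux F(Q) for Q = (h, hu)."""
    u = hu / h if h > 0.0 else 0.0
    return (hu, hu * u + 0.5 * G * h * h)


def central_upwind_flux(left, right):
    """Central-upwind flux at an interface from left/right states (h, hu)."""
    (h_l, hu_l), (h_r, hu_r) = left, right
    u_l = hu_l / h_l if h_l > 0.0 else 0.0
    u_r = hu_r / h_r if h_r > 0.0 else 0.0
    c_l, c_r = math.sqrt(G * h_l), math.sqrt(G * h_r)
    # One-sided local propagation speeds
    a_plus = max(u_l + c_l, u_r + c_r, 0.0)
    a_minus = min(u_l - c_l, u_r - c_r, 0.0)
    if a_plus - a_minus == 0.0:
        return (0.0, 0.0)  # both sides dry and at rest: nothing moves
    inv = 1.0 / (a_plus - a_minus)
    f_l, f_r = physical_flux(h_l, hu_l), physical_flux(h_r, hu_r)
    return tuple((a_plus * fl - a_minus * fr) * inv
                 + a_plus * a_minus * inv * (qr - ql)
                 for fl, fr, ql, qr in zip(f_l, f_r, left, right))
```

When both interface states are equal, the central-upwind flux reduces to the physical flux, and for still water of constant depth the mass flux is zero.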

15 Temporal Discretization

16 Putting it Together: A Simulation Cycle
1. Calculate fluxes
2. Calculate Dt
3. Halfstep
4. Calculate fluxes
5. Evolve in time
6. Apply boundary conditions
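The six steps can be read as one step of a two-stage Runge-Kutta integrator. The sketch below assumes a second-order strong-stability-preserving (Heun) method, since the formulas on the temporal discretization slide were lost. The callback names compute_rhs, compute_dt and apply_bc are hypothetical stand-ins for the flux, time-step and boundary kernels.

```python
def simulation_cycle(q, compute_rhs, compute_dt, apply_bc):
    """Advance the cell values q by one time step; returns (new_q, dt).

    compute_rhs(q) -> list of net flux/source contributions per cell
    compute_dt(q)  -> stable time step size
    apply_bc(q)    -> q with boundary (ghost) values filled in
    """
    r = compute_rhs(q)                                # 1. calculate fluxes
    dt = compute_dt(q)                                # 2. calculate Dt
    q_star = [qi + dt * ri for qi, ri in zip(q, r)]   # 3. half step (first substep)
    r_star = compute_rhs(q_star)                      # 4. calculate fluxes
    q_new = [0.5 * qi + 0.5 * (qs + dt * rs)          # 5. evolve in time
             for qi, qs, rs in zip(q, q_star, r_star)]
    return apply_bc(q_new), dt                        # 6. apply boundary conditions
```

For a quick check, a decay problem dq/dt = -q with a fixed step of 0.1 maps q = 1 to 1 - 0.1 + 0.1²/2 = 0.905, the expected second-order result.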

17 Mapping to the GPU
- Flux kernel: 87% of runtime; nine-point stencil operation
- Time-step size: ~1% of runtime; simple parallel reduction
- Time integration: 12% of runtime; solve the time ODE for each cell
- Boundary conditions: ~1% of runtime; fill in ghost cell values
Want a minimum number of kernels (GPU programs)
Want each kernel to be massively parallel
Four kernels is the best we can do whilst still obeying dependencies
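The time-step kernel amounts to a maximum reduction over per-cell wave speeds followed by a CFL scaling. A minimal CPU sketch is given below, assuming a CFL number of 0.25 and flat per-cell lists for h, u and v. The pairwise loop mimics the access pattern of a parallel tree reduction but runs sequentially, and the function names are illustrative.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def tree_reduce_max(values):
    """Maximum via pairwise halving, the pattern used by parallel reductions."""
    vals = list(values)
    if not vals:
        raise ValueError("cannot reduce an empty sequence")
    while len(vals) > 1:
        half = (len(vals) + 1) // 2
        # "Thread" i combines element i with element i + half, if it exists
        vals = [max(vals[i], vals[i + half]) if i + half < len(vals) else vals[i]
                for i in range(half)]
    return vals[0]


def cfl_time_step(h, u, v, dx, dy, cfl=0.25):
    """Largest stable time step from the fastest wave speed in each direction."""
    speed_x = tree_reduce_max(abs(ui) + math.sqrt(G * hi) for hi, ui in zip(h, u))
    speed_y = tree_reduce_max(abs(vi) + math.sqrt(G * hi) for hi, vi in zip(h, v))
    if speed_x == 0.0 and speed_y == 0.0:
        return math.inf  # completely dry and at rest: no restriction
    limits = []
    if speed_x > 0.0:
        limits.append(dx / speed_x)
    if speed_y > 0.0:
        limits.append(dy / speed_y)
    return cfl * min(limits)
```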

18 Domain Decomposition
- Traditional CUDA block decomposition
  - Each Streaming Multiprocessor of the GPU computes on a small 2D patch
  - Neighboring patches use overlap to exchange information
- Global ghost cells for boundary conditions
- Global ghost cells (and ghost cell expansion) used for multi-GPU simulations
- Many different optimization parameters: shared mem, thread occupancy, warp size, etc.
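The patch layout with overlap can be illustrated by computing, for each block, which interior cells it owns and which window of the ghost-padded array it must read. This is a sketch under assumptions: the halo width of two cells and the tuple layout are illustrative choices, not values taken from the slides.

```python
def decompose(nx, ny, block_x, block_y, halo=2):
    """Split an nx-by-ny interior domain into 2D patches.

    The global array is assumed padded with `halo` ghost cells on every side,
    so interior cell i sits at padded index i + halo. Each patch is a dict:
      "interior": (x0, x1, y0, y1), half-open range of interior cells it updates
      "read":     (x0, x1, y0, y1), half-open window of the padded array it loads
    """
    tiles = []
    for y0 in range(0, ny, block_y):
        for x0 in range(0, nx, block_x):
            x1 = min(x0 + block_x, nx)
            y1 = min(y0 + block_y, ny)
            tiles.append({
                "interior": (x0, x1, y0, y1),
                # Interior range shifted by +halo, then widened by halo per side
                "read": (x0, x1 + 2 * halo, y0, y1 + 2 * halo),
            })
    return tiles
```

With this layout, neighboring patches read windows that overlap by 2 * halo cells, and patches at the domain edge read the global ghost cells.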

19 Accuracy: Single Versus Double Precision
What is the relative error in mass conservation for single and double precision? What is the discrepancy between the two?
Three different test cases:
- Low water depth (wet only)
- High water depth (wet only)
- Synthetic terrain with dam break (wet-dry)
Conclusions:
- We have loss in conservation on the order of machine epsilon
- Single precision gives larger error than double
- Errors related to the wet-dry front are more than an order of magnitude larger
- For our application areas, single precision is sufficient
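A toy experiment shows what "loss in conservation on the order of machine epsilon" means. The sketch below is not one of the three test cases from the talk: it moves water between neighboring cells of a periodic 1D domain with made-up fluxes, so the total mass should stay constant, and measures the relative drift. Single precision is emulated by rounding through the struct module, which is an approximation of true float32 arithmetic.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_f64(x):
    """Identity: Python floats are already double precision."""
    return x


def relative_mass_error(n_cells, n_steps, rounder):
    """Relative drift in total mass after n_steps conservative updates."""
    h = [rounder(1.0 + 0.5 * math.sin(i)) for i in range(n_cells)]
    mass0 = math.fsum(h)  # exact sum of the stored initial depths
    for step in range(n_steps):
        # flux[i] is the amount moved from cell i to cell i + 1 (periodic)
        flux = [rounder(1e-3 * math.cos(0.1 * step + i)) for i in range(n_cells)]
        # Each cell loses flux[i] and gains flux[i - 1]; only the rounding of
        # the stored result breaks exact conservation
        h = [rounder(h[i] - flux[i] + flux[i - 1]) for i in range(n_cells)]
    return abs(math.fsum(h) - mass0) / mass0


err32 = relative_mass_error(64, 100, to_f32)
err64 = relative_mass_error(64, 100, to_f64)
```

Both errors stay within a small multiple of the respective machine epsilon, which is the behavior the conclusions on the slide describe for the wet-only cases.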

20 Verification: Parabolic basin
Single precision is sufficient, but do we solve the equations?
- Test against analytical 2D parabolic basin case (Thacker)
  - Planar water surface oscillates
  - 100 x 100 cells
  - Horizontal scale: 8 km
  - Vertical scale: 3.3 m
- Simulation and analytical solution match well
- But, as with most schemes, errors grow along the wet-dry interface

21 Validation: Barrage du Malpasset
We model the equations correctly, but can we model real events?
- South-east France near Fréjus: Barrage du Malpasset
  - Double curvature dam, 66.5 m high, 220 m crest length, 55 million m³
  - Bursts at 21:13 December 2nd 1959
  - Reaches Mediterranean in 30 minutes (speeds up to 70 km/h)
  - 423 casualties, $68 million in damages
- Validate against experimental data from 1:400 model
  - 1099 x 439 cells
  - 15 meter resolution
- Our results match experimental data very well
  - Discrepancies at gauges 14 and 9 present in most (all?) published results
Image from Google Earth, mes-ballades.com

22 Video

23 Bonus: MultiGPU Performance
- Single node with four GPUs
- Near-perfect weak and strong scaling on two generations of hardware (S1070, C2050)
- Domains of up to 350 million cells

24 Summary
- Simulation of the shallow water equations is important
  - Devastating forces: tsunamis, dam breaks, floods, storm surges
- The problem maps well to GPUs
  - Single precision is not an issue
  - Verification and validation addressed
  - Not a toy model any more
- GPUs enable more accurate results
  - Evaluate more scenarios
  - Simulate with higher resolution
  - Or do both!

25 Thank you for your attention
Contact: André R. Brodtkorb
Homepage:
Youtube:
SINTEF homepage:

26 References
- A. R. Brodtkorb, Scientific Computing on Heterogeneous Architectures, Ph.D. thesis, University of Oslo, ISSN , No. 1031
- A. R. Brodtkorb, T. R. Hagen, K.-A. Lie and J. R. Natvig, Simulation and Visualization of the Saint-Venant System using GPUs, Computing and Visualization in Science, special issue on Hot topics in Computational Engineering, 13(7), (2011), pp , DOI: /s x
- M. L. Sætra and A. R. Brodtkorb, Shallow Water Simulations on Multiple GPUs, Proceedings of the Para 2010 Conference, Lecture Notes in Computer Science, Springer
- A. R. Brodtkorb, M. L. Sætra, and M. Altinakar, Efficient Shallow Water Simulations on GPUs: Implementation, Visualization, Verification, and Validation, in review
Preprints and links to papers available on


More information

TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW

TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS OF FORCED CONVECTION FLOW AND HEAT TRANSFER IN A LAMINAR CHANNEL FLOW Rajesh Khatri 1, 1 M.Tech Scholar, Department of Mechanical Engineering, S.A.T.I., vidisha

More information

Optimizing Application Performance with CUDA Profiling Tools

Optimizing Application Performance with CUDA Profiling Tools Optimizing Application Performance with CUDA Profiling Tools Why Profile? Application Code GPU Compute-Intensive Functions Rest of Sequential CPU Code CPU 100 s of cores 10,000 s of threads Great memory

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

How to program efficient optimization algorithms on Graphics Processing Units - The Vehicle Routing Problem as a case study

How to program efficient optimization algorithms on Graphics Processing Units - The Vehicle Routing Problem as a case study How to program efficient optimization algorithms on Graphics Processing Units - The Vehicle Routing Problem as a case study Geir Hasle, Christian Schulz Department of, SINTEF ICT, Oslo, Norway Seminar

More information

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS Performance 1(6) HIGH PERFORMANCE CONSULTING COURSE OFFERINGS LEARN TO TAKE ADVANTAGE OF POWERFUL GPU BASED ACCELERATOR TECHNOLOGY TODAY 2006 2013 Nvidia GPUs Intel CPUs CONTENTS Acronyms and Terminology...

More information

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics

GPU Architectures. A CPU Perspective. Data Parallelism: What is it, and how to exploit it? Workload characteristics GPU Architectures A CPU Perspective Derek Hower AMD Research 5/21/2013 Goals Data Parallelism: What is it, and how to exploit it? Workload characteristics Execution Models / GPU Architectures MIMD (SPMD),

More information

Evaluation of CUDA Fortran for the CFD code Strukti

Evaluation of CUDA Fortran for the CFD code Strukti Evaluation of CUDA Fortran for the CFD code Strukti Practical term report from Stephan Soller High performance computing center Stuttgart 1 Stuttgart Media University 2 High performance computing center

More information

Performance Improvement of Application on the K computer

Performance Improvement of Application on the K computer Performance Improvement of Application on the K computer November 13, 2011 Kazuo Minami Team Leader, Application Development Team Research and Development Group Next-Generation Supercomputer R & D Center

More information

Turbomachinery CFD on many-core platforms experiences and strategies

Turbomachinery CFD on many-core platforms experiences and strategies Turbomachinery CFD on many-core platforms experiences and strategies Graham Pullan Whittle Laboratory, Department of Engineering, University of Cambridge MUSAF Colloquium, CERFACS, Toulouse September 27-29

More information

Case Study on Productivity and Performance of GPGPUs

Case Study on Productivity and Performance of GPGPUs Case Study on Productivity and Performance of GPGPUs Sandra Wienke wienke@rz.rwth-aachen.de ZKI Arbeitskreis Supercomputing April 2012 Rechen- und Kommunikationszentrum (RZ) RWTH GPU-Cluster 56 Nvidia

More information

Amplification of the Radiation from Two Collocated Cellular System Antennas by the Ground Wave of an AM Broadcast Station

Amplification of the Radiation from Two Collocated Cellular System Antennas by the Ground Wave of an AM Broadcast Station Amplification of the Radiation from Two Collocated Cellular System Antennas by the Ground Wave of an AM Broadcast Station Dr. Bill P. Curry EMSciTek Consulting Co., W101 McCarron Road Glen Ellyn, IL 60137,

More information

Accelerating CFD using OpenFOAM with GPUs

Accelerating CFD using OpenFOAM with GPUs Accelerating CFD using OpenFOAM with GPUs Authors: Saeed Iqbal and Kevin Tubbs The OpenFOAM CFD Toolbox is a free, open source CFD software package produced by OpenCFD Ltd. Its user base represents a wide

More information

L20: GPU Architecture and Models

L20: GPU Architecture and Models L20: GPU Architecture and Models scribe(s): Abdul Khalifa 20.1 Overview GPUs (Graphics Processing Units) are large parallel structure of processing cores capable of rendering graphics efficiently on displays.

More information

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip. Lecture 11: Multi-Core and GPU Multi-core computers Multithreading GPUs General Purpose GPUs Zebo Peng, IDA, LiTH 1 Multi-Core System Integration of multiple processor cores on a single chip. To provide

More information

HPC technology and future architecture

HPC technology and future architecture HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange benoit.lange@inria.fr Toàn Nguyên toan.nguyen@inria.fr

More information

Introduction to GPU Programming Languages

Introduction to GPU Programming Languages CSC 391/691: GPU Programming Fall 2011 Introduction to GPU Programming Languages Copyright 2011 Samuel S. Cho http://www.umiacs.umd.edu/ research/gpu/facilities.html Maryland CPU/GPU Cluster Infrastructure

More information

ALL GROUND-WATER HYDROLOGY WORK IS MODELING. A Model is a representation of a system.

ALL GROUND-WATER HYDROLOGY WORK IS MODELING. A Model is a representation of a system. ALL GROUND-WATER HYDROLOGY WORK IS MODELING A Model is a representation of a system. Modeling begins when one formulates a concept of a hydrologic system, continues with application of, for example, Darcy's

More information

Computer Graphics Hardware An Overview

Computer Graphics Hardware An Overview Computer Graphics Hardware An Overview Graphics System Monitor Input devices CPU/Memory GPU Raster Graphics System Raster: An array of picture elements Based on raster-scan TV technology The screen (and

More information

OpenCL Optimization. San Jose 10/2/2009 Peng Wang, NVIDIA

OpenCL Optimization. San Jose 10/2/2009 Peng Wang, NVIDIA OpenCL Optimization San Jose 10/2/2009 Peng Wang, NVIDIA Outline Overview The CUDA architecture Memory optimization Execution configuration optimization Instruction optimization Summary Overall Optimization

More information

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015 GPU Hardware and Programming Models Jeremy Appleyard, September 2015 A brief history of GPUs In this talk Hardware Overview Programming Models Ask questions at any point! 2 A Brief History of GPUs 3 Once

More information

Mathematical Modeling and Engineering Problem Solving

Mathematical Modeling and Engineering Problem Solving Mathematical Modeling and Engineering Problem Solving Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University Reference: 1. Applied Numerical Methods with

More information

NUMERICAL ANALYSIS OF OPEN CHANNEL STEADY GRADUALLY VARIED FLOW USING THE SIMPLIFIED SAINT-VENANT EQUATIONS

NUMERICAL ANALYSIS OF OPEN CHANNEL STEADY GRADUALLY VARIED FLOW USING THE SIMPLIFIED SAINT-VENANT EQUATIONS TASK QUARTERLY 15 No 3 4, 317 328 NUMERICAL ANALYSIS OF OPEN CHANNEL STEADY GRADUALLY VARIED FLOW USING THE SIMPLIFIED SAINT-VENANT EQUATIONS WOJCIECH ARTICHOWICZ Department of Hydraulic Engineering, Faculty

More information

N 1. (q k+1 q k ) 2 + α 3. k=0

N 1. (q k+1 q k ) 2 + α 3. k=0 Teoretisk Fysik Hand-in problem B, SI1142, Spring 2010 In 1955 Fermi, Pasta and Ulam 1 numerically studied a simple model for a one dimensional chain of non-linear oscillators to see how the energy distribution

More information

Sample Questions for the AP Physics 1 Exam

Sample Questions for the AP Physics 1 Exam Sample Questions for the AP Physics 1 Exam Sample Questions for the AP Physics 1 Exam Multiple-choice Questions Note: To simplify calculations, you may use g 5 10 m/s 2 in all problems. Directions: Each

More information

Doppler. Doppler. Doppler shift. Doppler Frequency. Doppler shift. Doppler shift. Chapter 19

Doppler. Doppler. Doppler shift. Doppler Frequency. Doppler shift. Doppler shift. Chapter 19 Doppler Doppler Chapter 19 A moving train with a trumpet player holding the same tone for a very long time travels from your left to your right. The tone changes relative the motion of you (receiver) and

More information

Parallel Computing with MATLAB

Parallel Computing with MATLAB Parallel Computing with MATLAB Scott Benway Senior Account Manager Jiro Doke, Ph.D. Senior Application Engineer 2013 The MathWorks, Inc. 1 Acceleration Strategies Applied in MATLAB Approach Options Best

More information

Grid adaptivity for systems of conservation laws

Grid adaptivity for systems of conservation laws Grid adaptivity for systems of conservation laws M. Semplice 1 G. Puppo 2 1 Dipartimento di Matematica Università di Torino 2 Dipartimento di Scienze Matematiche Politecnico di Torino Numerical Aspects

More information

Modelling of Flood Wave Propagation with Wet-dry Front by One-dimensional Diffusive Wave Equation

Modelling of Flood Wave Propagation with Wet-dry Front by One-dimensional Diffusive Wave Equation Archives of Hydro-Engineering and Environmental Mechanics Vol. 61 (2014), No. 3 4, pp. 111 125 DOI: 10.1515/heem-2015-0007 IBW PAN, ISSN 1231 3726 Modelling of Flood Wave Propagation with Wet-dry Front

More information