Multi-GPU Load Balancing for Simulation and Rendering

1 Multi-GPU Load Balancing for Simulation and Rendering
Yong Cao, Computer Science Department, Virginia Tech, USA

2 In-situ Visualization and Visual Analytics
Instant visualization and interaction of computing tasks
Applications:
- Computational Fluid Dynamics
- Seismic Propagation
- Molecular Dynamics
- Network Security Analysis

5 Generalized Execution Loop
Execution: Simulation, then Rendering
Memory: Simulation writes data; Rendering reads data

6 Generalized Execution Loop
Execution: Task 1, then Task 2
Memory: Task 1 writes data; Task 2 reads data

7 Parallel Execution: Task Split
Problem: task (context) switch between T1 and T2 on each processor (Processor 1, Processor 2)
Memory: data write, data read
Disadvantages of context switch:
- Overhead of another kernel launch
- Flush of the cache lines
- Disallows persistent threads

8 Parallel Execution: Pipelining
Processor 1 runs Task 1 and Processor 2 runs Task 2 (time steps t, t+1); Task 1 writes data to memory and Task 2 reads it.
+ Simplified kernel for each GPU
+ Better shared memory and cache usage
+ Persistent threads for distributed scheduling

9 Parallel Execution: Pipelining
Problem: bubble in the pipeline. When Task 1 on Processor 1 and Task 2 on Processor 2 take unequal time per step (t, t+1), the faster processor sits idle waiting for the other.

10 Multi-GPU Pipeline Architecture
Multi-GPU array: simulation (Sim) GPUs write to a FIFO data buffer; rendering GPUs read from it.
Time steps 1, 2, ..., n: simulate (Sim), write (W), read (R).
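The write/read pattern around the FIFO data buffer can be sketched on the CPU with two threads and a bounded queue. This is only an illustrative analogy under stated assumptions: `run_pipeline`, the integer "state", and the string "frames" are hypothetical stand-ins, not the presentation's GPU implementation.

```python
import queue
import threading

def run_pipeline(num_steps, buffer_size=4):
    """Run a toy simulate -> FIFO buffer -> render pipeline; return rendered frames."""
    fifo = queue.Queue(maxsize=buffer_size)  # bounded FIFO data buffer
    rendered = []

    def simulate():
        state = 0
        for step in range(num_steps):
            state += 1                   # stand-in for one simulation time step
            fifo.put((step, state))      # write; blocks while the buffer is full
        fifo.put(None)                   # sentinel: no more frames will arrive

    def render():
        while True:
            item = fifo.get()            # read; blocks while the buffer is empty
            if item is None:
                break
            step, state = item
            rendered.append(f"frame {step}: state={state}")  # stand-in for rendering

    workers = [threading.Thread(target=simulate), threading.Thread(target=render)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return rendered
```

Because the queue is bounded, a slow renderer eventually stalls the simulator (full buffer) and a slow simulator starves the renderer (empty buffer), which is the imbalance the later slides address.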

11 Adaptive Load Balancing
Multi-GPU array: simulation (Sim) GPUs write to the FIFO data buffer; rendering GPUs read from it.
- Full buffer: shift toward rendering
- Empty buffer: shift toward simulation
Adaptive and distributed scheduling
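The full-buffer and empty-buffer rule can be written as a small decision function. This is a minimal sketch, assuming the buffer fullness is available as a fraction in [0, 1]; the function name `rebalance` and the thresholds `high` and `low` are hypothetical choices, not values from the presentation.

```python
def rebalance(sim_gpus, total_gpus, fullness, high=0.75, low=0.25):
    """Return the new number of simulation GPUs given buffer fullness in [0, 1]."""
    if fullness >= high and sim_gpus > 1:
        # Buffer nearly full: rendering is the bottleneck, move one GPU to rendering.
        return sim_gpus - 1
    if fullness <= low and sim_gpus < total_gpus - 1:
        # Buffer nearly empty: simulation is the bottleneck, move one GPU to simulation.
        return sim_gpus + 1
    return sim_gpus  # inside the dead band: keep the current split
```

The bounds keep at least one GPU on each task, and the gap between the two thresholds avoids reshuffling GPUs on every small fluctuation.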

12 Task Partition
- Intra-frame partition: all processors work on the same frame t
- Inter-frame partition: each processor works on a different frame (t, t+1, t+2, t+3)

13 Task Partition for Visual Simulation
- Simulation: intra-frame partition
- Rendering: inter-frame partition
Multi-GPU array: simulation (Sim) GPUs write to the FIFO data buffer; rendering GPUs read from it.
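The two partition schemes can be illustrated as index assignments. This is a sketch under assumptions: the function names, the contiguous slicing, and the round-robin frame order are hypothetical illustrations rather than the presentation's actual scheme.

```python
def intra_frame_partition(num_items, num_gpus):
    """Split the work items of ONE frame into contiguous slices, one per GPU."""
    base, extra = divmod(num_items, num_gpus)
    slices, start = [], 0
    for g in range(num_gpus):
        size = base + (1 if g < extra else 0)  # spread the remainder over the first GPUs
        slices.append(range(start, start + size))
        start += size
    return slices

def inter_frame_partition(first_frame, num_frames, num_gpus):
    """Assign WHOLE frames to GPUs round-robin: GPU g gets frames t+g, t+g+n, ..."""
    return {g: list(range(first_frame + g, first_frame + num_frames, num_gpus))
            for g in range(num_gpus)}
```

For example, `intra_frame_partition(10, 3)` splits ten bodies of a single time step across three GPUs, while `inter_frame_partition(0, 8, 4)` hands each of four GPUs two complete frames.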

14 Problem: Scheduling Algorithm
Performance model:
- n: the number of assigned GPUs
Schedule to optimize:
- M_i: the number of assigned simulation GPUs

15 Case Study
Application: N-body simulation with ray-traced rendering
Performance model parameters:
- Simulation: number of iterations (i), number of simulated bodies (p)
- Rendering: number of samples for supersampling (s)
Scheduling optimization: M_t = f(i_t, s_t, p_t)

16 Static Load Balancing
Assumption: the performance parameters do NOT change at run time, so M_t = f(i_t, s_t, p_t) reduces to M = f(i, s, p).
Data-driven modeling approach:
- Sample the 3-dimensional (i, s, p) space as a rigid grid
- Use trilinear interpolation to get the result for new inputs
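The grid lookup with trilinear interpolation can be sketched as follows. This is a minimal sketch, assuming the sampled results are stored in a nested list `table[a][b][c]` indexed along sorted axes for i, s, and p, each with at least two sample points; the function names and the clamping of out-of-range inputs are hypothetical choices. Rounding the interpolated value to a whole number of GPUs is left to the caller.

```python
from bisect import bisect_right

def _bracket(axis, x):
    """Return (lower grid index, fractional position) of x within a sorted axis."""
    x = min(max(x, axis[0]), axis[-1])                 # clamp to the sampled range
    k = min(bisect_right(axis, x) - 1, len(axis) - 2)  # index of the lower grid point
    return k, (x - axis[k]) / (axis[k + 1] - axis[k])

def trilinear(i_axis, s_axis, p_axis, table, i, s, p):
    """Interpolate table[a][b][c], sampled at (i_axis[a], s_axis[b], p_axis[c])."""
    a, u = _bracket(i_axis, i)
    b, v = _bracket(s_axis, s)
    c, w = _bracket(p_axis, p)
    result = 0.0
    # Weighted sum over the 8 corners of the grid cell that contains (i, s, p).
    for da, wu in ((0, 1 - u), (1, u)):
        for db, wv in ((0, 1 - v), (1, v)):
            for dc, ww in ((0, 1 - w), (1, w)):
                result += wu * wv * ww * table[a + da][b + db][c + dc]
    return result
```

At a grid point the function returns the stored sample unchanged; between grid points it blends the eight surrounding samples.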

17 Static Load Balancing: Results
[Figure: performance parameter sampling and load balancing; panels: 16 samples, 80 iterations; 4 samples, 80 iterations]

18 Dynamic Load Balancing
Assumption: the performance parameters change at run time.
Find an indirect load-balance indicator:
- Execution time of the previous time step. Problem: the performance difference between two time steps can be dramatic.
- The fullness of the buffer, F
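A toy closed-loop model shows how the buffer fullness F can drive the split when costs change at run time. Everything here is an assumption made for illustration: the function name, the linear throughput model (frames per step proportional to GPU count divided by cost), the buffer capacity, and the 0.25/0.75 thresholds are hypothetical and are not the presentation's scheduling algorithm.

```python
def simulate_dynamic_schedule(n, sim_cost, render_cost, steps, capacity=16.0):
    """Toy closed loop: return the history of M, the simulation GPU count."""
    m = n // 2                  # initial split between simulation and rendering
    level = capacity / 2        # frames currently held in the FIFO buffer
    history = []
    for t in range(steps):
        produced = m / sim_cost(t)            # frames simulated during this step
        consumed = (n - m) / render_cost(t)   # frames rendered during this step
        level = min(capacity, max(0.0, level + produced - consumed))
        fullness = level / capacity           # the indicator F
        if fullness > 0.75 and m > 1:
            m -= 1              # buffer filling up: give rendering another GPU
        elif fullness < 0.25 and m < n - 1:
            m += 1              # buffer draining: give simulation another GPU
        history.append(m)
    return history
```

Calling it with a rendering cost that jumps partway through, for example `simulate_dynamic_schedule(8, lambda t: 1.0, lambda t: 1.0 if t < 50 else 3.0, 200)`, keeps the split unchanged while the loads match and then moves GPUs toward rendering after the change. Because the buffer level accumulates over many steps, it reacts less abruptly than the execution time of a single previous step.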

19 Dynamic Load Balancing: Results
Stability of the dynamic scheduling algorithm
[Figure panels: no parameter change (only at the beginning); parameters change at the dotted line]

20 Comparison: Dynamic vs. Static Scheduling
[Figure: performance speedup over static load balancing; panels: 2000 particles, 4000 particles]

21 Conclusion
+ Pipelining
+ Dynamic load balancing
- Fine-granularity load balancing (SM level)
- Communication overhead
- Programmability: software framework, library

22 Question(s)
Contact information: Yong Cao, Computer Science Department, Virginia Tech
Website:

Multi-GPU Load Balancing for In-situ Visualization

R. Hagan and Y. Cao, Department of Computer Science, Virginia Tech, Blacksburg, VA, USA

Abstract: Real-time visualization is an important tool for immediately