Load Balancing Techniques



Lecture Outline

The following topics will be discussed:
- Static load balancing
- Dynamic load balancing
- Mapping for load balancing
- Minimizing interaction

Objectives

- Balance the amount of computation assigned to each processor so that no processor sits idle while others are executing tasks.
- Minimize the interaction among processors, so that they spend most of their time doing useful work.
- Note the tension between these goals: to balance the load, it may at times be necessary to assign tasks that interact heavily to different processors.

General Classes of Problems

1. All tasks are available at the beginning of the computation, but the amount of time required by each task differs.
2. Tasks are available at the beginning, but the amount of time required by each task changes as the computation progresses.
3. Tasks are not available at the beginning; they are generated dynamically.

Static load balancing

- Distribute the work among processors prior to execution of the algorithm.
- Example: matrix-matrix computation.
- Easy to design and implement.

Dynamic load balancing

- Distribute the work among processors during execution of the algorithm.
- Algorithms that require dynamic load balancing are somewhat more complicated (examples: parallel graph partitioning, adaptive finite-element computations).

Some potential static load-balancing techniques (applied before any process executes):

- Round robin: passes out tasks in sequential order of processes, coming back to the first process once every process has been given a task.
- Randomized: selects processes at random to take tasks.
- Recursive bisection: recursively divides the problem into subproblems of equal computational effort while minimizing message passing.
- Simulated annealing: an optimization technique.
- Genetic algorithms: another optimization technique.

Flaws of static load balancing

Because the load is balanced prior to execution, static load balancing has several fundamental flaws even when a mathematical solution exists:

- It is very difficult to estimate accurately the execution times of the various parts of a program without actually executing them.
- Communication delays vary under different circumstances.
- Some problems require an indeterminate number of steps to reach their solution.

General Strategy for Load Balancing

Static Load Balancing: Array Distribution Schemes

- One-dimensional (strip) distribution
- Block distribution of a matrix (checkerboard)
- Block-cyclic distribution
- Randomized block distribution

Using Strip Data Decomposition

When strip data decomposition is used to derive concurrency, a suitable decomposition of the data can itself be used to balance the load and minimize interactions.

Matrix-matrix addition

[Figure: matrices A, B, and C distributed by rows, with each row assigned to the same process P0 through P7 in all three matrices.]

Block distribution (dense matrix-matrix multiplication)

[Figure: (a) a block distribution and (b) a block-cyclic distribution of a matrix over a 4 x 4 grid of processes P0 through P15.]

Distribution (a) will lead to load imbalances for a sparse matrix in Gaussian elimination; the block-cyclic distribution shown in (b) is used instead to distribute the computations to processors.

Schemes for Dynamic Load Balancing

Dynamic partition:

- There are problems in which we cannot statically partition the work among the processors.
- For these problems, a static work partitioning is either impossible (e.g., the third class) or can potentially lead to serious load-imbalance problems (e.g., the first and second classes).
- The only way to develop efficient message-passing programs for these classes of problems is to allow dynamic load balancing.
- During execution of the program, work is dynamically transferred from processors that have a lot of work to those that have little or no work.

Dynamic load balancing can be classified as centralized or decentralized.

Centralized dynamic load balancing

- Tasks are handed out from a centralized location.
- Master-slave structure.

Decentralized dynamic load balancing

- Tasks are passed between arbitrary processes.
- A collection of worker processes operate upon the problem and interact among themselves, finally reporting to a single process.
- A worker process may receive tasks from other worker processes and may send tasks to other worker processes (to complete or pass on at their discretion).

Centralized Dynamic Load Balancing

- The master process(or) holds the collection of tasks to be performed.
- Tasks are sent to the slave processes.
- When a slave process completes one task, it requests another task from the master process.
- Terms used: work pool, replicated worker, processor farm.

Centralized work pool

Termination

The computation terminates when:
- the task queue is empty, and
- every process has made a request for another task without any new tasks being generated.

It is not sufficient to terminate merely when the task queue is empty: one or more processes may still be running, and a running process may generate new tasks for the queue.

Decentralized Dynamic Load Balancing: Distributed Work Pool

Process Selection

Algorithms for selecting a process:

- Round robin: process Pi requests tasks from process Px, where x is given by a counter that is incremented after each request, using modulo-n arithmetic (n processes), excluding x = i.
- Random polling: process Pi requests tasks from process Px, where x is selected randomly between 0 and n - 1 (excluding i).

Load Balancing Using a Line Structure

- The master process (P0) feeds the queue with tasks at one end, and tasks are shifted down the queue.
- When a process Pi (1 <= i < n) detects a task at its input from the queue and is idle, it takes the task from the queue. Tasks then shuffle down the queue to the right so that the space held by the taken task is filled.
- A new task is inserted into the left end of the queue.
- Eventually, all processes have a task and the queue is filled with new tasks.
- High-priority or larger tasks could be placed in the queue first.

Code Using Time Sharing Between Communication and Computation

Master process (P0):

    for (i = 0; i < no_tasks; i++) {
        recv(P1, request_tag);          /* request for task */
        send(&task, P1, task_tag);      /* send task into queue */
    }
    recv(P1, request_tag);              /* request for task */
    send(&empty, P1, task_tag);         /* in case of end of tasks */

Process Pi (1 <= i < n):

    if (buffer == empty) {
        send(Pi-1, request_tag);        /* request new task */
        recv(&buffer, Pi-1, task_tag);  /* task from left process */
    }
    if ((buffer == full) && (!busy)) {  /* get next task */
        task = buffer;                  /* get task */
        buffer = empty;                 /* set buffer empty */
        busy = TRUE;                    /* set process busy */
    }
    nrecv(Pi+1, request_tag, request);  /* check for msg from right */
    if (request && (buffer == full)) {
        send(&buffer, Pi+1);            /* shift task forward */
        buffer = empty;
    }
    if (busy) {                         /* continue on current task */
        Do some work on task.
        If task finished, set busy to false.
    }

The nonblocking nrecv() is necessary to check for a request being received from the right.

Load balancing using a tree

Tasks are passed from a node into one of the two nodes below it when that node's buffer is empty.

General Techniques for Choosing the Right Parallel Algorithm

- Maximize data locality
- Minimize the volume of data exchanged
- Minimize the frequency of interactions
- Overlap computations with interactions
- Decide on data replication
- Minimize contention
- Use highly optimized collective interaction operations
- Use collective data transfers and computations
- Maximize concurrency

Decision tree to choose a mapping strategy

- Static number of tasks
  - Structured communication pattern
    - Roughly constant computation time per task: join tasks to minimize communication; create one task per processor.
    - Computation time per task varies: cyclically map tasks to processors for computational load balancing.
  - Unstructured communication pattern: use static load-balancing techniques.
- Dynamic number of tasks
  - Frequent communication between tasks: use dynamic load-balancing techniques.
  - Many short-lived tasks, no inter-task communication: use run-time task-scheduling algorithms.