Parallel CBIR implementations with load balancing algorithms


1 J. Parallel Distrib. Comput. 66 (2006) Parallel CBIR implementations with load balancing algorithms José L. Bosque a,, Oscar D. Robles a, Luis Pastor a, Angel Rodríguez b a Dpto. de Informática, Estadística y Telemática, U. Rey Juan Carlos, C. Tulipán, s/n, Móstoles, Madrid, Spain b Dept. de Tecnología Fotónica, UPM, Campus de Montegancedo s/n, Boadilla del Monte, Spain Received 22 December 2003; received in revised form 23 February 2005; accepted 7 April 2006 Available online 13 June 2006 Abstract The purpose of content-based information retrieval (CBIR) systems is to retrieve, from real data stored in a database, information that is relevant to a query. When large volumes of data are considered, as it is very often the case with databases dealing with multimedia data, it may become necessary to look for parallel solutions in order to store and gain access to the available items in an efficient way. Among the range of parallel options available nowadays, clusters stand out as flexible and cost effective solutions, although the fact that they are composed of a number of independent machines makes it easy for them to become heterogeneous. This paper describes a heterogeneous cluster-oriented CBIR implementation. First, the cluster solution is analyzed without load balancing, and then, a new load balancing algorithm for this version of the CBIR system is presented. The load balancing algorithm described here is dynamic, distributed, global and highly scalable. Nodes are monitored through a load index which allows the estimation of their total amount of workload, as well as the global system state. Load balancing operations between pairs of nodes take place whenever a node finishes its job, resulting in a receptor-triggered scheme which minimizes the system s communication overhead. Globally, the CBIR cluster implementation together with the load balancing algorithm can cope effectively with varying degrees of heterogeneity within the cluster; the experiments presented within the paper show the validity of the overall strategy. Together, the CBIR implementation and the load balancing algorithm described in this paper span a new path for performant, cost effective CBIR systems which has not been explored before in the technical literature Elsevier Inc. All rights reserved. Keywords: Parallel implementations; CBIR systems; Load balancing algorithms 1. Introduction The tremendous improvements experimented by computers in aspects such as price, processing power and mass storage capabilities have resulted in an explosion of the amount of information available to people. But this same wealth makes finding the best information a very hard task. CBIR 1 systems try to solve this problem by offering mechanisms for selecting the data items which resemble most a specific query among all the available information [12,34], although the complexity Corresponding author. addresses: joseluis.bosque@urjc.es (J.L. Bosque), oscardavid.robles@urjc.es (O.D. Robles), luis.pastor@urjc.es (L. Pastor), arodri@dtf.fi.upm.es (A. Rodríguez). 1 Content-based information retrieval. of this task depends heavily on the volume of data stored in the system. As usual, parallel solutions can be used to alleviate this problem, given the fact that the search operations present a large degree of data parallelism. Distributed solutions on clusters offer a good cost/performance ratio to solve this problem, given their excellent scalability, fault tolerance and flexibility attributes [37,6,4]. 
Also, this architecture allows concurrent access to disks, which are considered the main bottleneck in CBIR systems. Although homogeneous clusters could also be considered for this application, it is difficult to preserve the homogeneity of this type of system throughout its life cycle. Among the factors that affect their configuration stability we can mention the addition of new nodes or the substitution of faulty ones, technological evolution, and even exploitation aspects such as disk fragmentation. In consequence, clusters present additional challenges, since they

2 J.L. Bosque et al. / J. Parallel Distrib. Comput. 66 (2006) can easily become heterogeneous, requiring load distributions that take into consideration each node s computational features [33]. This way, one of the critical parameters to be fixed in order to keep the efficiency high for this architectures is the workload assigned to each of the cluster nodes. Even though load balancing has received a considerable amount of interest, it is still not definitely solved, particularly for heterogeneous systems [10,18,41,45]. Nevertheless, this problem is central for minimizing the applications response time and optimizing the exploitation of resources, avoiding overloading some processors while others are idling. This paper describes the architecture, implementation and performance achieved by a parallel CBIR system implemented on a heterogeneous cluster that includes load balancing. The flexibility of the architecture herein presented allows the dynamical addition or removal of nodes from the cluster between two user queries, achieving reconfigurability, scalability and an appreciable degree of fault tolerance. This approach allows a dynamic management of specific databases that can be incorporated to or removed from the CBIR system in function of the desired user query. The heterogeneity of the system is managed by a new dynamic and distributed load balancing algorithm, introducing a new load index that takes into account the computational nodes capabilities and a more accurate measure of their workload. The proposed method introduces a very small system overhead when departing from a reasonably balanced starting point. As mentioned before, the amount of data to be managed in CBIR systems is so huge nowadays that it is almost mandatory to use parallelism in order to achieve a reasonable user response times. Two alternatives were tested in a previous work: a shared-memory multiprocessor and a cluster [6]. Since the cluster implementation has given better results, it seems advisable to introduce load balancing strategies to improve the efficiency in heterogeneous clusters. The selected approach is based on a dynamic, distributed, global and highly scalable load balancing algorithm. An heterogeneous load index based on the number of running tasks and the computational power of each node is defined to determine the state of the nodes. The algorithm automatically turns itself off in global overloading or under-loading situations. Together, the CBIR implementation and the load balancing algorithm described in this paper open a new path for performant, cost effective CBIR systems which has not been explored before in the technical literature. The rest of this article is organized as follows: Section 2 presents an overview of parallel CBIR systems and load balancing algorithms. Section 3 presents an analysis of a sequential version of the CBIR algorithm and a brief description of its parallel implementation on a cluster (without load balancing). Section 4 describes the distributed load balancing algorithm applied to the parallel CBIR system and Section 5 details its implementation on a heterogeneous cluster. Section 6 shows the tests performed in order to measure the improvement achieved by the heterogeneous cluster version with load balancing and the results achieved. Finally, Section 7 presents the conclusions and ongoing work. 2. 
Previous work The technological development experimented during the last 20 years has turned into a spectacular increase in the volume of data managed by information systems. This fact has lead to the search for methods to automate the process of extracting structured information from these systems [12,31]. The potential importance of CBIR systems has been reflected in the variety of approaches taken while dealing with different aspects of CBIR systems. The multidisciplinary nature of this problem has often resulted in partial advances that have been integrated later on in new prototypes and commercial systems. For example, it is possible to find research work that takes into consideration man machine interaction issues [32]; the users behavior from a psychological modeling standpoint [27]; multidimensional indexing techniques [5]; multimedia database management system issues [19]; pattern recognition algorithms [17]; multimedia signal processing [39]; object representation and modeling techniques [21]; benchmarks for testing the performance of CBIR systems [16,24]; etc. In any case, most of the research effort for CBIR systems has been focused on the search for powerful representation techniques for discriminating elements among the global database. Although the data nature is a crucial factor to be taken into consideration, most often the final representation is a feature vector 2 extracted from the raw data, which reflects somehow its content. While dealing with 2D images, it is possible to find techniques using color, shape, or texture-based primitives. Other techniques use spatial relationships among the image components or a combination of the above-mentioned approaches. For higher-dimensionality input data, it is possible to find proposals dealing with 3D images or video sequences. Nowadays, one of the most promising research lines is to increase the abstraction level of the semantics associated to the primitives managed, representing high-level concepts derived from the images or the multimedia data. From the computational complexity point of view, CBIR systems are potentially expensive and have user response times growing with the ever-increasing sizes of the databases associated to them. One of the most common approaches followed to reach acceptable price/performance ratios has been to exploit the algorithms inherent parallelism at implementation time. However, the novelty of CBIR systems hinders finding references dealing with this aspect. Some contributions that can be cited are Zaki s compilation [43], and the contributions of Srakaew et al. [37] and Bosque et al. [6]. Another reason that has made difficult widespread parallel CBIR system development is that prototype analysis demands a manual image classification stage that limits in practice the number of images used in the tests. Nevertheless, the volume of data managed by current DBs, and obviously those with multimedia information, will demand parallel optimizations for commercial implementations of CBIR systems. In those cases, load balancing operations preventing the coexistence of idling and overloaded 2 Named signature or primitive.

3 1064 J.L. Bosque et al. / J. Parallel Distrib. Comput. 66 (2006) processors will be almost required, since total response times are usually considerably improved with the introduction of even simple load balancing approaches. Load balancing techniques can be classified according to different criteria [8]. First, algorithms can be labeled as static or dynamic. Static methods perform workload distribution at compilation time, not taking into consideration the system state variations. Dynamic methods are able to redistribute workload among nodes at run time, depending on changes in the system state. The work of Rajagopalan et al. [28] and Obeloer et al. [25] are agent-based techniques. These are flexible and configurable approaches but the amount of resources needed for agent implementation is considerably large. Grosu et al. [15] present a very different cooperative approach to the load balancing problem, considering it as a game in which each cluster node is a player and must minimize its job execution time. Banicescu et al. propose a load balancing library for scientific applications on distributed memory architectures. The library integrates dynamic loop scheduling as an object migration policy with the object migration mechanism provided by the data movement and control substrate which is extended with a mobile object layer [2]. Load balancing algorithms can also be classified as centralized or distributed. In the first case, there is a single central node in charge of keeping the system s information updated, making decisions and actually performing the load balancing operations. In distributed methods, every node takes part in the load balancing operations; Zaki et al. [44] show that distributed algorithms yield better results than their centralized counterparts. Last, load balancing algorithms can be classified as global or local. In the first case, a global view of the system state is kept [10]. In the second case, nodes are arranged in sets or domains, and distribution decisions are made only within each domain [9,40]. Other approaches mix this taxonomy by combining several features that could be considered mutually exclusive, like the work of Ahmad and Ghafoor [1], where a semidistributed algorithm with a two level hierarchy is presented; their work focus on static networks where communication latency is very important and depends on node placement. In this type of networks, distributed algorithms may produce instability, scalability and bottleneck problems. The improvement of dynamic network technologies solves these problems with broadcast solutions and very low latencies. The technique proposed by Ahmad and Ghafoor [1], although interesting, is not easily applicable to general, unrestricted distributed systems: it was developed for static network environments, where latency is dependent on node location and where broadcast operations are very costly in terms of system performance. Clusters, which in the present work appear as a very attractive option for CBIR systems in terms of cost/performance ratio, present very different communication features, and therefore, advise using a different approach. 
Although a set of projects have been developed to implement CBIR systems on clusters like the IRMA project for medical images [14] and the DISCOVIR project (distributed contentbased visual information retrieval system on peer-to-peer network) [13], none of them include a load balancing algorithm to distribute the workload of the cluster nodes and therefore they cannot manage system heterogeneity. 3. CBIR system description The experimental work presented in this paper has been performed on a test CBIR system containing information from 29.5 million color pictures. The system provides the user with a data set containing the p images considered most similar to the query one. If the result does not satisfy the user, he/she can choose one of the selected images or enter a new one that presents some kind of similarity with the desired image. The following sections describe the heart of the CBIR system, where the signature is extracted from each image (a feature vector describing the image content), as well as the processes involved in serving a user s query. More detailed analysis of the retrieval techniques involved in the CBIR system and the method s stages from the standpoint of parallel optimization can be found elsewhere [30,29,6], respectively Signature computation Many different approaches can be used for computing the images signatures, as mentioned in Section 2. In the work presented here, a primitive that represents the color information of the original image at different resolution levels has been selected. To achieve a multiresolution representation, a wavelet transform is first applied to the image [22,11] Analysis of the sequential CBIR algorithm The search for images contained in a CBIR system can be broken down into the following stages: (1) Input/query image introduction: The user first selects a pixel bidimensional image to be used as a search reference. Then the system computes its signature as described above. The whole process can be efficiently implemented using an O(i_s) order algorithm, i_s being the image s size [38]. This stage does not require high computational resources since the system deals with just one image. (2) Query and DB image s signature comparison and sorting: The signature obtained in the previous stage is compared with all of the DB images signatures using an Euclidean distance-based metric. After this process, the identifiers of the p images most similar to the input image are extracted, ranked by their similarity. Even though this process of signature comparison, selection and ranking is not very demanding from the computational point of view, it has to be performed with all of the images within the DB. (3) Results display: The following step is to assemble a mosaic made up of the selected p images which has to be presented to the user as the search result (see Fig. 1). (4) Query image update: If the user considers the search result to be unsatisfactory, he may select one of the displayed images as a new input and then return to the first stage.
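To make stage (2) of the list above concrete, the following minimal C sketch compares a query signature against a set of database signatures using a squared Euclidean distance and keeps the identifiers of the p closest ones. It is only an illustration of the ranking described above, not the authors' code: the signature length SIG_LEN, the value of P_BEST and the array layout are assumptions of this example.

    #include <float.h>
    #include <stddef.h>

    #define SIG_LEN 84          /* hypothetical signature length          */
    #define P_BEST  20          /* number of results returned to the user */

    /* Squared Euclidean distance between two signatures (monotonic in the
     * true distance, so it ranks identically and avoids the square root). */
    static double sig_distance(const float *a, const float *b)
    {
        double d = 0.0;
        for (int i = 0; i < SIG_LEN; i++) {
            double diff = (double)a[i] - (double)b[i];
            d += diff * diff;
        }
        return d;
    }

    /* Scan all database signatures and keep the P_BEST closest ones,
     * maintaining a small sorted list in the spirit of the repeatedly
     * sorted set P_j used by the slave processes in Section 3.3. */
    void rank_query(const float *query, const float *db, int num_sigs,
                    int best_id[P_BEST], double best_dist[P_BEST])
    {
        for (int i = 0; i < P_BEST; i++) { best_id[i] = -1; best_dist[i] = DBL_MAX; }

        for (int k = 0; k < num_sigs; k++) {
            double d = sig_distance(query, db + (size_t)k * SIG_LEN);
            if (d < best_dist[P_BEST - 1]) {              /* better than current worst */
                int pos = P_BEST - 1;
                while (pos > 0 && best_dist[pos - 1] > d) { /* shift worse entries down */
                    best_dist[pos] = best_dist[pos - 1];
                    best_id[pos]   = best_id[pos - 1];
                    pos--;
                }
                best_dist[pos] = d;
                best_id[pos]   = k;
            }
        }
    }

The same loop runs unchanged on each slave in the parallel version described next; only the subset of database signatures it scans differs.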

4 J.L. Bosque et al. / J. Parallel Distrib. Comput. 66 (2006) Fig. 1. Visual result of a query. Upon observing the operations involved, it is possible to notice that the comparison and sorting stage involves a much larger computational load than the others. Luckily, the exploitation of data parallelism can be done just by dividing the workload among n independent nodes, since there are no dependencies. This can be accomplished by distributing off-line the CBIR image s signatures across the processing nodes. Then, each node can compare the image query s signature with every available signature. In order to ease also the storage requirements, it is possible to distribute images, signatures and computation over all of the n available nodes Parallel implementations without load balancing Global strategy A remarkable feature of the signature comparison and sorting stage is the problem s fine granularity: it is possible to perform an efficient data-oriented parallelization by combining the signature comparison and sorting stages, and distributing among the different nodes only the data needed to perform this stage, which are the signatures of the DB images assigned to each node as well as a scalar defining the total number of signatures to be returned, p. It has to be noted that the amount of communications among the corresponding processes is very small, since only the input image s signature and the p identifiers from the most similar images which have been found at each node, together with their corresponding similarity measures, have to be exchanged among the processes involved, as we will see below. The programmed optimization strategy is based on a farm model, in which a master process distributes the data to be dealt with upon a set of slave processes which analyze the data and return the partial results to the master once they have finished their computations. Since this approach makes it possible to maintain a large degree of data handling locality, it is well suited for distributed memory multiprocessors with message passing communication. Further advantages of this solution are its good price/performance ratio and its high level of scalability, whenever the number of images stored in the database is increased. In our case, the following solution has been adopted: (1) The master process computes the signature of the input image and broadcasts it to the n slave processes. (2) The slave processes then proceed to compare the signature of the input image with the signatures of the images assigned to their corresponding process node. Once each comparison has been performed, a check is then carried out to ascertain whether the result obtained is one of the best p images and, should that be the case, it is then incorporated into the set which is repeatedly sorted using a bubble sorting algorithm. (3) The slave processes forward the p image identifiers and similarity measurements to the master process after comparing and selecting the p images which are most similar within each process node. (4) The master process collects the similarity results obtained from each of the n process nodes and sorts the n p similarity results, truncating the sort so as to include only the best p.

Fig. 2. Process communication in the cluster implementation without load balancing (the master broadcasts the query image signature, plus selected image identifiers if any, to slaves 1..n, and each slave returns its list of the p most similar images together with any requested images to show).

(5) Finally, the master process requests the process nodes that contain the previously selected images to forward them so that they may be presented to the user and, once available, proceeds to compose a mosaic that is then displayed to the user.

Fig. 2 represents a schematic diagram of the communication between the processes involved in the unbalanced system. It must be noticed that each node of the heterogeneous cluster runs two processes: a master to attend the user queries and a slave to provide the local results achieved by each process node to the master process of the cluster node where the query has been generated. This situation is very similar to that found on a grid.

3.3.2. MPI cluster implementation

The application has been programmed using the MPI libraries as communication primitives between the master and slave processes. MPI has been selected given that it currently constitutes a standard for message passing communications on parallel architectures, offering a good degree of portability among parallel platforms [23]. The MPI version used is LAM, from the Laboratory for Scientific Computing of Notre Dame University, a free distribution of MPI [36]. The pseudo-code corresponding to the implementation of the master and slave processes is shown below.

Master:
  loop
    Request an image from the user
    Compute its signature
    Forward the signature to each of the n slave processes using the MPI_BCAST (broadcast) primitive
    Receive the results of the n slave processes using the MPI_RECV (receive) primitive
    Sort the partial n·p comparisons, selecting the top p
    Request the p most similar images from the slave processes where the corresponding images are stored
    Receive the p most similar images from the nodes containing them using the MPI_RECV primitive
    Compose the mosaic to be presented to the user
  end loop

Slave j (1 ≤ j ≤ n), M being the number of images stored in process node j:
  loop
    Receive the signature of the query image forwarded from the master using the MPI_BCAST (reception from a previous broadcast) primitive
    Initialize the P_j set which shall contain the p best results of the comparisons
    for k = 1 to M do
      Find the signature of image k
      Compare the query signature with that of the current image, obtaining the similarity measurement ms_jk
      if ms_jk improves on the worst result currently in P_j then
        Eliminate the worst result of P_j
        Incorporate the result corresponding to image k into P_j
        Sort P_j using a bubble sorting algorithm
      end if
    end for
    Forward P_j to the master
    if the master requests images to compose the mosaic then
      Forward the requested images
    end if
  end loop
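The pseudo-code above maps almost directly onto MPI calls. The sketch below is a hedged illustration of the master's communication pattern for one query only; error handling, the mosaic composition and the signature comparison itself are omitted, and names such as SIG_LEN, P_BEST and the (id, distance) result layout are assumptions of the example, not the paper's actual data structures.

    #include <mpi.h>
    #include <stdlib.h>

    #define SIG_LEN 84   /* assumed signature length      */
    #define P_BEST  20   /* results requested per query   */

    /* Master side of one query: broadcast the signature, then gather each
     * slave's P_BEST (distance, identifier) pairs for the final merge.
     * Rank 0 is assumed to be the master; slaves are ranks 1..nslaves. */
    void master_query(float *query_sig, int nslaves, MPI_Comm comm)
    {
        /* Broadcast the query signature to every slave. */
        MPI_Bcast(query_sig, SIG_LEN, MPI_FLOAT, 0, comm);

        double *dist = malloc((size_t)nslaves * P_BEST * sizeof(double));
        int    *ids  = malloc((size_t)nslaves * P_BEST * sizeof(int));

        for (int s = 1; s <= nslaves; s++) {
            MPI_Recv(dist + (s - 1) * P_BEST, P_BEST, MPI_DOUBLE, s, 0, comm, MPI_STATUS_IGNORE);
            MPI_Recv(ids  + (s - 1) * P_BEST, P_BEST, MPI_INT,    s, 1, comm, MPI_STATUS_IGNORE);
        }

        /* ... merge the nslaves*P_BEST partial results, keep the global top
         *     P_BEST and request the corresponding images from the nodes
         *     that store them ... */

        free(dist);
        free(ids);
    }

On the slave side, the matching MPI_Bcast call with the same root receives the signature, and two MPI_Send calls return the partial results; in the actual system the exchange also includes requests for the selected images themselves.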

The size of the data corresponding to each one of the p best results that are transferred from every slave to the master is around 336 bytes. Therefore, each slave process transfers 336·p bytes per query. For example, for p = 20 and n = 25, the traffic involved in the response will be less than 165 kB (these figures do not take into consideration either the data corresponding to the images presented to the user or the overheads originated by the communication primitives, although the latter could be considered negligible).

4. Description of the load balancing algorithm

A dynamic, distributed, global and highly scalable load balancing algorithm has been developed for the CBIR application and tested with the CBIR parallel application previously described. A more detailed description of the load balancing algorithm can also be found in [7]. A load index based on the number of running tasks and the computational power of each node is used to determine the nodes' state, which is exchanged among all of the cluster nodes at regular time intervals. The initiation rule is receiver-triggered and based on workload thresholds. Finally, the distribution rule takes into account the heterogeneous nature of the cluster nodes as well as the communication time needed for the workload transmission in order to divide the amount of workload between a pair of nodes in every load balancing operation. These ideas are detailed along the following sections.

4.1. State rule

The load balancing algorithm is based on a load index which estimates how loaded a node is in comparison to the rest of the nodes that compose the cluster. Many approaches can be taken to compute the load index. Like in any estimation process, it is necessary to find a trade-off between accuracy and cost, since keeping frequently updated node rankings according to their workload might be costly. The index is based on the number of tasks in the run-queue of each CPU [20]. These data are exchanged among all of the nodes in the cluster to update the global state information. Moreover, each node takes into account the following information about the rest of the cluster nodes:

- Cluster heterogeneity: each node can have a different computational power P_i, so this factor is an important parameter to take into account when computing the load index. It is defined as the inverse of the time taken by node i to process a single signature.
- Total amount of workload for each node: it is evaluated when the application begins its execution and it is updated if there are any changes in a node.
- Percentage of the workload performed by each node, W_i: it is defined as a function of the total workload, the computational power and the number of tasks in this node.
- Period of time from the last update, D, and total execution time, T.

Therefore, the updates of the number of tasks are performed as

    N_ave = ((N_last · T) + (N_cur · D)) / (T + D),    (1)

where N_cur is the number of currently running tasks in the node, N_last is the average number of tasks running since the last update, T is the total execution time of the N_last tasks considered, and D is the interval of time since the last update. This expression gives the average number of tasks of the node during the execution time of the application. So, the percentage of workload processed in each node, W_i, is evaluated as

    W_i = (P_i · T) / (W · N_ave) · 100,    (2)

where W denotes the total workload (number of signatures) assigned to the node.
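As an illustration of Eqs. (1) and (2), the small C sketch below shows how a load daemon might maintain the time-averaged number of tasks and the percentage of processed workload. The structure fields and function names are invented for this example, and sampling the run-queue length is assumed to happen elsewhere in a platform-specific way.

    /* Per-node state used by the load index of Section 4.1 (names assumed). */
    struct node_state {
        double n_last;    /* average number of tasks since the start       */
        double t_total;   /* total execution time accounted so far (s)     */
        double power;     /* P_i: signatures processed per second          */
        double workload;  /* W: total number of signatures assigned        */
    };

    /* Eq. (1): fold the current run-queue length n_cur, observed over the
     * last interval of d seconds, into the running average. */
    void update_task_average(struct node_state *s, double n_cur, double d)
    {
        s->n_last   = (s->n_last * s->t_total + n_cur * d) / (s->t_total + d);
        s->t_total += d;
    }

    /* Eq. (2): percentage of this node's workload already processed. */
    double workload_percentage(const struct node_state *s)
    {
        return (s->power * s->t_total) / (s->workload * s->n_last) * 100.0;
    }

In the paper's implementation these values are refreshed at fixed intervals and broadcast to the other load daemons, as described next.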
4.2. Information rule

Given that the load balancing approach described here is dynamic, distributed and global, every node in the system needs updated information about how loaded the remaining system nodes are [42]. The selected information rule is periodic: each node broadcasts its own load index to the rest of the nodes at specific time instants. A periodic rule is necessary because each node has to compute the amount of workload processed by the rest of the cluster nodes, based on the average number of tasks per node. To evaluate the average number of tasks, the information has to be updated periodically, which makes other information rules, such as event-driven or on-demand rules, unsuitable.

4.3. Initiation rule

The initiation rule determines the instants at which load balancing operations are performed. It is a receiver-initiated rule, where load balancing operations involve pairs of idling and heavily loaded nodes: whenever a processor finishes its assigned workload, it looks for a busy node and asks it to share part of its remaining workload. Since each node keeps information about the amount of pending work of the remaining nodes, the selection of busy nodes is simple. The initiation rule described above minimizes the number of load balancing operations, reducing the algorithm overhead. Also, all the operations improve the system performance, because the total response times of the nodes involved in the load balancing operation are equalized, provided that there are no additional changes in their state and they are not involved in other load balancing operations.

4.4. Load balancing operation

The load balancing operation is broken down into three phases: first, it is necessary to find an adequate node which will provide part of its workload (localization rule). Then, the amount of workload to be transferred has to be computed (distribution rule). Finally, the workload has to be actually transferred.

4.4.1. Localization rule

Whenever a node finishes its workload, it looks for a sender node to start a load balancing operation. The receiver node checks the state of the rest of the cluster nodes and computes a node list, ordered by the amount of pending work. To select the sender node, the receiver checks its own position in the list and selects the node which is in the symmetric position; for example, if nodes are ranked according to their workload, the least loaded node will look for the most loaded one, the second least loaded node will look for the second most loaded, and so on. In consequence, each pair of sender and receiver nodes will have, between both of them, a similar amount of workload. Apart from being very simple to implement, this approach gives good results, since whenever a node finishes its work it is placed at one end of the list, selecting a heavily loaded node (at the other end of the list). This way, the selection of the sender node is very coherent: the underloaded nodes take workload from the overloaded nodes, while the nodes in the middle positions of the list do not receive a load balancing request (since it is very unlikely that a node placed in an intermediate position starts a load balancing operation). Additionally, if several nodes are looking for a sender at the same time, it is unlikely that they address their requests to the same sender. This way, situations where one loaded node receives several load balancing petitions while the rest of the loaded nodes do not receive any are avoided. Finally, this approach is not time consuming, because the nodes always have up-to-date state information with which to build their own list.

Whenever a node receives a load balancing request, it can accept or reject it. In order to accept it, the sender node should have a minimum amount of work left. Otherwise, the sender node is close to completing its workload and the cost of the load balancing operation can be higher than finishing the remaining workload locally. In that case, the receiver node will select another node from the list using the same procedure, until an adequate node is found or the end of the list is reached.
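As a rough illustration of this symmetric pairing, the C fragment below sorts a copy of the state table by pending work and picks the partner in the mirror position of the requesting node. It is a sketch only: the table layout, the pending-work field and the fixed-size local copy are assumptions of the example, not the paper's data structures.

    #include <stdlib.h>
    #include <string.h>

    struct node_info {
        int    id;            /* node identifier                  */
        double pending_work;  /* signatures still to be processed */
    };

    /* Sort by pending work, most loaded first. */
    static int by_pending_desc(const void *a, const void *b)
    {
        const struct node_info *x = a, *y = b;
        if (x->pending_work < y->pending_work) return  1;
        if (x->pending_work > y->pending_work) return -1;
        return 0;
    }

    /* Return the identifier of the sender chosen by node self_id,
     * i.e. the node in the position symmetric to its own in the list. */
    int choose_sender(const struct node_info *table, int n, int self_id)
    {
        struct node_info sorted[64];          /* assume n <= 64 for the sketch */
        memcpy(sorted, table, n * sizeof(*table));
        qsort(sorted, n, sizeof(*sorted), by_pending_desc);

        int my_pos = 0;
        for (int i = 0; i < n; i++)
            if (sorted[i].id == self_id) { my_pos = i; break; }

        return sorted[n - 1 - my_pos].id;     /* symmetric position */
    }

In the real algorithm the chosen sender may still reject the request if its remaining work is below the acceptance threshold, in which case the next candidate in the list is probed.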
4.4.2. Distribution rule

The distribution rule computes the amount of work that has to be moved from the sender to the receiver node. An appropriate rule should take into consideration the relative capabilities and availabilities of the nodes, so that they finish processing their jobs at the same time (provided that no additional operations change their processing conditions). The communication time needed to transfer the workload between the nodes is also taken into consideration, because the receiver node cannot run the newly assigned task until it receives the corresponding load, which introduces an additional delay. The global equilibrium is obtained through successive operations between pairs of nodes.

The proposed distribution rule is based on two parameters: the number of running tasks NT_i and the computational power P_i of the nodes which take part in the operation. This reflects the fact that the contribution of a powerful node might be hampered by a large amount of external workload. Both parameters are combined into the node's actual computational power, Pact_i, which is obtained as

    Pact_i = P_i / NT_i.    (3)

This is a multi-phase application with two different phases: comparison and sorting. Once the load balancing operation is finished, the sender node has to finish the comparison phase with the remaining workload. Then, it must sort all the processed workload. The receiver should compare and sort the new workload. Additionally, the communication time has to be taken into account, because the receiver cannot continue processing until it receives the new workload. The distribution rule is then determined by the following expressions:

    T_s = W_s / Pact_s + (W − W_r) / Pact_s,
    T_r = W_r / Pact_r + W_r / Pact_r + W_r / P_c,
    W = W_s + W_r,    (4)

where T_s and T_r are the response times of the sender and receiver processors, measured from the moment the load balancing operation is finished; W is the total workload of the sender which has not been processed yet, W_s is the workload remaining in the sender node after the load balancing operation, and W_r is the workload sent to the receiver. Pact_s and Pact_r are the current computational powers of the sender and the receiver. Finally, P_c is the communication power, expressed in units of workload per second. The communication power is obtained by computing off-line the number of signatures that can be exchanged between two of the cluster nodes per second. This model relies on two assumptions: the computational power of a node is the same in both the comparison and sorting phases, and the response time in both phases as well as the communication time are linear with respect to the workload. Equating T_s and T_r and solving these expressions, the sender and receiver workloads can be computed as

    W_r = (2 · W · Pact_r · P_c) / (2 · P_c · Pact_s + 2 · Pact_r · P_c + Pact_s · Pact_r),
    W_s = W − W_r.    (5)

The values of both workloads, W_r and W_s, take into account the heterogeneity of the nodes, their current state, the communication times and the two different phases of the application. In consequence, the load balancing algorithm described is dynamic, being able to redistribute workload among nodes at run time depending on how the system state changes. It is also distributed, because every node takes part in the load balancing operations. And finally, it is global, because a global view of the system state is always kept. The following section describes the implementation of this algorithm on a CBIR system running on a heterogeneous cluster.
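To close the description of the algorithm, the following small C sketch applies Eqs. (3)-(5) directly. It is a hedged illustration with invented parameter names; in the real implementation the quantities are counts of signatures and the resulting split is negotiated between the two distribution processes over MPI.

    /* Distribution rule of Section 4.4.2: split the sender's pending
     * workload w between sender and receiver so that both finish together. */
    struct split { double w_sender; double w_receiver; };

    /* p_s, p_r: raw computational powers; nt_s, nt_r: numbers of running
     * tasks; p_c: communication power (workload units per second);
     * w: the sender's pending workload. */
    struct split distribution_rule(double p_s, double nt_s,
                                   double p_r, double nt_r,
                                   double p_c, double w)
    {
        double pact_s = p_s / nt_s;                 /* Eq. (3) */
        double pact_r = p_r / nt_r;

        double w_r = (2.0 * w * pact_r * p_c) /
                     (2.0 * p_c * pact_s + 2.0 * pact_r * p_c
                      + pact_s * pact_r);           /* Eq. (5) */

        struct split s = { .w_sender = w - w_r, .w_receiver = w_r };
        return s;
    }

As a sanity check, for two nodes with equal actual powers and a very fast link (p_c much larger than both), w_r tends to w/2, i.e. the pending work is simply halved, as one would expect.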

5. Distributed load balancing implementation on a heterogeneous cluster

5.1. Process structure of the load balance implementation

Two replicated processes are distributed among each one of the cluster nodes:

(1) Load daemon: this process implements both the state and the information rules.
(2) Distribution daemon: it collects requests from slave nodes demanding workload and proceeds with the transference.

Fig. 3 shows a decomposition of all the actions that must be carried out when a slave node finishes its local workload and triggers the initiation rule. First, it requests new load from the distribution process, which obtains the demanded load and sends it to the slave node so that the computation can continue. The following section describes the structure of the groups of processes and their functions.

Fig. 3. General overview of the whole load balancing algorithm (the flow shown in the figure: the local distribution process receives the slave's request, asks the load daemon for a sender node's identifier, which the daemon selects by sorting its state table, sends a load request to the selected node's distribution process, receives the amount of load and the load itself, stores it, and finally sends the execution order to the slave).

5.2. Groups of processes

As mentioned in Section 3.3.2, communication and synchronization between processes are based on MPI. A structure of groups of processes based on communicators [23,26,35] has been implemented, where the groups make it possible to establish communication structures between processes and to use global communication functions over subsets of processes. This way, each type of process belongs to its own group:

- MPI_COMM_MS: this group is composed of the master process and all of the slave processes.
- MPI_COMM_DIST: this group is formed by the distribution processes.
- MPI_COMM_LOAD: this group is composed of the load daemons of each of the nodes.

The group concept is the most natural way to implement this process scheme, because most often the messages transmitted involve processes that belong to the same group. Fig. 4 presents this communication hierarchy.
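A minimal way to build such groups is MPI_Comm_split over MPI_COMM_WORLD, using the process role as the colour. The sketch below only illustrates that idea; the role numbering and the way roles are assigned to ranks are assumptions of the example, not necessarily the paper's actual scheme.

    #include <mpi.h>

    enum role { ROLE_MASTER_SLAVE = 0, ROLE_DIST = 1, ROLE_LOAD = 2 };

    /* Split MPI_COMM_WORLD into the three per-role communicators described
     * above: master+slaves, distribution daemons and load daemons. A process
     * that does not belong to a given group obtains MPI_COMM_NULL for it. */
    void build_groups(enum role my_role,
                      MPI_Comm *comm_ms, MPI_Comm *comm_dist, MPI_Comm *comm_load)
    {
        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        MPI_Comm_split(MPI_COMM_WORLD,
                       my_role == ROLE_MASTER_SLAVE ? 0 : MPI_UNDEFINED,
                       world_rank, comm_ms);
        MPI_Comm_split(MPI_COMM_WORLD,
                       my_role == ROLE_DIST ? 0 : MPI_UNDEFINED,
                       world_rank, comm_dist);
        MPI_Comm_split(MPI_COMM_WORLD,
                       my_role == ROLE_LOAD ? 0 : MPI_UNDEFINED,
                       world_rank, comm_load);
    }

The same effect can also be obtained with MPI_Group_incl followed by MPI_Comm_create; the split-by-colour form is simply the shortest way to express it.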

5.3. Load daemon

The main function of this process is to compute the local load index, to send this information to the load daemons of the other nodes and to transmit all of the information available to the local distribution process whenever it is required to do so. It is also in charge of initializing and managing a table that stores the state of the other nodes. The table stores the following information for each of the nodes: computational power, average number of active tasks while the application is running, percentage of completed work, time of the last update, total execution time with some load level, and number of signatures to be processed.

At predetermined fixed intervals, the process evaluates the load index of the node where it is running and sends the state information to the other nodes. The rest of the time it remains blocked waiting for messages from other processes; its functionality depends on the received messages. Table 1 summarizes the messages involved with the load daemon and the tasks associated with each one.

Fig. 4. Group scheme (MPI_COMM_WORLD contains MPI_COMM_MS with the master and the slaves, MPI_COMM_DIST with the distribution processes, and MPI_COMM_LOAD with the load daemons).

Table 1. Messages and associated functions of the load daemon.
Message identifier - Associated tasks
0 - Task information
1 - The distribution process has finished and demands the identifier of a transmitter node
2 - The distribution process informs about the number of signatures delivered to another node
3 - The distribution process notifies that there are no available nodes to transfer load
4 - The distribution process shows the number of signatures obtained from another node
5 - Another load daemon informs about its new number of signatures due to their transference to another node
6 - Another load daemon reports the new number of signatures assigned to it
7 - Another load daemon tells that there are no nodes to transfer load

5.4. Distribution process

The main function of the load distribution process is to implement the initiation rule and the load balancing operation. Whenever a particular slave finishes its local work, the distribution process is alerted; it then evaluates the initiation rule, finds a candidate node, carries out the negotiation and delivers the load to the slave. On the other hand, if the node receives a load balancing request, the distribution rule must be triggered and the appropriate workload is sent to the remote node.

6. Analysis of the CBIR implementation with load balancing in a heterogeneous cluster

A set of experiments has been performed to test the behavior of the parallel CBIR system implemented on the heterogeneous cluster using the above distributed load balancing algorithm. To compare the results achieved by the parallel CBIR system with and without the distributed load balancing algorithm, the total response time of the CBIR system, with and without load balancing, has been measured. Additionally, two classical load balancing algorithms have been implemented as references: the random algorithm [3] and the Probin algorithm [9]. The random algorithm is one of the simplest distributed load balancing algorithms, because each node makes decisions based on local information. A node is considered a sender if the queue length of its CPU exceeds a predetermined, constant threshold. The receiver is selected randomly, because the nodes do not share any information about the system status. The Probin algorithm is a diffusion-based algorithm, where the information is exchanged locally, defining communication domains between neighbor nodes. Several levels of coordination can be established by varying the domain size.

The experiments have been executed on a heterogeneous cluster composed of 25 nodes, linked through a 100 Mbit/s Ethernet. Each of the process nodes features 4 GB of storage capacity in an IDE hard disk linked through DMA with a 16.6 MB/s transfer speed. The PCs' operating system is Linux. The heterogeneity is determined by the hard disk features. It has to be noted that this component determines each node's response, as shown in Fig. 5, since in this CBIR system (as in many others), I/O operations are predominant with respect to CPU operations.

Fig. 5. Computational power of the cluster nodes, measured in workload units/second.
Two different tests have been performed to measure the improvement achieved by the heterogeneous cluster implementation using the distributed load balancing algorithm. The first one analyzes search operations within a 30 million image database using an underloaded system. Since none of the nodes is overloaded, this test studies how heterogeneity affects the system performance, and how this performance is improved using the load balancing algorithm. The second experiment adds some artificial external tasks to a node in order to test how well the load balancing algorithm copes with a situation of strong load imbalance. In this case the underloaded nodes would otherwise have to wait for the overloaded one to finish the application; the load balancing algorithm must eliminate the underloaded nodes' idle time.

Fig. 6. Results without external tasks (speedup with respect to the algorithm without load balancing): (a) response time and (b) speedup.

Table 2. Response time without external workload, measured in seconds. Columns: No. nodes; Without alg.; Random alg.; Probin alg.; Proposed alg.

Table 3. Speedup without external tasks. Columns: No. nodes; Speedup random alg.; Speedup Probin alg.; Speedup proposed alg.

6.1. Tests considering cluster heterogeneity and load balancing overhead

The main purposes of these tests were to detect the amount of overhead introduced by the load balancing algorithm, and how the algorithm can manage the system heterogeneity. The tests were performed on clusters with 5, 10, 15, 20 and 25 slave nodes plus a master node, in order to evaluate the algorithm scalability. The results are presented in Table 2 and in Fig. 6. Table 2 shows that the response times are always shorter with some load balancing algorithm, which means that the overhead introduced by the algorithm is smaller than the improvements achieved by using any of the implemented load balancing algorithms. From these results two main considerations can be pointed out:

- The tested load balancing algorithms always improved the response times, between 10% and 15%. The best results were achieved by the proposed algorithm.
- The proposed approach proved to be more stable, while the results obtained with the other algorithms were less consistent.

Fig. 6(b) and Table 3 present the speedup of these algorithms, where the speedup refers to these improvements. An interesting parameter for estimating the methods' behavior is the standard deviation of the response times of the different cluster nodes, shown in Table 4 and in Fig. 7. The standard deviation of the nodes' response times is a measurement directly related to idling times of nodes waiting for other nodes to finish their assignments.

Table 4. Standard deviation of the cluster nodes without external tasks. Columns: No. nodes; Without alg.; Random alg.; Probin alg.; Proposed alg.

Fig. 7. Standard deviation without external tasks (time in seconds vs. number of processors).

The load balancing algorithm presented here decreases the standard deviation, equilibrating the response times; with the random algorithm a slight reduction is achieved, while with the Probin algorithm this value is erratic, depending highly on the probed nodes. Finally, the proposed algorithm achieves the best values of all the load balancing algorithms tested, with reductions of the standard deviation ranging from 86.45% with 5 nodes to 93.56% with 25 nodes, with respect to the response times without a load balancing algorithm.

6.2. Results with system overload

For these experiments the system is slightly overloaded, with one of the nodes heavily loaded. The goal of this test is to measure the algorithm's ability to distribute the work of the loaded node among the remaining cluster nodes, without affecting the system performance. The tests were performed on a heterogeneous cluster with 5, 10, 15, 20, and 25 slave nodes and a master node, using a database of 12.5 million images. Table 5 and Fig. 8 present the results achieved in this experiment.

Fig. 8. Results with external tasks: (a) response time and (b) speedup.

Table 5. Response time with external tasks on a node, measured in seconds. Columns: No. nodes; Without alg.; Random alg.; Probin alg.; Proposed alg.

For these tests, the differences obtained between executions with and without load balancing were very strong. The reductions in response times range from 45% with 5 nodes to 38% with 25 nodes. As the number of nodes increases, the differences in response times decrease. Again, the best results are achieved with the proposed algorithm. Table 6 and Fig. 8(b) show the speedup achieved in these tests. Finally, Table 7 and Fig. 9 present the standard deviation results. In these tests, the reduction of the standard deviation ranged from 90% to 95%.

Table 6. Speedup with external tasks. Columns: No. nodes; Speedup random alg.; Speedup Probin alg.; Speedup proposed alg.

Table 7. Standard deviation with external tasks. Columns: No. nodes; Without alg.; Random alg.; Probin alg.; Proposed alg.

Fig. 9. Standard deviation with external tasks (time in seconds vs. number of processors).

An interesting point to be remarked is the lack of consistency of the results provided by the random algorithm. This method provides only marginal improvements with respect to the algorithm without load balancing for more than nodes. The Probin algorithm has a better behavior for fewer than 10 nodes, although the relative improvements drop dramatically above 15 nodes.
Finally, the proposed algorithm has very stable values, achieving better results when the number of nodes is increased. The method takes advantage of the availability of additional nodes, having them all finish within a short time interval.

6.3. Results increasing the load of a heavily loaded node

The last experiment presented in this paper has been performed by increasing the number of external tasks on a heavily loaded node. This way, the imbalance among the different nodes is higher, and the algorithm's behavior when confronted with highly overloaded nodes can be checked. Table 8 and Fig. 10 present the results achieved.

Table 8. Response time increasing the number of external tasks, measured in seconds for a 25 node cluster. Columns: No. tasks; Without alg.; Proposed alg.

Fig. 10. Response time considering a loaded node for a 25 node cluster (response time in seconds vs. number of external tasks per node, without algorithm and with the proposed algorithm).

From the results above it can be seen that the response times without any load balancing algorithm increase linearly with the external load. When the proposed load balancing algorithm is introduced, depending on the system's external workload status, only a small increment in the system's response time may appear. But globally, the response times remain almost invariant when the amount of external workload is increased, as can be seen in Table 8 and Fig. 10. This behavior proves that, as long as there are some underloaded nodes, the extra workload can be split among them and the application response time can be kept constant.

7. Conclusions and future work

This paper begins with an analysis of the operations involved in a typical CBIR system. From the analysis of the sequential version, a lack of data or algorithmic dependencies can be observed. This allows efficient cluster implementations of CBIR systems, since the cluster is a parallel architecture that meets the application needs very well [6]. Improvements on the cluster implementation have been made by introducing a dynamic, distributed, global and scalable load balancing algorithm which has been designed specifically for the parallel CBIR application implemented on a heterogeneous cluster. An additional important feature is that the load balancing algorithm takes into account the system heterogeneity originated both by the different node computational attributes and by external factors such as the presence of external tasks. The experiments presented here show that the amount of overhead introduced by this method is very small. In fact, this overhead is hidden by the improvements achieved whenever any degree of system heterogeneity shows up, a common situation in grid systems. All these experiments have also shown that using the load balancing algorithm results in large execution time reductions and in a more uniform distribution of the nodes' response times, which can be detected through strong reductions in the standard deviation of the response times. As shown in the experiments presented here, another important aspect that should be stressed is the algorithm's scalability: increasing the number of system nodes does not significantly change the execution time increments originated by the introduction of the load balancing algorithm. At this moment, considering a network with a much higher number of nodes is not possible with the available resources.
In any case, it is feasible and even simple to extend the current implementation to define a hierarchical algorithm using MPI communicators. The cluster version of the CBIR system that includes the load balancing algorithm is nowadays fully operative. Finally, further work will be devoted to the evaluation of the effects in the method s performance of using more complex node load indices and initiation rules. New efforts will be made in order to refine the primitives used in the CBIR system, and to introduce fault tolerance mechanisms in order to increase the system robustness. Analysis on the system response will also be made after distributing the database of the CBIR system between different clusters. Future migration of the implemented CBIR system to a grid will also be performed. Acknowledgments This work has been partially funded by the Spanish Ministry of Education and Science (Grant TIC C02) and Government of the Community of Madrid (Grant GR/SAL/0940/2004). References [1] I. Ahmad, A. Ghafoor, Semi-distributed load balancing for massively parallel multicomputer systems, IEEE Trans. Software Engrg. 17 (10) (1991) [2] I. Banicescu, R. Carino, J. Pabico, M. Balasubramaniam, Design and implementation of a novel dynamic load balancing library for cluster computing, Parallel Comput. 31 (7) (2005) [3] K.M. Baumgartner, B.W. Wah, Computer scheduling algorithms past, present and future, Inform. Sci (1991)

[4] G. Bell, J. Gray, What's next in high-performance computing?, Commun. ACM 45 (2) (2002).
[5] A.P. Berman, L.G. Shapiro, A flexible image database system for content-based retrieval, Comput. Vision Image Understanding 75 (1/2) (1999).
[6] J.L. Bosque, O.D. Robles, A. Rodríguez, L. Pastor, Study of a parallel CBIR implementation using MPI, in: V. Cantoni, C. Guerra (Eds.), Proceedings of the International Workshop on Computer Architectures for Machine Perception, IEEE CAMP 2000, Padova, Italy, 2000.
[7] J.L. Bosque, O.D. Robles, L. Pastor, Load balancing algorithms for CBIR environments, in: Proceedings of the International Workshop on Computer Architectures for Machine Perception, IEEE CAMP 2003, The Center for Advanced Computer Studies, University of Louisiana at Lafayette, IEEE, New Orleans, USA, 2003.
[8] T.L. Casavant, J.G. Kuhl, A taxonomy of scheduling in general-purpose distributed computing systems, in: T.L. Casavant, M. Singhal (Eds.), Readings in Distributed Computing Systems, IEEE Computer Society Press, Los Alamitos, CA, 1994.
[9] A. Corradi, L. Leonardi, F. Zambonelli, Diffusive load-balancing policies for dynamic applications, IEEE Concurrency 7 (1) (1999).
[10] S.K. Das, D.J. Harvey, R. Biswas, Parallel processing of adaptive meshes with load balancing, IEEE Trans. Parallel Distributed Systems 12 (12) (2001).
[11] I. Daubechies, Ten Lectures on Wavelets, vol. 61 of CBMS-NSF Regional Conference Series in Applied Mathematics, Society for Industrial and Applied Mathematics, Philadelphia, PA.
[12] A. del Bimbo, Visual Information Retrieval, Morgan Kaufmann Publishers, San Francisco, CA, 1999.
[13] Department of Computer Science and Engineering, The Chinese University of Hong Kong, DISCOVIR: Distributed Content-based Visual Information Retrieval System on Peer-to-Peer (P2P) Network, Web, URL miplab/discovir/.
[14] Department of Diagnostic Radiology, Department of Medical Informatics, Division of Medical Image Processing and Lehrstuhl für Informatik VI of the Aachen University of Technology (RWTH Aachen), IRMA: Image Retrieval in Medical Applications, Web, URL index_en.php.
[15] D. Grosu, A. Chronopoulos, M. Leung, Load balancing in distributed systems: an approach using cooperative games, in: 16th International Parallel and Distributed Processing Symposium, IPDPS '02, IEEE, 2002.
[16] N.J. Gunther, G. Beretta, A benchmark for image retrieval using distributed systems over the Internet: BIRDS-I, Technical Report HPL, Imaging Systems Laboratory, Hewlett Packard, December.
[17] R.M. Haralick, L.G. Shapiro, Computer and Robot Vision, vol. I, Addison-Wesley, Reading, MA, 1992.
[18] C.-C. Hui, S.T. Chanson, Hydrodynamic load balancing, IEEE Trans. Parallel Distributed Systems 10 (11) (1999).
[19] S. Khoshafian, A.B. Baker, Multimedia and Imaging Databases, Morgan Kaufmann, San Francisco, CA.
[20] T. Kunz, The influence of different workload descriptions on a heuristic load balancing scheme, IEEE Trans. Software Engrg. 17 (7) (1991).
[21] L.J. Latecki, R. Melter, A. Gross, et al., Special issue on shape representation and similarity for image databases, Pattern Recognition 35 (1).
[22] S. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Trans. Pattern Anal. Mach. Intell. 11 (7) (1989).
[23] MPI Forum, A message-passing interface standard.
[24] H. Müller, W. Müller, D.M. Squire, S. Marchand-Maillet, T. Pun, Performance evaluation in content-based image retrieval: overview and proposals, Pattern Recognition Lett. 22 (2001).
[25] W. Obeloer, C. Grewe, H. Pals, Load management with mobile agents, in: 24th Euromicro Conference, vol. 2, IEEE, 1998.
[26] P.S. Pacheco, Parallel Programming with MPI, Morgan Kaufmann Publishers Inc., San Francisco.
[27] J.S. Payne, L. Hepplewhite, T.J. Stonham, Evaluating content-based image retrieval techniques using perceptually based metrics, in: Proceedings of SPIE on Applications of Artificial Neural Networks in Image Processing IV, vol. 3647, SPIE, 1999.
[28] A. Rajagopalan, S. Hariri, An agent based dynamic load balancing system, in: International Workshop on Autonomous Decentralized Systems, IEEE, 2000.
[29] O.D. Robles, A. Rodríguez, M.L. Córdoba, A study about multiresolution primitives for content-based image retrieval using wavelets, in: M.H. Hamza (Ed.), IASTED International Conference on Visualization, Imaging, and Image Processing (VIIP 2001), IASTED, ACTA Press, Marbella, Spain, 2001.
[30] A. Rodríguez, O.D. Robles, L. Pastor, New features for content-based image retrieval using wavelets, in: F. Muge, R.C. Pinto, M. Piedade (Eds.), V Ibero-American Symposium on Pattern Recognition, SIARP 2000, Lisbon, Portugal, 2000.
[31] S. Santini, Exploratory Image Databases: Content-based Retrieval, Communications, Networking, and Multimedia, Academic Press, New York, 2001.
[32] S. Santini, A. Gupta, R. Jain, Emergent semantics through interaction in image databases, IEEE Trans. Knowledge Data Engrg. 13 (3) (2001).
[33] B. Schnor, S. Petri, R. Oleyniczak, H. Langendörfer, Scheduling of parallel applications on heterogeneous workstation clusters, in: K. Yetongnon, S. Hariri (Eds.), Proceedings of the ISCA Ninth International Conference on Parallel and Distributed Computing Systems, vol. 1, ISCA, Dijon, 1996.
[34] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-based image retrieval at the end of the early years, IEEE Trans. Pattern Anal. Mach. Intell. 22 (12) (2000).
[35] M. Snir, S.W. Otto, S. Huss-Lederman, D.W. Walker, J. Dongarra, MPI: The Complete Reference, The MIT Press, Cambridge.
[36] J.M. Squyres, K.L. Meyer, M.D. McNally, A. Lumsdaine, LAM/MPI User Guide, University of Notre Dame, LAM 6.3.
[37] S. Srakaew, N. Alexandridis, P.P. Nga, G. Blankenship, Content-based multimedia data retrieval on cluster system environment, in: P. Sloot, M. Bubak, A. Hoekstra, B. Hertzberger (Eds.), High-Performance Computing and Networking, Seventh International Conference, HPCN Europe 1999, Springer, Berlin, 1999.
[38] E.J. Stollnitz, T.D. DeRose, D.H. Salesin, Wavelets for Computer Graphics, Morgan Kaufmann Publishers, San Francisco.
[39] Y. Wang, Z. Liu, J.-C. Huang, Multimedia content analysis, IEEE Signal Process. Mag. 16 (6) (2000).
[40] M.H. Willebeek-LeMair, A.P. Reeves, Strategies for dynamic load balancing on highly parallel computers, IEEE Trans. Parallel Distributed Systems 4 (9) (1993).
[41] L. Xiao, S. Chen, X. Zhang, Dynamic cluster resource allocations for jobs with known and unknown memory demands, IEEE Trans. Parallel Distributed Systems 13 (3) (2002).
[42] C. Xu, F. Lau, Load Balancing in Parallel Computers: Theory and Practice, Kluwer Academic Publishers, Dordrecht.
[43] M.J. Zaki, Parallel and distributed association mining: a survey, IEEE Concurrency 7 (4) (1999).
[44] M.J. Zaki, S. Parthasarathy, W. Li, Customized Dynamic Load Balancing, vol. 1, Architectures and Systems, Prentice-Hall PTR, Upper Saddle River, NJ, 1999 (Chapter 24).
[45] A.Y. Zomaya, Y.-H. Teh, Observations on using genetic algorithms for dynamic load-balancing, IEEE Trans. Parallel Distributed Systems 12 (9) (2001).

Jose L. Bosque graduated in Computer Science and Engineering from the Universidad Politécnica de Madrid, where he also received the Ph.D. degree in Computer Science and Engineering. His Ph.D. was centered on theoretical models and algorithms for heterogeneous clusters. He has been an associate professor at the Universidad Rey Juan Carlos in Madrid, Spain. His research interests are parallel and distributed processing, performance and scalability evaluation, and load balancing.

Oscar D. Robles received his degree in Computer Science and Engineering and the Ph.D. degree from the Universidad Politécnica de Madrid in 1999 and 2004, respectively. His Ph.D. was centered on content-based image and video retrieval techniques on parallel architectures. Currently he is an Associate Professor at the Universidad Rey Juan Carlos and has published works in the fields of multimedia retrieval and parallel computer systems. His research interests include content-based multimedia retrieval, as well as computer vision and computer graphics. He is a Eurographics member.

Luis Pastor received the B.S.EE degree from the Universidad Politécnica de Madrid in 1981, the M.S.EE degree from Drexel University in 1983, and the Ph.D. degree from the Universidad Politécnica de Madrid. Currently he is a Professor at the Universidad Rey Juan Carlos (Madrid, Spain). His research interests include image processing and synthesis, virtual reality, 3D modeling, and parallel computing.

Angel Rodríguez received his degree in Computer Science and Engineering and the Ph.D. degree from the Universidad Politécnica de Madrid in 1991 and 1999, respectively. His Ph.D. was centered on the tasks of modeling and recognizing 3D objects in parallel architectures. He is an Associate Professor in the Photonics Technology Department, Universidad Politécnica de Madrid (UPM), Spain, and has published works in the fields of parallel computer systems, computer vision and computer graphics. He is an IEEE and an ACM member.
