A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment
Panagiotis D. Michailidis and Konstantinos G. Margaritis
Parallel and Distributed Processing Laboratory, Department of Applied Informatics, University of Macedonia, 16 Egnatia str., P.O. Box 191, 4006 Thessaloniki, Greece

Abstract. In this paper, we present three parallel approximate string matching methods on a parallel architecture with heterogeneous workstations to gain supercomputer power at low cost. The first method is the static master-worker with a uniform distribution strategy, the second is the dynamic master-worker with allocation of subtexts, and the third is the dynamic master-worker with allocation of text pointers. Further, we propose a hybrid parallel method that combines the advantages of the static and dynamic parallel methods in order to reduce both load imbalance and communication overhead. This hybrid method is based on the following optimal distribution strategy: the text collection is distributed proportionally to each workstation's speed. We evaluated the performance of the four methods with clusters of 1, 2, 4, 6 and 8 heterogeneous workstations. The experimental results demonstrate that the dynamic allocation of text pointers and the hybrid method achieve better performance than the two original ones.

1 Introduction

Approximate string matching is one of the main problems in classical string algorithms, with applications to information and multimedia retrieval, computational biology, pattern recognition, Web search engines and text mining. It is defined as follows: given a large text collection t = t_1 t_2 ... t_n of length n, a short pattern p = p_1 p_2 ... p_m of length m and a maximal number of errors allowed k, we want to find all text positions where the pattern matches the text with up to k errors. An error is the substitution, deletion or insertion of a character.
In the on-line version of the problem, it is possible to preprocess the pattern but not the text collection. The classical solution involves dynamic programming and needs O(mn) time [14]. Recently, a number of sequential algorithms have improved on the classical time-consuming one; see for instance the surveys [7,11]. Some of them are sublinear in the sense that they do not inspect all the characters of the text collection.
D. Kranzlmüller et al. (Eds.): Euro PVM/MPI 2002, LNCS 2474, © Springer-Verlag Berlin Heidelberg 2002
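The classical O(mn) dynamic programming solution [14] can be sketched as follows in C (a minimal illustration written for this summary, not the authors' code; the function name is ours). One column of the distance matrix is kept and updated per text character, and an occurrence is reported whenever the last cell drops to k or below:

```c
#include <assert.h>

/* Classical O(mn) dynamic programming for approximate matching [14]:
   col[i] holds the minimum edit distance between p[0..i-1] and some
   substring of the text ending at the current position.  Illustrative
   sketch; count_occurrences is our own (hypothetical) helper name. */
int count_occurrences(const char *t, int n, const char *p, int m, int k) {
    int col[m + 1];
    int occ = 0;
    for (int i = 0; i <= m; i++) col[i] = i;     /* distance against empty text */
    for (int j = 0; j < n; j++) {
        int prev = 0;                            /* C[0][j] = 0: a match may start anywhere */
        for (int i = 1; i <= m; i++) {
            int old = col[i];
            int best = prev + (p[i - 1] != t[j]);              /* match / substitution */
            if (col[i - 1] + 1 < best) best = col[i - 1] + 1;  /* insertion            */
            if (old + 1 < best) best = old + 1;                /* deletion             */
            col[i] = best;
            prev = old;
        }
        if (col[m] <= k) occ++;                  /* occurrence ends at position j */
    }
    return occ;
}
```

For instance, searching for "abc" in "abcabc" with k=0 reports two occurrences; allowing k=1 also admits matches with one substitution, insertion or deletion.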
We are particularly interested in information retrieval, where current free text collections are normally so large that even the fastest on-line sequential algorithms are not practical, and therefore parallel and distributed processing becomes necessary. There are two basic methods to improve the performance of approximate string matching on large text collections: one is based on the fine-grain parallelization of the approximate string matching algorithm [2,12,13,6,4,] and the other is based on distributing the computation of character comparisons over supercomputers or networks of workstations. As far as the second method is concerned, distributed implementations of approximate string matching algorithms are not available in the literature. However, we are aware of a few attempts at implementing similar problems on a cluster of workstations. In [3] an exact string matching implementation was proposed and results were reported on a transputer-based architecture. In [9,10] an exact string matching algorithm was parallelized and modeled on a homogeneous platform, giving positive experimental results. Finally, [,16] presented parallelizations of a biological sequence analysis algorithm on a homogeneous cluster of workstations and on an Intel iPSC/860 parallel computer, respectively. General efficient algorithms for the master-worker paradigm on heterogeneous clusters have been developed in [1].
The main contribution of this work is three low-cost parallel approximate string matching approaches that can search very large free textbases on an inexpensive cluster of heterogeneous PCs or workstations running the Linux operating system. These approaches are based on the master-worker model using static and dynamic allocation of the text collection.
Further, we propose a hybrid parallel approach that combines the advantages of the three previous parallel approaches in order to reduce load imbalance and communication overhead. This hybrid approach is based on the following optimal distribution strategy: the text collection is distributed proportionally to each workstation's speed. The four approaches are implemented using the MPI library [1] over a cluster of heterogeneous workstations. To the best of our knowledge, this is the first attempt at implementing an approximate string matching application using static and dynamic load balancing strategies on a network of heterogeneous workstations.

2 MPI Master-Worker Implementations of Approximate String Matching

We follow the master-worker programming model to develop our parallel and distributed approximate string matching implementations under the MPI library [1].

2.1 Static Master-Worker Implementation

In order to present the static master-worker implementation we make the following assumptions: first, the workstations are numbered from 0 to p-1; second, the documents of our text collection are distributed among the various workstations and stored on their local disks; and finally, the pattern and the number
of errors k are stored in the main memory of all workstations. The partitioning strategy of this approach is to partition the entire text collection into a number of subtext collections according to the number of workstations allocated. The size of each subtext collection is equal to the size of the text collection divided by the number of allocated workstations. The static master-worker implementation, which is called P1, is composed of four phases. In the first phase, the master broadcasts the pattern string and the number of errors k to all workers. In the second phase, each worker reads its subtext collection from the local disk into main memory. In the third phase, each worker performs character comparisons using a local sequential approximate string matching algorithm to generate the number of occurrences. In the fourth phase, the master collects the number of occurrences from each worker.
The advantage of this simple approach is low communication overhead, achieved a priori by assigning each worker to search its own subtext independently, without having to communicate with the other workers or the master. However, the main disadvantage is the possible load imbalance caused by this simple partitioning technique: there is significant idle time for faster or more lightly loaded workstations in a heterogeneous environment.

2.2 Dynamic Master-Worker Implementations

In this subsection, we implement two versions of the dynamic master-worker model. The first version is based on the dynamic allocation of subtexts and the second one is based on the dynamic allocation of text pointers.

Dynamic Allocation of Subtexts

The dynamic master-worker strategy that we adopted is a well-known parallelization strategy, known as the workstation farm.
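The uniform split behind P1 can be illustrated with a small helper (our own sketch, not the paper's code; `uniform_partition` is a hypothetical name): worker w of p receives a contiguous subtext of about n/p characters, with any remainder spread over the first workers:

```c
#include <assert.h>

/* Uniform partitioning for the static P1 scheme (illustrative sketch).
   Worker w of p gets floor(n/p) characters, plus one extra character
   if w < n mod p, so the p pieces cover the text exactly. */
void uniform_partition(long n, int p, int w, long *start, long *len) {
    long base = n / p;
    long rem  = n % p;
    *start = w * base + (w < rem ? w : rem);   /* offset of worker w's subtext */
    *len   = base + (w < rem ? 1 : 0);         /* its length */
}
```

For n=10 and p=3 the pieces are 4, 3 and 3 characters long, starting at offsets 0, 4 and 7; the imbalance on heterogeneous machines comes precisely from every worker getting the same share regardless of its speed.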
Before we present the dynamic implementation, we make the following assumption: the entire text collection is stored on the local disk of the master workstation. The dynamic master-worker implementation, which is called P2, is composed of six phases. In the first phase, the master broadcasts the pattern string and the number of errors k to all workers. In the second phase, the master reads several chunks of the text collection from the local disk. The size of each chunk (sb) is an important parameter which can affect the overall performance; it is directly related to the I/O and communication factors. We selected several chunk sizes in order to find the best performance, as presented in our experiments [8]. In the third phase, the master sends the first chunks of the text collection to the corresponding worker workstations. In the fourth phase, each worker workstation runs a sequential approximate string matching algorithm between the corresponding chunk of text and the pattern in order to generate the number of occurrences. In the fifth phase, each worker sends the number of occurrences back to the master workstation. In the sixth phase, if there are still chunks of the text collection left, the master reads and distributes the next chunks to the workers and loops back to the fourth phase.
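The balancing effect of this farm can be simulated serially (a toy sketch under assumed speeds, not the MPI implementation): each chunk of sb characters is handed to the worker that becomes free earliest, so faster workstations automatically process more chunks:

```c
#include <assert.h>

/* Toy serial simulation of the P2 workstation farm (assumed speeds,
   hypothetical helper).  Chunks are handed out to the earliest-free
   worker; count[w] records how many chunks worker w processed. */
void farm_schedule(const double *speed, int p, long nchunks, long sb, long *count) {
    double busy_until[p];                    /* time at which each worker is free */
    for (int w = 0; w < p; w++) { busy_until[w] = 0.0; count[w] = 0; }
    for (long c = 0; c < nchunks; c++) {
        int best = 0;                        /* pick the earliest-free worker */
        for (int w = 1; w < p; w++)
            if (busy_until[w] < busy_until[best]) best = w;
        busy_until[best] += (double)sb / speed[best];  /* time to scan one chunk */
        count[best]++;
    }
}
```

With one worker twice as fast as the other, six chunks split 4/2: the dynamic hand-out matches load to speed without knowing the speeds in advance, which is exactly why the farm reduces imbalance at the price of more master-worker traffic.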
The advantage of this dynamic approach is low load imbalance, while the disadvantage is a higher inter-workstation communication overhead.

Dynamic Allocation of Text Pointers

Before we present the dynamic implementation with text pointers, we make the following assumptions: first, the complete text collection is stored on the local disks of all workstations, and second, the master workstation keeps a text pointer that indicates the current position in the text collection. The dynamic allocation of text pointers, which is called P3, is composed of six phases. In the first phase, the master broadcasts the pattern string and the number of errors k to all workers. In the second phase, the master sends the first text pointers to the corresponding workers. In the third phase, each worker reads from its local disk the sb characters of text starting from the pointer it receives. In the fourth phase, each worker runs a sequential approximate string matching procedure between the corresponding chunk of text and the pattern in order to generate the number of occurrences. In the fifth phase, each worker sends the result back to the master. In the sixth phase, if the text pointer has not reached the end of the text, the master advances the text pointers to the positions of the next chunks, sends the pointers to the workers and loops back to the third phase.
The advantage of this implementation is that it reduces the inter-workstation communication overhead, since each workstation in this scheme holds an entire copy of the text collection on its local disk. However, this scheme requires more local disk space, although the local disks in parallel and distributed architectures are usually large enough.

2.3 Hybrid Master-Worker Implementation

Here, we develop a hybrid master-worker implementation that combines the advantages of the static and dynamic approaches in order to reduce both load imbalance and communication overhead.
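The master's pointer bookkeeping in the text-pointer scheme can be sketched as follows (our own illustrative helper, not the paper's code): the pointer advances by at most sb characters per hand-out, and a zero return signals that no text is left:

```c
#include <assert.h>

/* Text-pointer hand-out (illustrative sketch, hypothetical helper).
   *ptr is the master's current position in the n-character text; the
   worker is sent only the offset *off and reads the chunk from its
   local copy of the text collection. */
long next_chunk(long *ptr, long n, long sb, long *off) {
    if (*ptr >= n) return 0;                     /* end of text reached */
    *off = *ptr;
    long len = (n - *ptr < sb) ? (n - *ptr) : sb;
    *ptr += len;                                 /* advance the pointer */
    return len;                                  /* chunk length to scan */
}
```

Only an offset travels over the network per assignment, instead of sb characters of text, which is the source of the reduced communication overhead.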
This implementation is based on an optimal distribution strategy for the text collection that is performed statically. In the following subsection, we describe the optimal text distribution strategy and its implementation.

Text Distribution and Load Balancing

To prevent the slowest workstations from determining the parallel string matching time, the load should be distributed proportionally to the capacity of each workstation. The goal is to assign the same amount of time to each workstation, which may not correspond to the same amount of the text collection. A balanced distribution is achieved by a static load distribution made prior to the execution of the parallel operation. To achieve a well-balanced distribution among heterogeneous workstations, the fraction of text distributed to each workstation should be proportional to its processing capacity relative to the entire network:

l_i = S_i / \sum_{j=0}^{p-1} S_j    (1)
where S_j is the speed of workstation j. Therefore, the amount of the text collection that is distributed to each workstation M_i (1 <= i <= p) is l_i * n, where n is the length of the complete text collection. The hybrid implementation, which is called P4, is the same as the P1 implementation except that we use the optimal distribution method instead of the uniform one.
All four parallel implementations are constructed so that alternative sequential approximate string matching algorithms can be substituted quite easily [7,11]. In this paper, we use the classical SEL dynamic programming algorithm [14].

3 Experimental Results

In this section, we discuss the experimental results for the performance of the four parallel and distributed algorithms. These algorithms are implemented in the C programming language using the MPI library [1].

3.1 Experimental Environment

The target platform for our experimental study is a cluster of heterogeneous workstations connected by a 100 Mb/s Fast Ethernet network. More specifically, the cluster consists of 4 Pentium MMX 166 MHz machines with 32 MB RAM and 6 Pentium 100 MHz machines with 64 MB RAM. A Pentium MMX is used as the master workstation. The average speeds of the two types of workstations, Pentium MMX and Pentium, for the four implementations are listed in Table 1. The MPI implementation used on the network is MPICH version 1.2. During all experiments, the cluster of workstations was dedicated. Finally, to get reliable performance results, 10 executions were carried out for each experiment and the reported values are the averages. The text collection we used was composed of documents that were portions of various web pages.

3.2 Experimental Results

In this subsection, we present the experimental results from two sets of experiments. In the first experimental setup, we study the performance of the four master-worker implementations P1, P2, P3 and P4.
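As an aside before the measurements, the proportional shares l_i of Eq. (1), on which the hybrid method relies, can be sketched directly (our own illustrative helper with assumed speeds, not the paper's code):

```c
#include <assert.h>

/* Proportional text distribution for the hybrid scheme (illustrative
   sketch): workstation i receives the fraction l_i = S_i / sum_j S_j
   of the n text characters, as in Eq. (1).  Rounding is by truncation
   here; a real implementation would give the remainder to one worker. */
long proportional_share(const double *S, int p, int i, long n) {
    double total = 0.0;
    for (int j = 0; j < p; j++) total += S[j];   /* sum of all speeds */
    return (long)((double)n * S[i] / total);     /* this worker's share */
}
```

A workstation twice as fast as its peers thus receives twice as many characters, so each finishes its subtext in roughly the same time.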
In the second experimental setup, we examine the scalability of our implementations by doubling the text collection.

Table 1. Average speeds (in chars per sec) of the two types of workstations
Application | Pentium MMX | Pentium
Comparing the Four Types of Approximate String Matching Implementations

Before presenting the results for the four methods, we determined from the extensive experimental study [8] that a block size of about sb=100,000 characters produces optimal performance for the two dynamic master-worker methods P2 and P3; the later experiments are all performed using this optimal value for P2 and P3. Further, from [8] we observed that the worst performance is obtained for very small and very large block sizes. This is because small block sizes increase the inter-workstation communication, while large block sizes produce a poorly balanced load.
Figures 1 and 2 show the execution times and the speedup factors with respect to the number of workstations, respectively. It is important to note that the execution times and speedups plotted in Figures 1 and 2 are averages over five pattern lengths (m=, 10, 20, 30 and 60) and four values of the number of errors (k=1, 3, 6 and 9). The speedup of a heterogeneous computation is defined as the ratio of the sequential execution time on the fastest workstation to the parallel execution time across the heterogeneous cluster. To have a fair comparison in terms of speedup, one defines the system computing power, which considers the computing power available rather than the number of workstations. The system computing power is defined as \sum_{i=0}^{p-1} S_i/S_0 for p workstations, where S_0 is the speed of the master workstation.
As expected, the performance results show that the P1 implementation using the static load balancing strategy is less effective than the other three implementations on a heterogeneous network. This is due to the waiting time associated with communications: the slowest workstation is always the last one to finish the string matching computation.
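The fair speedup bound just described can be computed directly (a minimal sketch; `computing_power` is our own name for it): it sums each workstation's speed relative to the master's, so a cluster of identical machines gives back the machine count:

```c
#include <assert.h>

/* System computing power: sum over all p workstations of S_i / S_0,
   where S_0 is the master's speed.  For a homogeneous cluster this
   equals p, so heterogeneous speedups are judged against the power
   actually available rather than the raw workstation count. */
double computing_power(const double *S, int p) {
    double cp = 0.0;
    for (int i = 0; i < p; i++) cp += S[i] / S[0];
    return cp;
}
```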
Further, the P2 implementation using dynamic allocation of subtexts produces better results than P1 on the heterogeneous cluster. Finally, the experimental results show that the P3 and P4 implementations have the best performance of all on the heterogeneous cluster. These implementations give smaller execution times and higher speedups than the P1 and P2 ones when the network becomes heterogeneous, i.e. after the 3rd workstation.

Fig. 1. Experimental execution times (in seconds) for text size of 13MB and k=3 using several pattern lengths (left) and m=10 using several values of k (right)

Fig. 2. Speedup of parallel approximate string matching with respect to the number of workstations for text size of 13MB and k=3 using several pattern lengths (left) and m=10 using several values of k (right)

We now examine the performance of the P2, P3 and P4 parallel implementations. From the results, we see a clear reduction in the computation time of the algorithm when we use the three parallel implementations. For instance, with k=3 and several pattern lengths, we reduce the average computation time from 9.08 seconds in the sequential version to , and seconds in the distributed implementations P2, P3 and P4 respectively, using 8 workstations. In other words, from Figure 1 we observe that for a constant total text size there is the expected inverse relation between the parallel execution times and the number of workstations. Further, the three master-worker implementations achieve reasonable speedups for all workstation counts. For example, with k=3 and several pattern lengths, the speedup curves increase up to about .2, .86 and .91 for the distributed methods P2, P3 and P4 respectively on the 8 workstations, which had a computing power of .92, .93 and .97.

Scalability Issue

To study the scalability of the three proposed parallel implementations P2, P3 and P4, we set up the experiments in the following way: we simply double the old text size. This new text collection is around 27MB. Results from these experiments are depicted in Figures 3 and 4. The results show that the three parallel implementations still scale well even though the problem size has been doubled.
The average execution times for k=3 and several pattern lengths similarly decrease to , and seconds for the P2, P3 and P4 implementations respectively when the number of workstations is increased to 8. Moreover, the speedup factors of the three methods also increase almost linearly as workstations are added. Finally, the best performance results are obtained with the P3 and P4 load balancing methods.
Fig. 3. Experimental execution times (in seconds) for text size of 27MB and k=3 using several pattern lengths (left) and m=10 using several values of k (right)

Fig. 4. Speedup of parallel approximate string matching with respect to the number of workstations for text size of 27MB and k=3 using several pattern lengths (left) and m=10 using several values of k (right)

4 Conclusions

In this paper, we have presented four parallel and distributed approximate string matching implementations and their performance results on a low-cost cluster of heterogeneous workstations. We have observed from this extensive study that the P3 and P4 implementations produce better performance results, in terms of execution times and speedups, than the others. Higher gains in performance are expected for a larger number of varying-speed workstations in the network. Variants of the approximate string matching algorithm can be implemented directly on a cluster of heterogeneous workstations using the four text distribution methods reported here. We plan to develop a theoretical performance model in order to confirm the experimental behaviour of the four implementations on a heterogeneous cluster. Further, this model can be used to predict the execution time and similar performance metrics for the four approximate string matching implementations on larger clusters and problem sizes.
References

1. O. Beaumont, A. Legrand and Y. Robert, The master-slave paradigm with heterogeneous processors, Report LIP RR.
2. H. D. Cheng and K. S. Fu, VLSI architectures for string matching and pattern matching, Pattern Recognition, vol. 20, no. 1.
3. J. Cringean, R. England, G. Manson and P. Willett, Network design for the implementation of text searching using a multicomputer, Information Processing and Management, vol. 27, no. 4.
4. D. Lavenier, Speeding up genome computations with a systolic accelerator, SIAM News, vol. 31, no. 8, pp. 6-7.
5. D. Lavenier and J. L. Pacherie, Parallel processing for scanning genomic databases, in Proc. PARCO 97.
6. K. G. Margaritis and D. J. Evans, A VLSI processor array for flexible string matching, Parallel Algorithms and Applications, vol. 11, no. 1-2.
7. P. D. Michailidis and K. G. Margaritis, On-line approximate string searching algorithms: survey and experimental results, International Journal of Computer Mathematics, vol. 79, no. 8.
8. P. D. Michailidis and K. G. Margaritis, Performance evaluation of load balancing strategies for approximate string matching on a cluster of heterogeneous workstations, Tech. Report, Dept. of Applied Informatics, University of Macedonia.
9. P. D. Michailidis and K. G. Margaritis, String matching problem on a cluster of personal computers: experimental results, in Proc. International Conference on Systems for Automation of Engineering and Research.
10. P. D. Michailidis and K. G. Margaritis, String matching problem on a cluster of personal computers: performance modeling, in Proc. International Conference on Systems for Automation of Engineering and Research.
11. G. Navarro, A guided tour to approximate string matching, ACM Computing Surveys, vol. 33, no. 1.
12. N. Ranganathan and R. Sastry, VLSI architectures for pattern matching, International Journal of Pattern Recognition and Artificial Intelligence, vol. 8, no. 4.
13. R. Sastry, N. Ranganathan and K. Remedios, CASM: a VLSI chip for approximate string matching, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8.
14. P. H. Sellers, The theory and computation of evolutionary distances: pattern recognition, Journal of Algorithms, vol. 1.
15. M. Snir, S. Otto, S. Huss-Lederman, D. W. Walker and J. Dongarra, MPI: The Complete Reference, The MIT Press, Cambridge, Massachusetts.
16. T. K. Yap, O. Frieder and R. L. Martino, Parallel computation in biological sequence analysis, IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 3.
Optimization of Cluster Web Server Scheduling from Site Access Statistics Nartpong Ampornaramveth, Surasak Sanguanpong Faculty of Computer Engineering, Kasetsart University, Bangkhen Bangkok, Thailand
More informationThe Efficiency Analysis of the Object Oriented Realization of the Client-Server Systems Based on the CORBA Standard 1
S C H E D A E I N F O R M A T I C A E VOLUME 20 2011 The Efficiency Analysis of the Object Oriented Realization of the Client-Server Systems Based on the CORBA Standard 1 Zdzis law Onderka AGH University
More informationAn Ants Algorithm to Improve Energy Efficient Based on Secure Autonomous Routing in WSN
An Ants Algorithm to Improve Energy Efficient Based on Secure Autonomous Routing in WSN *M.A.Preethy, PG SCHOLAR DEPT OF CSE #M.Meena,M.E AP/CSE King College Of Technology, Namakkal Abstract Due to the
More informationCombining Scalability and Efficiency for SPMD Applications on Multicore Clusters*
Combining Scalability and Efficiency for SPMD Applications on Multicore Clusters* Ronal Muresano, Dolores Rexachs and Emilio Luque Computer Architecture and Operating System Department (CAOS) Universitat
More informationPerformance Metrics and Scalability Analysis. Performance Metrics and Scalability Analysis
Performance Metrics and Scalability Analysis 1 Performance Metrics and Scalability Analysis Lecture Outline Following Topics will be discussed Requirements in performance and cost Performance metrics Work
More informationA Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems
A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Aysan Rasooli Department of Computing and Software McMaster University Hamilton, Canada Email: rasooa@mcmaster.ca Douglas G. Down
More informationA Load Balancing Technique for Some Coarse-Grained Multicomputer Algorithms
A Load Balancing Technique for Some Coarse-Grained Multicomputer Algorithms Thierry Garcia and David Semé LaRIA Université de Picardie Jules Verne, CURI, 5, rue du Moulin Neuf 80000 Amiens, France, E-mail:
More informationPerformance Modeling and Analysis of a Database Server with Write-Heavy Workload
Performance Modeling and Analysis of a Database Server with Write-Heavy Workload Manfred Dellkrantz, Maria Kihl 2, and Anders Robertsson Department of Automatic Control, Lund University 2 Department of
More informationPerformance Characteristics of a Cost-Effective Medium-Sized Beowulf Cluster Supercomputer
Res. Lett. Inf. Math. Sci., 2003, Vol.5, pp 1-10 Available online at http://iims.massey.ac.nz/research/letters/ 1 Performance Characteristics of a Cost-Effective Medium-Sized Beowulf Cluster Supercomputer
More informationRevoScaleR Speed and Scalability
EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution
More informationLocality-Preserving Dynamic Load Balancing for Data-Parallel Applications on Distributed-Memory Multiprocessors
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 18, 1037-1048 (2002) Short Paper Locality-Preserving Dynamic Load Balancing for Data-Parallel Applications on Distributed-Memory Multiprocessors PANGFENG
More informationDynamic load balancing of parallel cellular automata
Dynamic load balancing of parallel cellular automata Marc Mazzariol, Benoit A. Gennart, Roger D. Hersch Ecole Polytechnique Fédérale de Lausanne, EPFL * ABSTRACT We are interested in running in parallel
More informationPerformance of Scientific Processing in Networks of Workstations: Matrix Multiplication Example
Performance of Scientific Processing in Networks of Workstations: Matrix Multiplication Example Fernando G. Tinetti Centro de Técnicas Analógico-Digitales (CeTAD) 1 Laboratorio de Investigación y Desarrollo
More informationReal Time Network Server Monitoring using Smartphone with Dynamic Load Balancing
www.ijcsi.org 227 Real Time Network Server Monitoring using Smartphone with Dynamic Load Balancing Dhuha Basheer Abdullah 1, Zeena Abdulgafar Thanoon 2, 1 Computer Science Department, Mosul University,
More informationIntroduction to Cloud Computing
Introduction to Cloud Computing Parallel Processing I 15 319, spring 2010 7 th Lecture, Feb 2 nd Majd F. Sakr Lecture Motivation Concurrency and why? Different flavors of parallel computing Get the basic
More informationObservations on Data Distribution and Scalability of Parallel and Distributed Image Processing Applications
Observations on Data Distribution and Scalability of Parallel and Distributed Image Processing Applications Roman Pfarrhofer and Andreas Uhl uhl@cosy.sbg.ac.at R. Pfarrhofer & A. Uhl 1 Carinthia Tech Institute
More informationBENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
More informationAdvances in Smart Systems Research : ISSN 2050-8662 : http://nimbusvault.net/publications/koala/assr/ Vol. 3. No. 3 : pp.
Advances in Smart Systems Research : ISSN 2050-8662 : http://nimbusvault.net/publications/koala/assr/ Vol. 3. No. 3 : pp.49-54 : isrp13-005 Optimized Communications on Cloud Computer Processor by Using
More informationLoad Balancing on a Grid Using Data Characteristics
Load Balancing on a Grid Using Data Characteristics Jonathan White and Dale R. Thompson Computer Science and Computer Engineering Department University of Arkansas Fayetteville, AR 72701, USA {jlw09, drt}@uark.edu
More informationEnergy Efficient MapReduce
Energy Efficient MapReduce Motivation: Energy consumption is an important aspect of datacenters efficiency, the total power consumption in the united states has doubled from 2000 to 2005, representing
More informationBenchmarking Hadoop & HBase on Violin
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
More informationHow To Compare Load Sharing And Job Scheduling In A Network Of Workstations
A COMPARISON OF LOAD SHARING AND JOB SCHEDULING IN A NETWORK OF WORKSTATIONS HELEN D. KARATZA Department of Informatics Aristotle University of Thessaloniki 546 Thessaloniki, GREECE Email: karatza@csd.auth.gr
More informationScalable Parallel Clustering for Data Mining on Multicomputers
Scalable Parallel Clustering for Data Mining on Multicomputers D. Foti, D. Lipari, C. Pizzuti and D. Talia ISI-CNR c/o DEIS, UNICAL 87036 Rende (CS), Italy {pizzuti,talia}@si.deis.unical.it Abstract. This
More informationKeywords: Dynamic Load Balancing, Process Migration, Load Indices, Threshold Level, Response Time, Process Age.
Volume 3, Issue 10, October 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Load Measurement
More informationControl 2004, University of Bath, UK, September 2004
Control, University of Bath, UK, September ID- IMPACT OF DEPENDENCY AND LOAD BALANCING IN MULTITHREADING REAL-TIME CONTROL ALGORITHMS M A Hossain and M O Tokhi Department of Computing, The University of
More informationA Performance Comparison of Five Algorithms for Graph Isomorphism
A Performance Comparison of Five Algorithms for Graph Isomorphism P. Foggia, C.Sansone, M. Vento Dipartimento di Informatica e Sistemistica Via Claudio, 21 - I 80125 - Napoli, Italy {foggiapa, carlosan,
More informationA Study on Workload Imbalance Issues in Data Intensive Distributed Computing
A Study on Workload Imbalance Issues in Data Intensive Distributed Computing Sven Groot 1, Kazuo Goda 1, and Masaru Kitsuregawa 1 University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Abstract.
More informationResource Allocation Schemes for Gang Scheduling
Resource Allocation Schemes for Gang Scheduling B. B. Zhou School of Computing and Mathematics Deakin University Geelong, VIC 327, Australia D. Walsh R. P. Brent Department of Computer Science Australian
More informationSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in Cloud Prof. Pranalini S. Ketkar Ankita Bhimrao Patkure IT Department, DCOER, PG Scholar, Computer Department DCOER, Pune University Pune university
More informationParallel sorting on Intel Single-Chip Cloud computer
Parallel sorting on Intel Single-Chip Cloud computer Kenan Avdic, Nicolas Melot, Jörg Keller 2, and Christoph Kessler Linköpings Universitet, Dept. of Computer and Inf. Science, 5883 Linköping, Sweden
More informationA Statistically Customisable Web Benchmarking Tool
Electronic Notes in Theoretical Computer Science 232 (29) 89 99 www.elsevier.com/locate/entcs A Statistically Customisable Web Benchmarking Tool Katja Gilly a,, Carlos Quesada-Granja a,2, Salvador Alcaraz
More informationMeasuring MPI Send and Receive Overhead and Application Availability in High Performance Network Interfaces
Measuring MPI Send and Receive Overhead and Application Availability in High Performance Network Interfaces Douglas Doerfler and Ron Brightwell Center for Computation, Computers, Information and Math Sandia
More informationDavid Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems
David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationEvolutionary Prefetching and Caching in an Independent Storage Units Model
Evolutionary Prefetching and Caching in an Independent Units Model Athena Vakali Department of Informatics Aristotle University of Thessaloniki, Greece E-mail: avakali@csdauthgr Abstract Modern applications
More informationParallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage
Parallel Computing Benson Muite benson.muite@ut.ee http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework
More informationWITH the availability of large data sets in application
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 16, NO. 10, OCTOBER 2004 1 Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance Ruoming
More informationA Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique
A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique Jyoti Malhotra 1,Priya Ghyare 2 Associate Professor, Dept. of Information Technology, MIT College of
More informationOn-Demand Supercomputing Multiplies the Possibilities
Microsoft Windows Compute Cluster Server 2003 Partner Solution Brief Image courtesy of Wolfram Research, Inc. On-Demand Supercomputing Multiplies the Possibilities Microsoft Windows Compute Cluster Server
More informationUnderstanding Data Locality in VMware Virtual SAN
Understanding Data Locality in VMware Virtual SAN July 2014 Edition T E C H N I C A L M A R K E T I N G D O C U M E N T A T I O N Table of Contents Introduction... 2 Virtual SAN Design Goals... 3 Data
More informationPARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN
1 PARALLEL & CLUSTER COMPUTING CS 6260 PROFESSOR: ELISE DE DONCKER BY: LINA HUSSEIN Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster Construction
More informationEmail Spam Detection Using Customized SimHash Function
International Journal of Research Studies in Computer Science and Engineering (IJRSCSE) Volume 1, Issue 8, December 2014, PP 35-40 ISSN 2349-4840 (Print) & ISSN 2349-4859 (Online) www.arcjournals.org Email
More informationV:Drive - Costs and Benefits of an Out-of-Band Storage Virtualization System
V:Drive - Costs and Benefits of an Out-of-Band Storage Virtualization System André Brinkmann, Michael Heidebuer, Friedhelm Meyer auf der Heide, Ulrich Rückert, Kay Salzwedel, and Mario Vodisek Paderborn
More informationMizan: A System for Dynamic Load Balancing in Large-scale Graph Processing
/35 Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing Zuhair Khayyat 1 Karim Awara 1 Amani Alonazi 1 Hani Jamjoom 2 Dan Williams 2 Panos Kalnis 1 1 King Abdullah University of
More informationA Fast Pattern Matching Algorithm with Two Sliding Windows (TSW)
Journal of Computer Science 4 (5): 393-401, 2008 ISSN 1549-3636 2008 Science Publications A Fast Pattern Matching Algorithm with Two Sliding Windows (TSW) Amjad Hudaib, Rola Al-Khalid, Dima Suleiman, Mariam
More informationEnhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications
Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications Ahmed Abdulhakim Al-Absi, Dae-Ki Kang and Myong-Jong Kim Abstract In Hadoop MapReduce distributed file system, as the input
More informationKey Words: Dynamic Load Balancing, and Distributed System
DYNAMIC ROTATING LOAD BALANCING ALGORITHM IN DISTRIBUTED SYSTEMS ROSE SULEIMAN AL DAHOUD ALI ISSA OTOUM Al-Zaytoonah University Al-Zaytoonah University Neelain University rosesuleiman@yahoo.com aldahoud@alzaytoonah.edu.jo
More informationRelating Empirical Performance Data to Achievable Parallel Application Performance
Published in Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'99), Vol. III, Las Vegas, Nev., USA, June 28-July 1, 1999, pp. 1627-1633.
More informationMining Association Rules on Grid Platforms
UNIVERSITY OF TUNIS EL MANAR FACULTY OF SCIENCES OF TUNISIA Mining Association Rules on Grid Platforms Raja Tlili raja_tlili@yahoo.fr Yahya Slimani yahya.slimani@fst.rnu.tn CoreGrid 11 Plan Introduction
More informationMining Interesting Medical Knowledge from Big Data
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 1, Ver. II (Jan Feb. 2016), PP 06-10 www.iosrjournals.org Mining Interesting Medical Knowledge from
More informationA Comparative Performance Analysis of Load Balancing Algorithms in Distributed System using Qualitative Parameters
A Comparative Performance Analysis of Load Balancing Algorithms in Distributed System using Qualitative Parameters Abhijit A. Rajguru, S.S. Apte Abstract - A distributed system can be viewed as a collection
More informationHadoop Scheduler w i t h Deadline Constraint
Hadoop Scheduler w i t h Deadline Constraint Geetha J 1, N UdayBhaskar 2, P ChennaReddy 3,Neha Sniha 4 1,4 Department of Computer Science and Engineering, M S Ramaiah Institute of Technology, Bangalore,
More informationChapter 12: Multiprocessor Architectures. Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup
Chapter 12: Multiprocessor Architectures Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup Objective Be familiar with basic multiprocessor architectures and be able to
More informationHPC Wales Skills Academy Course Catalogue 2015
HPC Wales Skills Academy Course Catalogue 2015 Overview The HPC Wales Skills Academy provides a variety of courses and workshops aimed at building skills in High Performance Computing (HPC). Our courses
More informationGrid Computing Approach for Dynamic Load Balancing
International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-1 E-ISSN: 2347-2693 Grid Computing Approach for Dynamic Load Balancing Kapil B. Morey 1*, Sachin B. Jadhav
More informationPGGA: A Predictable and Grouped Genetic Algorithm for Job Scheduling. Abstract
PGGA: A Predictable and Grouped Genetic Algorithm for Job Scheduling Maozhen Li and Bin Yu School of Engineering and Design, Brunel University, Uxbridge, UB8 3PH, UK {Maozhen.Li, Bin.Yu}@brunel.ac.uk Man
More informationScheduling and Load Balancing in the Parallel ROOT Facility (PROOF)
Scheduling and Load Balancing in the Parallel ROOT Facility (PROOF) Gerardo Ganis CERN E-mail: Gerardo.Ganis@cern.ch CERN Institute of Informatics, University of Warsaw E-mail: Jan.Iwaszkiewicz@cern.ch
More informationLos Angeles, CA, USA 90089-2561 [kunfu, rzimmerm]@usc.edu
!"$#% &' ($)+*,#% *.- Kun Fu a and Roger Zimmermann a a Integrated Media Systems Center, University of Southern California Los Angeles, CA, USA 90089-56 [kunfu, rzimmerm]@usc.edu ABSTRACT Presently, IP-networked
More informationTowards a Load Balancing in a Three-level Cloud Computing Network
Towards a Load Balancing in a Three-level Cloud Computing Network Shu-Ching Wang, Kuo-Qin Yan * (Corresponding author), Wen-Pin Liao and Shun-Sheng Wang Chaoyang University of Technology Taiwan, R.O.C.
More information22S:295 Seminar in Applied Statistics High Performance Computing in Statistics
22S:295 Seminar in Applied Statistics High Performance Computing in Statistics Luke Tierney Department of Statistics & Actuarial Science University of Iowa August 30, 2007 Luke Tierney (U. of Iowa) HPC
More informationNetwork Attached Storage. Jinfeng Yang Oct/19/2015
Network Attached Storage Jinfeng Yang Oct/19/2015 Outline Part A 1. What is the Network Attached Storage (NAS)? 2. What are the applications of NAS? 3. The benefits of NAS. 4. NAS s performance (Reliability
More informationEfficient Iceberg Query Evaluation for Structured Data using Bitmap Indices
Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil
More information