Performance Study of Parallel Programming Paradigms on a Multicore Cluster using Ant Colony Optimization for Job-flow Scheduling Problems
Nagaveni V #, Dr. G T Raju *

# Department of Computer Science and Engineering, Bharathiar University, Coimbatore, Tamilnadu, India
* Department of Computer Science and Engineering, RNS Institute of Technology, Bangalore, Karnataka, India

Abstract: Parallel programming is an important technique for a wide range of compute-intensive applications that use multicore and manycore architectures. These architectures are better suited to clusters than to grids, and are supported by different network topologies. Message passing and hybrid programming are the two techniques employed on multicore clusters. Ant Colony Optimization (ACO) is an important heuristic-based approach used to optimize code, with high computational speed compared to other approaches. ACO is tested using job-flow scheduling problems as a benchmark and compared under both programming approaches; the results show that parallel ACO gives better performance.

Keywords: Parallel Programming, Message Passing, Hybrid Programming, ACO, Job-flow Scheduling Problems

I. INTRODUCTION

Research on distributed and parallel systems is one of the most important areas in computer science today. In particular, multicore architectures used in clusters, grids and clouds, supported by networks with different characteristics and topologies, are employed both in the development of parallel algorithms and in the execution of compute-intensive processes. A multicore processor integrates two or more computational cores on the same chip. These cores are individually simpler and slower, but combined they enhance the global performance of the processor while making efficient use of energy. Incorporating this type of processor into ordinary clusters yields an architecture known as a multicore cluster.
These architectures establish communications between heterogeneous processing units, which can be divided into two groups: internode and intranode. Internode communications take place between cores in different nodes, which exchange messages through the interconnection network. Intranode communications take place between cores within the same node, which communicate through the node's different memory levels. Parallel programming paradigms are classified into three groups, based on how tasks communicate and synchronize: shared memory, message passing and hybrid programming. In shared-memory architectures such as multicores, tasks communicate and synchronize by reading and writing variables in a shared address space; OpenMP is the most widely used library for shared-memory programming. Message passing is the paradigm most commonly chosen for distributed architectures such as traditional clusters; MPI is the most widely used library under this paradigm. Multicore clusters are hybrid architectures that combine distributed memory with shared memory, so the scientific community has great interest in analyzing hybrid parallel programming paradigms that allow communication both through message passing and through shared memory. The performance of two parallel algorithms designed for the same application but using different programming paradigms is compared on a multicore cluster. The selected application is a divide-and-conquer technique based on the Ant Colony Optimization algorithm, chosen because its computational complexity is O(n^3). The Ant Colony Optimization algorithm is explained, together with the sequential and parallel algorithms used. In Section VI the experimental work carried out is described, whereas in Section VII the results obtained are presented and analyzed. Finally, in Section VIII the conclusions and future lines of work are presented.
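The shared-memory communication described above, where tasks cooperate by reading and writing variables in a common address space, can be illustrated with a minimal OpenMP sketch (an assumption for illustration, not code from the paper). Compiled without -fopenmp the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>

/* Shared-memory paradigm in miniature: threads of one process combine
 * partial sums into a single shared result via an OpenMP reduction. */
long shared_sum(const int *data, int n)
{
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++)
        total += data[i];   /* each thread accumulates a private copy,
                               merged into the shared total at the end */
    return total;
}
```

Under MPI, by contrast, each process would hold only its slice of `data` and the partial sums would be combined with an explicit collective such as `MPI_Reduce`.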
Job-flow scheduling is a well-known and important problem suitable for execution on multicore clusters. The job-shop scheduling problem (JSP) is known to be strongly NP-hard; hence the JSP with m objectives must also be NP-hard. Mathematical programming approaches for solving the multi-objective scheduling problem are computationally intractable for practical instances.

II. RELATED WORK

Numerous works analyze and compare parallel programming paradigms for multicore clusters. To mention but a few, there are different comparisons between message passing and combinations of message passing with shared memory. However, the results vary depending on the type of problem solved, the algorithms used and the features of the hardware architectures employed, which makes research in this area even more significant.
III. SEQUENTIAL ANT COLONY OPTIMIZATION ALGORITHM

The sequential ant colony optimization algorithm (SACO) is a probabilistic technique for solving computational problems that can be reduced to finding good paths through graphs. It is a member of the ant colony algorithms family of swarm intelligence methods, and constitutes a metaheuristic optimization. The original idea has been diversified to solve a wider class of numerical problems, and as a result several variants have emerged, drawing on various aspects of the behavior of ants.

A. Convergence

For some versions of the algorithm it is possible to prove convergence (i.e., that the algorithm is able to find the global optimum in finite time). The first convergence proof for an ant colony algorithm was given in 2000 for the graph-based ant system, and proofs for other algorithms, such as Ant Colony System, followed. As with most metaheuristics, it is very difficult to estimate the theoretical speed of convergence.

1. Example pseudo-code

procedure ACO_MetaHeuristic
    while (not_termination)
        generateSolutions()
        daemonActions()
        pheromoneUpdate()
    end while
end procedure

B. Edge selection

An ant is a simple computational agent in the ant colony optimization algorithm. It iteratively constructs a solution for the problem at hand; the intermediate solutions are referred to as solution states. At each iteration of the algorithm, each ant moves from a state x to a state y, corresponding to a more complete intermediate solution.

C. Pheromone update

When all the ants have completed a solution, the trails are updated by the local pheromone update algorithm.

IV. PARALLEL ANT COLONY OPTIMIZATION (PACO)

PACO is a population-based approach based on the social behavior of ants, in which parallel computing is used to reduce computation time, improve solution quality, or both. Most parallel ACO implementations can be classified into two general approaches.
The first is the parallel execution of the ants' construction phase within a single colony. Initiated by Bullnheimer et al., it aims to accelerate computations by distributing ants to computing elements. The second, introduced by Stützle, is the execution of multiple ant colonies. In this case entire ant colonies are assigned to processors in order to speed up computations, as well as to potentially improve solution quality by introducing cooperation schemes between colonies. Recently, a more detailed classification was proposed by Pedemonte et al. It shows that most existing work designs parallel ACO algorithms at a relatively high level of abstraction, which may be suitable for conventional parallel computers. However, as research on parallel architectures rapidly evolves, new types of hardware have recently become available for high performance computing. Among them are multicore processors, which provide great computing power at an affordable cost but are more difficult to program.

D. Parallelization strategies

In PACO algorithms, artificial ants cooperate while exploring the search space, seeking good solutions to the problem through communication mediated by artificial pheromone trails. The solution construction process is incremental: a solution is built by adding solution components to an initially empty solution under construction. The main steps of the Ant System (AS) algorithm are: initialization, ants' solution construction, ants' solution evaluation and pheromone trail updating. Algorithm 2 gives a pseudo-code of AS.

Algorithm 2: Pseudo-code of Ant System.
// Initialization phase
Initialize pheromone trails τ;
Initialize heuristic information η;
// Iterative phase
while termination criteria not met do
    Ants' solution construction;
    Ants' solution evaluation;
    Pheromone trail updating;
end while
Return the best solution;

After setting the parameters, the first step of the algorithm is the initialization procedure, which initializes the heuristic information and the pheromone trails. When all ants have constructed a complete path (a feasible solution), the solutions are evaluated. Then the pheromone trails are updated according to the quality of the candidate solutions found, and a certain level of evaporation is applied. When the iterative phase is complete, that is, when the termination criteria are met, the algorithm returns the best solution generated.

E. Hardware-oriented parallel ACO

Even though they mostly follow the parallel-ants and multiple-ant-colonies approaches, hardware-oriented approaches target specific, non-traditional parallel architectures. Scheuermann et al. [21, 22] designed parallel implementations of ACO on Field Programmable Gate Arrays (FPGAs).

Fig 1. Taxonomy for parallel ACO
In addition to a complete survey, Pedemonte et al. proposed a taxonomy for parallel ACO, illustrated in Fig. 1. Although it provides a comprehensive view of the field, its relatively high level of abstraction does not capture some features that are crucial for obtaining efficient implementations on modern high performance computing architectures.

Fig 2. Architectures used for parallel ACO.

Fig. 2 provides a conceptual view of parallel ACO that relates more closely to real parallel architectures. By bringing together the high-level concepts of parallel ACO and the lower-level parallel computing models, it aims to serve as a methodological framework for the design of efficient ACO implementations.

F. A new architecture-oriented taxonomy for parallel ACO

The efficient implementation of a parallel metaheuristic in optimization software generally requires consideration of the underlying architecture. Inspired by Talbi, we distinguish the following main parallel architectures: clusters/networks of workstations, symmetric multiprocessors/multicore processors, and grids. Clusters and Networks of Workstations (COWs/NOWs) are distributed-memory architectures in which each processor has its own memory. Information exchange between processors requires explicit message passing, which implies programming effort and communication costs. NOWs may be seen as heterogeneous groups of computers, whereas COWs are homogeneous, unified computing devices.

G. Message Passing as Parallel Programming Paradigm

This solution uses a master-slave model of P processes as its parallelization strategy. The distances of each newly created node are distributed among all processes following a circular order (the process to which a node is assigned is its owner).

H. Combination of Message Passing and Shared Memory as Parallel Programming Paradigm

This solution is based on the one described in Section 3.2.1, but unlike it, each process generates T threads when computation begins.
Then the iterations belonging to the different process loops are distributed among the generated threads.

V. CASE STUDIES

Two case studies are presented to illustrate how the proposed framework relates to real implementations. To cover the two main parallelization strategies for ACO, both parallel-ants and multi-colony approaches are proposed. In the first case, SMP and multicore processors are considered as the underlying architecture. The section concludes with a general discussion of how this taxonomy applies to most other combinations of ACO algorithms and parallel architectures.

I. Multi-Colony Parallel ACO on an SMP and Multicore Architecture

This approach deals with the management of multiple colonies that use a global shared memory to exchange information. The whole algorithm executes on a single system and a single node, so there is no parallelism at those levels. The colonies are executed in parallel and spawn multiple parallel ants; therefore, colonies are associated with processes and ants with threads. At the programming level, this can be implemented either with multiple operating system processes and multiple threads, or with multiple nested threads. In this implementation we choose the latter, as the available SMP node supports nested threads with a shared memory accessible to all processors. Parallelizing ACO into multiple search processes is quite simple: we only need to create a parallel region at the beginning of the sequential algorithm. This way we can create as many threads as there are colonies. To illustrate the scheme of multiple interacting colonies in a shared-memory model, the simple case of a common best global solution located in shared memory is implemented. In a shared-memory context there is no explicit broadcast communication step; it is replaced by using the global best solution as a dedicated structure in shared memory. However, it is now used differently and more frequently.
At each information exchange step, each thread compares the local value of its best solution with the global best solution; if the local solution has lower cost, it becomes the new global best known solution.

VI. IMPLEMENTATION

Tests were carried out on a cluster of multicores with four blades, each with two quad-core Intel Xeon processors. Each blade has 10 GB of RAM (shared between both processors) and 2 x 6 MB of L2 cache for each pair of cores. The operating system is GNU/Linux Fedora 11 (64 bits).

A. Algorithms Used

The algorithms used in this work were developed in C (gcc compiler version 4.4.2) with the MPI library (mpicc compiler version 1.4.3) for message passing and OpenMP for thread management. The algorithms are detailed below:

MP: this algorithm is based on the solution described in Section 3.2.1, where P is the number of cores used.

HP: this algorithm is based on the solution described in Section 3.2.2, where P is the number of blades used and T is the number of cores in each blade.
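The shared-memory information-exchange step described in Section V.I can be sketched with OpenMP (a minimal sketch under assumed names; the cost values in the test are illustrative, not the paper's data). Each colony runs as a thread and replaces the global best solution under a critical section when its local best is cheaper. Compiled without -fopenmp the pragmas are ignored and the code runs serially with the same result.

```c
#include <assert.h>

/* Global best solution held in shared memory, visible to all colonies. */
typedef struct { double cost; int owner; } solution_t;

static solution_t global_best = { 1e30, -1 };

/* One information-exchange step: compare a colony's local best against
 * the shared global best; update it atomically if the local one wins. */
void exchange_step(int colony_id, double local_cost)
{
    #pragma omp critical(best_update)
    {
        if (local_cost < global_best.cost) {
            global_best.cost = local_cost;
            global_best.owner = colony_id;
        }
    }
}

/* Run ncolonies colonies in parallel, one exchange step each. */
void run_colonies(const double local_costs[], int ncolonies)
{
    #pragma omp parallel for
    for (int c = 0; c < ncolonies; c++)
        exchange_step(c, local_costs[c]);
}
```

The critical section replaces the explicit broadcast a message-passing version would need: every colony reads and writes the same `global_best` structure directly.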
B. Tests Carried Out

Based on the features of the architecture, both algorithms were tested using all the cores with different numbers of nodes: two, three and four. This means P = {8, 16, 24} for MP. In the case of HP, one process per node was used, so P = {2, 3, 4} and T = 8. Various problem sizes were used: N = {3000, 5000, 7000, 9000}. Each particular test was run five times, and the average execution time was calculated for each of them.

VII. RESULTS

To assess the behavior of the developed algorithms when scaling the problem and/or the architecture, the speedup and efficiency of the tests carried out are analyzed. The speedup metric is used to analyze algorithm performance on the parallel architecture, as indicated in Equation (1):

SpeedUp = SequentialTime / ParallelTime (1)

To assess how good the obtained speedup is, the efficiency metric is calculated. Equation (2) indicates how to compute this metric, where p is the total number of cores used:

Efficiency = SpeedUp / p (2)

Figure 3 shows the speedup achieved by the algorithms MP and HP when using 8, 16 and 24 cores for different problem sizes (N).

Fig 3. Speedup obtained with the parallel algorithms for various problem sizes using different numbers of cores of the architecture.

Fig 4. Efficiency achieved by algorithms MP and HP when using two, three and four blades of the architecture for different problem sizes (N).

The chart in Figure 4 shows that both algorithms increase their efficiency as the problem size increases and, as is to be expected in most parallel systems, efficiency decreases when the total number of nodes used increases.
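Equations (1) and (2) translate directly into code. The sketch below computes both metrics; the timing values in the usage are illustrative, not the paper's measurements.

```c
#include <assert.h>

/* Equation (1): SpeedUp = SequentialTime / ParallelTime. */
double speedup(double t_seq, double t_par)
{
    return t_seq / t_par;
}

/* Equation (2): Efficiency = SpeedUp / p, with p total cores used. */
double efficiency(double t_seq, double t_par, int p)
{
    return speedup(t_seq, t_par) / p;
}
```

For example, a run that takes 120 s sequentially and 20 s on 8 cores has a speedup of 6 and an efficiency of 0.75, i.e., each core contributes 75% of its ideal share.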
The efficiency levels obtained with both algorithms are low due to the number of communication and synchronization operations carried out, and the idle time that processes and threads may have. Despite this, the best efficiency levels are obtained by HP. The difference in favor of HP is due to several factors. First, HP reduces latency and maximizes the bandwidth of the interconnection network: by using a single multi-threaded process instead of multiple processes per node, it groups all the task messages corresponding to a node into a single, larger message, and it removes competition for the network at node level. Finally, since the distance matrix d is divided into fewer parts and the work assigned to each portion is distributed dynamically among the threads of each process, HP achieves a more balanced work distribution than the fully static strategy used by MP.

VIII. CONCLUSION AND FUTURE WORK

The performance of two parallel programming paradigms (message passing and hybrid) is compared on a current cluster architecture, taking job-flow scheduling problems solved by means of the Ant Colony Optimization algorithm as a case study. The algorithms were tested using various workload and architecture sizes. The results obtained show that the hybrid parallelization better leverages the hardware features offered by the underlying architecture, which in turn yields better performance. Future lines of work include the development and optimization of hybrid solutions for other types of applications, and their comparison with solutions based on message passing, shared memory, or both.

REFERENCES

[1] E. Rucci, F. Chichizola, M. Naiouf and A. De Giusti, "Performance Comparison of Parallel Programming Paradigms on a Multicore Cluster".
[2] T. Rauber and G. Rünger, Parallel Programming.
[3] A.M. Abdelbar, "Is there a computational advantage to representing evaporation rate in ant colony optimization as a Gaussian random variable?", Proceedings of the Fourteenth International Conference on Genetic and Evolutionary Computation (GECCO-12).
[4] A.M. Abdelbar and D.C. Wunsch, "Improving the performance of MAX-MIN ant system on the TSP using stubborn ants", Proceedings of the Fourteenth International Conference on Genetic and Evolutionary Computation (GECCO-12) Conference Companion.
[5] M. Dorigo and T. Stützle, "Ant colony optimization: overview and recent advances", in M. Gendreau and Y. Potvin, eds., Handbook of Metaheuristics, 2nd edition, Springer-Verlag, New York, pp. 227-263.
[6] J. Dongarra, I. Foster, G. Fox, W. Gropp, K. Kennedy, L. Torczon and A. White, The Sourcebook of Parallel Computing, Morgan Kaufmann.
[7] M. Miller, Web-Based Applications That Change the Way You Work and Collaborate Online, Que.
[8] A. Ezzat and A.M. Abdelbar, "A Less Exploitative Variation of the Enhanced Ant Colony System Applied to SOP", IEEE Congress on Evolutionary Computation, 2013.
[9] J. Falcou, J. Sérot, T. Chateau and J.-T. Lapresté, "Quaff: efficient C++ design for parallel skeletons", Parallel Computing, vol. 32, no. 7-8.
[10] J. Reinders, Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism, O'Reilly.
[11] F. Choobineh, E. Mohebbi and H. Khoo (2006), "A multi-objective tabu search for a single-machine scheduling problem with sequence-dependent setup times", European Journal of Operational Research, 175.
[12] M. Dorigo and T. Stützle, Ant Colony Optimization, The MIT Press, Cambridge, MA.
[13] T. Eren and E. Güner (2008), "A bicriteria flowshop scheduling with a learning effect", Applied Mathematical Modelling, 32.
[14] J. Gao, M. Gen, L. Sun and X. Zhao (2007), "A hybrid of genetic algorithm and bottleneck shifting for multiobjective flexible job shop scheduling problems", Computers & Industrial Engineering, 53.
[15] H. Hoogeveen (2004), "Multicriteria scheduling", European Journal of Operational Research, 167(3).
More informationMulticore Parallel Computing with OpenMP
Multicore Parallel Computing with OpenMP Tan Chee Chiang (SVU/Academic Computing, Computer Centre) 1. OpenMP Programming The death of OpenMP was anticipated when cluster systems rapidly replaced large
More informationA STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti 2, Nidhi Rajak 3
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti, Nidhi Rajak 1 Department of Computer Science & Applications, Dr.H.S.Gour Central University, Sagar, India, ranjit.jnu@gmail.com
More informationGPU System Architecture. Alan Gray EPCC The University of Edinburgh
GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems
More informationA Performance Comparison of GA and ACO Applied to TSP
A Performance Comparison of GA and ACO Applied to TSP Sabry Ahmed Haroun Laboratoire LISER, ENSEM, UH2C Casablanca, Morocco. Benhra Jamal Laboratoire LISER, ENSEM, UH2C Casablanca, Morocco. El Hassani
More informationACO FOR OPTIMAL SENSOR LAYOUT
Stefka Fidanova 1, Pencho Marinov 1 and Enrique Alba 2 1 Institute for Parallel Processing, Bulgarian Academy of Science, Acad. G. Bonchev str. bl.25a, 1113 Sofia, Bulgaria 2 E.T.S.I. Informatica, Grupo
More informationCluster Computing at HRI
Cluster Computing at HRI J.S.Bagla Harish-Chandra Research Institute, Chhatnag Road, Jhunsi, Allahabad 211019. E-mail: jasjeet@mri.ernet.in 1 Introduction and some local history High performance computing
More informationMuACOsm A New Mutation-Based Ant Colony Optimization Algorithm for Learning Finite-State Machines
MuACOsm A New Mutation-Based Ant Colony Optimization Algorithm for Learning Finite-State Machines Daniil Chivilikhin and Vladimir Ulyantsev National Research University of IT, Mechanics and Optics St.
More informationFPGA-based Multithreading for In-Memory Hash Joins
FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded
More informationA Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique
A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique Jyoti Malhotra 1,Priya Ghyare 2 Associate Professor, Dept. of Information Technology, MIT College of
More informationA Comparative Study of Scheduling Algorithms for Real Time Task
, Vol. 1, No. 4, 2010 A Comparative Study of Scheduling Algorithms for Real Time Task M.Kaladevi, M.C.A.,M.Phil., 1 and Dr.S.Sathiyabama, M.Sc.,M.Phil.,Ph.D, 2 1 Assistant Professor, Department of M.C.A,
More informationHybrid Algorithm using the advantage of ACO and Cuckoo Search for Job Scheduling
Hybrid Algorithm using the advantage of ACO and Cuckoo Search for Job Scheduling R.G. Babukartik 1, P. Dhavachelvan 1 1 Department of Computer Science, Pondicherry University, Pondicherry, India {r.g.babukarthik,
More informationACO Hypercube Framework for Solving a University Course Timetabling Problem
ACO Hypercube Framework for Solving a University Course Timetabling Problem José Miguel Rubio, Franklin Johnson and Broderick Crawford Abstract We present a resolution technique of the University course
More informationIntelligent Heuristic Construction with Active Learning
Intelligent Heuristic Construction with Active Learning William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, Hugh Leather E H U N I V E R S I T Y T O H F G R E D I N B U Space is BIG! Hubble Ultra-Deep Field
More informationMulti-Threading Performance on Commodity Multi-Core Processors
Multi-Threading Performance on Commodity Multi-Core Processors Jie Chen and William Watson III Scientific Computing Group Jefferson Lab 12000 Jefferson Ave. Newport News, VA 23606 Organization Introduction
More informationultra fast SOM using CUDA
ultra fast SOM using CUDA SOM (Self-Organizing Map) is one of the most popular artificial neural network algorithms in the unsupervised learning category. Sijo Mathew Preetha Joy Sibi Rajendra Manoj A
More informationOptimizing Resource Consumption in Computational Cloud Using Enhanced ACO Algorithm
Optimizing Resource Consumption in Computational Cloud Using Enhanced ACO Algorithm Preeti Kushwah, Dr. Abhay Kothari Department of Computer Science & Engineering, Acropolis Institute of Technology and
More informationComparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster
Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster Gabriele Jost and Haoqiang Jin NAS Division, NASA Ames Research Center, Moffett Field, CA 94035-1000 {gjost,hjin}@nas.nasa.gov
More informationBiological inspired algorithm for Storage Area Networks (ACOSAN)
Biological inspired algorithm for Storage Area Networks (ACOSAN) Anabel Fraga Vázquez 1 1 Universidad Carlos III de Madrid Av. Universidad, 30, Leganés, Madrid, SPAIN afraga@inf.uc3m.es Abstract. The routing
More informationThe Fastest Way to Parallel Programming for Multicore, Clusters, Supercomputers and the Cloud.
White Paper 021313-3 Page 1 : A Software Framework for Parallel Programming* The Fastest Way to Parallel Programming for Multicore, Clusters, Supercomputers and the Cloud. ABSTRACT Programming for Multicore,
More informationA hybrid ACO algorithm for the Capacitated Minimum Spanning Tree Problem
A hybrid ACO algorithm for the Capacitated Minimum Spanning Tree Problem Marc Reimann Marco Laumanns Institute for Operations Research, Swiss Federal Institute of Technology Zurich, Clausiusstrasse 47,
More informationA Multi-Objective Extremal Optimisation Approach Applied to RFID Antenna Design
A Multi-Objective Extremal Optimisation Approach Applied to RFID Antenna Design Pedro Gómez-Meneses, Marcus Randall and Andrew Lewis Abstract Extremal Optimisation (EO) is a recent nature-inspired meta-heuristic
More informationA Service Revenue-oriented Task Scheduling Model of Cloud Computing
Journal of Information & Computational Science 10:10 (2013) 3153 3161 July 1, 2013 Available at http://www.joics.com A Service Revenue-oriented Task Scheduling Model of Cloud Computing Jianguang Deng a,b,,
More informationApplication Partitioning on Programmable Platforms Using the Ant Colony Optimization
Application Partitioning on Programmable Platforms Using the Ant Colony Optimization Gang Wang, Member, IEEE, Wenrui Gong, Student Member, IEEE, and Ryan Kastner, Member, IEEE Manuscript received Month
More informationFPGA area allocation for parallel C applications
1 FPGA area allocation for parallel C applications Vlad-Mihai Sima, Elena Moscu Panainte, Koen Bertels Computer Engineering Faculty of Electrical Engineering, Mathematics and Computer Science Delft University
More informationAnt Colony Optimization and Constraint Programming
Ant Colony Optimization and Constraint Programming Christine Solnon Series Editor Narendra Jussien WILEY Table of Contents Foreword Acknowledgements xi xiii Chapter 1. Introduction 1 1.1. Overview of the
More informationOn-line scheduling algorithm for real-time multiprocessor systems with ACO
International Journal of Intelligent Information Systems 2015; 4(2-1): 13-17 Published online January 28, 2015 (http://www.sciencepublishinggroup.com/j/ijiis) doi: 10.11648/j.ijiis.s.2015040201.13 ISSN:
More informationObtaining Optimal Software Effort Estimation Data Using Feature Subset Selection
Obtaining Optimal Software Effort Estimation Data Using Feature Subset Selection Abirami.R 1, Sujithra.S 2, Sathishkumar.P 3, Geethanjali.N 4 1, 2, 3 Student, Department of Computer Science and Engineering,
More informationCellular Computing on a Linux Cluster
Cellular Computing on a Linux Cluster Alexei Agueev, Bernd Däne, Wolfgang Fengler TU Ilmenau, Department of Computer Architecture Topics 1. Cellular Computing 2. The Experiment 3. Experimental Results
More informationImplementing Ant Colony Optimization for Test Case Selection and Prioritization
Implementing Ant Colony Optimization for Test Case Selection and Prioritization Bharti Suri Assistant Professor, Computer Science Department USIT, GGSIPU Delhi, India Shweta Singhal Student M.Tech (IT)
More informationAn Improved Ant Colony Optimization Algorithm for Software Project Planning and Scheduling
An Improved Ant Colony Optimization Algorithm for Software Project Planning and Scheduling Avinash Mahadik Department Of Computer Engineering Alard College Of Engineering And Management,Marunje, Pune Email-avinash.mahadik5@gmail.com
More informationOpenMP Programming on ScaleMP
OpenMP Programming on ScaleMP Dirk Schmidl schmidl@rz.rwth-aachen.de Rechen- und Kommunikationszentrum (RZ) MPI vs. OpenMP MPI distributed address space explicit message passing typically code redesign
More informationResearch on the Performance Optimization of Hadoop in Big Data Environment
Vol.8, No.5 (015), pp.93-304 http://dx.doi.org/10.1457/idta.015.8.5.6 Research on the Performance Optimization of Hadoop in Big Data Environment Jia Min-Zheng Department of Information Engineering, Beiing
More informationLecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
More informationEvolving an Edge Selection Formula for Ant Colony Optimization
Evolving an Edge Selection Formula for Ant Colony Optimization Andrew Runka Dept. of Computer Science Brock University St. Catharines, Ontario, Canada ar03gg@brocku.ca ABSTRACT This project utilizes the
More informationSolving Traveling Salesman Problem by Using Improved Ant Colony Optimization Algorithm
Solving Traveling Salesman Problem by Using Improved Ant Colony Optimization Algorithm Zar Chi Su Su Hlaing and May Aye Khine, Member, IACSIT Abstract Ant colony optimization () is a heuristic algorithm
More informationImproving System Scalability of OpenMP Applications Using Large Page Support
Improving Scalability of OpenMP Applications on Multi-core Systems Using Large Page Support Ranjit Noronha and Dhabaleswar K. Panda Network Based Computing Laboratory (NBCL) The Ohio State University Outline
More informationTEST CASE SELECTION & PRIORITIZATION USING ANT COLONY OPTIMIZATION
TEST CASE SELECTION & PRIORITIZATION USING ANT COLONY OPTIMIZATION Bharti Suri Computer Science Department Assistant Professor, USIT, GGSIPU New Delhi, India bhartisuri@gmail.com Shweta Singhal Information
More informationA Study of Local Optima in the Biobjective Travelling Salesman Problem
A Study of Local Optima in the Biobjective Travelling Salesman Problem Luis Paquete, Marco Chiarandini and Thomas Stützle FG Intellektik, Technische Universität Darmstadt, Alexanderstr. 10, Darmstadt,
More informationA Hybrid Technique for Software Project Scheduling and Human Resource Allocation
A Hybrid Technique for Software Project Scheduling and Human Resource Allocation A. Avinash, Dr. K. Ramani Department of Information Technology Sree Vidyanikethan Engineering College, Tirupati Abstract
More informationEffective Load Balancing for Cloud Computing using Hybrid AB Algorithm
Effective Load Balancing for Cloud Computing using Hybrid AB Algorithm 1 N. Sasikala and 2 Dr. D. Ramesh PG Scholar, Department of CSE, University College of Engineering (BIT Campus), Tiruchirappalli,
More informationMethodology for predicting the energy consumption of SPMD application on virtualized environments *
Methodology for predicting the energy consumption of SPMD application on virtualized environments * Javier Balladini, Ronal Muresano +, Remo Suppi +, Dolores Rexachs + and Emilio Luque + * Computer Engineering
More informationCombining Scalability and Efficiency for SPMD Applications on Multicore Clusters*
Combining Scalability and Efficiency for SPMD Applications on Multicore Clusters* Ronal Muresano, Dolores Rexachs and Emilio Luque Computer Architecture and Operating System Department (CAOS) Universitat
More informationEvaluation of Different Task Scheduling Policies in Multi-Core Systems with Reconfigurable Hardware
Evaluation of Different Task Scheduling Policies in Multi-Core Systems with Reconfigurable Hardware Mahyar Shahsavari, Zaid Al-Ars, Koen Bertels,1, Computer Engineering Group, Software & Computer Technology
More informationCOMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook)
COMP 422, Lecture 3: Physical Organization & Communication Costs in Parallel Machines (Sections 2.4 & 2.5 of textbook) Vivek Sarkar Department of Computer Science Rice University vsarkar@rice.edu COMP
More informationCSE 4351/5351 Notes 7: Task Scheduling & Load Balancing
CSE / Notes : Task Scheduling & Load Balancing Task Scheduling A task is a (sequential) activity that uses a set of inputs to produce a set of outputs. A task (precedence) graph is an acyclic, directed
More information14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)
Overview Kyrre Glette kyrrehg@ifi INF3490 Swarm Intelligence Particle Swarm Optimization Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) 3 Swarms in nature Fish, birds,
More informationPerformance Evaluation of Task Scheduling in Cloud Environment Using Soft Computing Algorithms
387 Performance Evaluation of Task Scheduling in Cloud Environment Using Soft Computing Algorithms 1 R. Jemina Priyadarsini, 2 Dr. L. Arockiam 1 Department of Computer science, St. Joseph s College, Trichirapalli,
More informationLOAD BALANCING IN CLOUD COMPUTING
LOAD BALANCING IN CLOUD COMPUTING Neethu M.S 1 PG Student, Dept. of Computer Science and Engineering, LBSITW (India) ABSTRACT Cloud computing is emerging as a new paradigm for manipulating, configuring,
More informationParallel Compression and Decompression of DNA Sequence Reads in FASTQ Format
, pp.91-100 http://dx.doi.org/10.14257/ijhit.2014.7.4.09 Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format Jingjing Zheng 1,* and Ting Wang 1, 2 1,* Parallel Software and Computational
More informationPERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE
PERFORMANCE ANALYSIS OF KERNEL-BASED VIRTUAL MACHINE Sudha M 1, Harish G M 2, Nandan A 3, Usha J 4 1 Department of MCA, R V College of Engineering, Bangalore : 560059, India sudha.mooki@gmail.com 2 Department
More informationCHAPTER 6 MAJOR RESULTS AND CONCLUSIONS
133 CHAPTER 6 MAJOR RESULTS AND CONCLUSIONS The proposed scheduling algorithms along with the heuristic intensive weightage factors, parameters and ß and their impact on the performance of the algorithms
More informationReliable Systolic Computing through Redundancy
Reliable Systolic Computing through Redundancy Kunio Okuda 1, Siang Wun Song 1, and Marcos Tatsuo Yamamoto 1 Universidade de São Paulo, Brazil, {kunio,song,mty}@ime.usp.br, http://www.ime.usp.br/ song/
More informationParallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications
More informationHigh Performance Computing in the Multi-core Area
High Performance Computing in the Multi-core Area Arndt Bode Technische Universität München Technology Trends for Petascale Computing Architectures: Multicore Accelerators Special Purpose Reconfigurable
More information