Ant Colony Optimization for Data Grid Replication Services
Technical Report RR-06-08. DIIS. UNIZAR.

Víctor Méndez Muñoz 1 and Felix García Carballeira 2

1 Universidad de Zaragoza, CPS, Edificio Ada Byron, María de Luna, 1. 50018 Zaragoza, Spain. vmendez@unizar.es, eureka@nodo50.org
2 Universidad Carlos III de Madrid, EPS, Edificio Sabatini, Av. de la Universidad, 30, 28911 Leganés, Madrid, Spain. fgcarbal@inf.uc3m.es

Abstract. Data Grid replication is still the critical point for improving the performance of data-intensive applications. Furthermore, data replication provides fault tolerance and load balancing. Most data replication techniques use Replica Location Services (RLS) to resolve the logical names of files to their physical locations. An example of such a service is Giggle, which can be found in the OGSA/Globus architecture. Classical algorithms also need some catalog and optimization services. For example, the EGEE Data Grid project, based on Globus open source components, implements for this purpose the Replica Optimization Service (ROS) and the Replica Metadata Catalog (RMC). Our previous work provided an enhanced Giggle framework that simplifies the location service with a flat catalogue structure and, combined with an appropriate heuristic, obtains much better performance than traditional approaches. With this aim, we propose the use of Emergent Artificial Intelligence (EAI) techniques for data replication. In previous work the authors applied Particle Swarm Optimization to data replication. In this paper we apply Ant Colony Optimization (ACO), another EAI technique, to data replication. These techniques are based on the soft flat location service described in previous work. The simulation results presented in this paper confirm that ACO over the enhanced Giggle is another Emergent Artificial Intelligence approach that improves performance compared with traditional solutions.

1 Introduction

Grid middleware and applications use local scheduling and data co-scheduling, but no global scheduling is working in real life today. While research efforts focus on these approaches, data Grid replication systems are still the critical part of the final makespan. The assumed framework for data replication, based on the OGSA[1] and Giggle[2] de facto standards, leads to greedy algorithms with complicated scaling features, as we explain in the related work section below.

In the coming years Grids are expected to grow in complexity and data size by orders of magnitude, so it is urgent to take up data replication as the critical Grid issue. Our previous work[3] proposed an enhanced Giggle framework that simplifies the location service into a flat catalog structure and, combined with a suitable heuristic, achieves much better performance than traditional approaches. The resulting location service is shown in Figure 1 and is based on the Globus middleware[4].

Fig. 1. Flat RLS architecture

The Giggle location service is joined with replica optimization and catalog services in real Globus infrastructures, for example the EGEE middleware using ROS and RMC. Our flat Giggle works with simplified versions of these services, supporting the optimization service with Emergent Artificial Intelligence (EAI) algorithms. Given a logical identifier of a file (lfn), the RLS must provide the physical locations of the replicas of that file (pfn). The RLS consists of two components: Local Replica Catalogs (LRCs), which maintain consistent information about logical-to-physical mappings on each site or node, and Replica Location Indexes (RLIs), which aggregate the LRC mappings. Virtual Organisations are usually geographical and user-affinity communities built around a big data producer, on the scale of terabytes a day, whose aim is to extract information from this read-only remote data by running jobs on the Grid. In this context, replication is used for fault tolerance as well as to provide load balancing through distributed replicas of data. Intuitively, our locally independent paradigm fits well with user and geographical affinities, as it does in similar environments such as web proxying[5][6]. The paper mentioned above[3] describes a particle swarm optimization algorithm that fills the implicit interface, using the soft replica optimization and catalog services. The present paper studies ACO, another EAI approach, as an alternative to PSO and in different scenarios.
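To make the flat location service concrete, the following is a minimal sketch of a Local Replica Catalog holding lfn-to-pfn mappings for one site. The class, method names and identifiers are illustrative assumptions, not the Globus RLS API.

class LocalReplicaCatalog:
    """Per-site catalog of logical-to-physical file name mappings (sketch)."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.mappings = {}   # lfn -> set of pfn strings

    def register_replica(self, lfn, pfn):
        # Record that a physical copy of 'lfn' exists at 'pfn' on this site.
        self.mappings.setdefault(lfn, set()).add(pfn)

    def lookup(self, lfn):
        # Resolve a logical file name to the physical replicas known locally.
        return self.mappings.get(lfn, set())

# Example: a tier-2 site registering and resolving a replica (hypothetical names).
lrc = LocalReplicaCatalog("site-2.1")
lrc.register_replica("lfn://vo/run42/block-007", "gsiftp://se.site-2.1/data/block-007")
print(lrc.lookup("lfn://vo/run42/block-007"))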

The next section of this paper describes related work on other state-of-the-art data Grid replication algorithms, and we analyse the research branches to draw some theoretical conclusions. After the related work section we explain the proposed ACO algorithm, which is the main contribution of the paper. In the fourth section we present the evaluation methodology, and in the fifth the experimental results of our improved approach to the data Grid. Finally we summarise some conclusions and future work.

2 Related Work

Chervenak et al.[2] describe performance results for five implementation approaches based on some canonical Giggle configurations. They use prototype implementations that show good scalability, but they do not include network simulation; the prototype focuses on disk throughput, whereas either the disks or the network could be the system bottleneck depending on the class of problem under study.

There are many approaches that propose an economic algorithm for replica selection, where the cost of a file transfer is evaluated as:

cost(f, i, j) = f(bandwidth_{i,j}, size_f)    (1)

The Lamehamedi and Deelman approach[7] uses both hierarchical and flat propagation graphs spanning the overall set of replicas to overlay replicas on the Data Grid and to minimise inter-replica communication cost. Starting from the hierarchical Giggle topology, they introduce a fat-tree structure with redundant interconnections between its nodes; the closer a node is to the root, the more interconnections it has. The fat-tree was originally introduced by Leiserson[8] to improve the performance of interconnection networks in parallel computing systems. Lamehamedi et al. find that the fat-tree on a ring topology suits multiple-server or peer replica applications better than the hierarchical one. As simulation framework they use a network simulator[9] that does not consider disk throughput, so the results are limited by the premise that the bottleneck is in the network. Nevertheless, they obtain a rough evaluation of network resource consumption compared with the pure hierarchical RLS.

Another economic approach[10][11] views the Grid as a market where data files are the goods. Files are purchased by Computing Elements for jobs and by Storage Elements as an investment that will improve their revenues in the future. Files are sold by Storage Elements to Computing Elements and to other Storage Elements. Computing Elements try to minimise the file purchase cost, while Storage Elements have the goal of maximising profit. When a replication decision is taken, the file transfer cost is the price of the good, as in equation (1) above. The Replica Optimiser may replicate or not depending on whether the replication (with its associated file transfer and file deletion) will reduce the expected future access cost for the local Computing Elements. The Replica Optimiser keeps track of the file requests it receives and uses an evaluation function E(f, r, n), defined in [12], that returns the predicted number of times a file f will be requested in the next n requests, based on the history of the past r requests. The prediction function E is calculated for a new file request received by the Replica Optimiser for file f, and also for every file already in the storage node. If there is no file with a lower value than that of the newly requested file f, then no replication occurs. Otherwise the least valuable file is selected for deletion and a new replica is created for f. The research group that proposes this approach also presents OptorSim[13][14], the first Grid simulator that models network costs and, to some extent, disk costs. The first version was time driven, but the second version is event driven and includes other scheduling improvements[15]. The results in [10] present some specific realistic cases where the economic model shows marked performance improvements over traditional methods.
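As a concrete illustration of this economic decision rule, here is a minimal sketch that approximates E(f, r, n) by the access count of f within the last r requests; the function names and the window-based approximation are our assumptions, not OptorSim code.

from collections import deque

def predicted_value(f, history):
    # Rough stand-in for E(f, r, n): accesses to f in the recent history window.
    return sum(1 for req in history if req == f)

def replication_decision(f_new, stored_files, history, capacity):
    # Returns the file to delete before replicating f_new, "replicate" if there
    # is free space, or None if f_new is not worth more than any stored file.
    if len(stored_files) < capacity:
        return "replicate"
    values = {f: predicted_value(f, history) for f in stored_files}
    victim = min(values, key=values.get)
    if values[victim] < predicted_value(f_new, history):
        return victim
    return None

# Example with a history window of r = 8 requests (illustrative data).
history = deque(["a", "b", "a", "c", "a", "b", "d", "a"], maxlen=8)
print(replication_decision("a", {"b", "c", "d"}, history, capacity=3))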

A Peer-to-Peer replica location service based on a distributed hash table[16] fills the Giggle framework with a Peer-to-Peer RLI (P-RLI). P-RLI uses the Chord algorithm to self-organise the P-RLI nodes and exploits the Chord overlay network to replicate the P-RLI mappings. The Chord algorithm also adaptively routes the P-RLI logical names to the LRC sites. The replication of mappings provides a high level of reliability in the P-RLI, and the consistency is stronger than with simple RLI nodes. The P-RLS performance is tested on a 16-node cluster, scaling with the network size. It is also tested with a simulation of a larger network of P-RLI nodes, evaluating the distribution of mappings in the P-RLS network. The simulation in this test is not a complete simulation of the P-RLS system; rather, it focuses on how keys are mapped to the P-RLI nodes and how queries for mappings are resolved in the network.

Other approaches focus on the impact of data replication management on job scheduling behaviour[17][18], both works by Kavitha Ranganathan et al. The first, with Ian Foster in 2003, explains the importance of evaluating scheduling and Data Grid replication algorithms simultaneously, rather than with isolated scheduling evaluations. They also state that a scalable Grid simulation tool must avoid using one thread per job request, and propose ChicSim. The second work, with Thomas Phan in 2005, deals with co-scheduling of job assignment and data replication in depth, and proposes a genetic algorithm (GA) to find the best combination of job scheduling and data replication variables. The algorithms proposed in this approach focus on the combination of computing and data Grid, but the proposed data replication strategies are the canonical ones, plus some very simplified ones that other state-of-the-art approaches have already improved upon. Furthermore, nowadays the bottleneck is in data replication, which can be studied separately, unlike scheduling, which may be combined with data replication.

There are many other approaches today with similar methods and similar performance to the state of the art above. Another decentralised adaptive replication mechanism[19] organises nodes into an overlay network and distributes location information, but does not route requests. Each node that participates in the distribution network builds, over time, a view of the whole system and can answer queries locally without forwarding requests. Unfortunately this is not common for large-scale scientific datasets, which account for most of the operational Grid infrastructures.

3 The Ant Colony Optimization algorithm

ACO is a branch of EAI. Dorigo, Maniezzo and Colorni proposed systems based on ant behaviour at the beginning of the 1990s[20]. After that, ant systems were algorithmically formulated for optimization problems such as the travelling salesman problem and others[21][22]. Ants are social beings with highly structured colonies based on very simple individual behaviour. For computational purposes, the relevant aspect is the way they find paths between food sources and the anthill. While walking, ants deposit some amount of pheromone on the ground. Ants smell pheromone and, when choosing their way, they tend in probability towards the paths marked with stronger pheromone concentrations. As time passes, the pheromone concentration decreases. By repeating the same behaviour they compose optimised trails, dynamically defined, which they use to find food sources and their nest. The historic algorithm was formulated by Dorigo for the travelling salesman problem[23]. This environment is very similar to the Grid and can be used in a very direct way, as we show below, following the algorithm:

Initialise
Loop /* An iteration */
    Each ant is positioned on a starting node.
    Loop /* A step */
        Each ant applies a state transition rule to incrementally build a solution
        and a local pheromone updating rule.
    Until all ants have built a complete solution
    A global pheromone updating rule is applied.
Until End condition.

Each edge between nodes (r, s) has an associated distance or cost δ(r, s) and a pheromone concentration τ(r, s). Equation (2) is the state transition rule, a probabilistic function over each node u not yet visited by the ant placed on node r:

p_k(r, s) = [τ(r, s)] [η(r, s)]^β / Σ_u [τ(r, u)] [η(r, u)]^β    (2)

Here η(r, s) is the inverse of the distance or cost δ(r, s). The parameter β determines the relative weight of the pheromone concentration with respect to the distance or cost δ(r, s), and always β > 0. The pheromone update of equation (3) is applied to every edge of the system as the global pheromone updating rule:

τ(r, s) ← (1 − α) τ(r, s) + Σ_k Δτ_k(r, s)    (3)

where α is the pheromone evaporation factor, between 0 and 1, and Δτ_k(r, s) is the inverse of the distance or cost of the tour built by ant k if (r, s) belongs to its path, and 0 if it does not.
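The two rules can be read directly as code. The following is a minimal sketch under the assumption of a symmetric edge table indexed by node pairs; variable and function names are ours, not from any particular ACO library.

import random

def choose_next(r, unvisited, tau, delta, beta=2.0):
    # Equation (2): pick s with probability proportional to tau(r,s) * eta(r,s)^beta,
    # where eta(r,s) = 1 / delta(r,s).
    weights = [tau[(r, u)] * (1.0 / delta[(r, u)]) ** beta for u in unvisited]
    return random.choices(unvisited, weights=weights, k=1)[0]

def global_update(tau, tours, alpha=0.1):
    # Equation (3): evaporate every edge, then add 1/cost on the edges of each ant's tour.
    for edge in tau:
        tau[edge] *= (1.0 - alpha)
    for edges, tour_cost in tours:
        for edge in edges:
            tau[edge] += 1.0 / tour_cost

# Tiny example: three nodes, one ant stepping away from node 0.
delta = {(0, 1): 2.0, (0, 2): 5.0, (1, 2): 1.0}
delta.update({(b, a): d for (a, b), d in delta.items()})   # make distances symmetric
tau = {edge: 1.0 for edge in delta}
s = choose_next(0, [1, 2], tau, delta)
global_update(tau, [([(0, s)], delta[(0, s)])])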

The ACO-Grid flavour, modified from the original ACO, is as follows. Every Grid request is an ant; when it finds its file object, the ant dies. The routing of Grid replies is done with traditional methods. ACO-Grid does not use global updating: every time a request is processed on a Grid site, τ is updated for all of that site's connections. With this modification we avoid periodic discrete-time synchronisation between nodes, so we do not need any control traffic between sites. On the other hand, α may be tuned for less evaporation: in a site with n connections, n − 1 of them will be evaporating while only one increases its pheromone. For our study case below we use α = 0.01. The Grid distance or cost is defined in equation (4) as a function of the network latency lt and bandwidth bw:

δ(r, s) = lt_{r,s} · c1 + (MAX_BW − bw_{r,s}) · c2    (4)

In equation (4), c1 and c2 are coefficients that balance the relative relevance of latency and bandwidth; they should fit the bandwidth and latency values of the specific Grid infrastructure and the relationship between their units of measure (ms and MB/s). For our case, c1 = 1 and c2 = 0.2. Ultimately, latency is more important than bandwidth, because latency is essentially constant, while bandwidth varies depending on socket allocations and the number of network requests at a given moment. MAX_BW is the highest bandwidth in the whole Grid infrastructure.
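A minimal sketch of these ACO-Grid rules follows, using the values given above (α = 0.01, c1 = 1, c2 = 0.2). The helper names, the example link data, the β setting and one reading of the per-site update (evaporate every connection, then reinforce the one just used) are our assumptions rather than the exact SiCoGrid implementation.

import random

C1, C2, ALPHA, BETA = 1.0, 0.2, 0.01, 2.0   # c1, c2 and alpha from the text; BETA is assumed

def grid_cost(latency_ms, bandwidth_mbs, max_bw_mbs):
    # Equation (4): delta(r, s) = lt * c1 + (MAX_BW - bw) * c2.
    return latency_ms * C1 + (max_bw_mbs - bandwidth_mbs) * C2

def choose_link(site, links, tau, delta):
    # Request-as-ant step: probabilistic choice among the site's connections,
    # following the equation (2) form.
    candidates = links[site]
    weights = [tau[(site, l)] * (1.0 / delta[(site, l)]) ** BETA for l in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

def local_update(site, chosen, links, tau):
    # Local rule: when the site processes a request, evaporate all its connections
    # and reinforce only the connection that was used; no global synchronisation.
    for l in links[site]:
        tau[(site, l)] *= (1.0 - ALPHA)
    tau[(site, chosen)] += 1.0

# Example: a tier-2 site connected to two tier-1 sites over different links.
links = {"2.1": ["1.1", "1.2"]}
delta = {("2.1", "1.1"): grid_cost(20, 16, 102.4),
         ("2.1", "1.2"): grid_cost(50, 102.4, 102.4)}
tau = {edge: 1.0 for edge in delta}
target = choose_link("2.1", links, tau, delta)
local_update("2.1", target, links, tau)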

4 Evaluation Methodology

We have implemented a full simulation toolkit, SiCoGrid, a complete Grid simulation environment described in our previous work[3]. SiCoGrid is developed in Parsec[24], a combination of C and a simulation parser for creating event-driven simulators, and it also uses DiskSim[25] for the storage disk simulation subsystem.

Table 1. Scaled Grid Stage

Tier Class   Real MB/s   Scaled MB/s (/20)   Real TB   Scaled TB (/20)
1            2048        102.4               220       11
2            320         16                  100       5
3            10          0.5                 20        1

We have configured SiCoGrid for a common Grid site class[26], shown in Table 1. This is the typical CERN Data Grid specification for the node tier classes of a Grid infrastructure serving different Virtual Organizations. The storage capacity, file size, and network bandwidth are scaled down by a factor of twenty for simulation-time reasons; the obtained time results are therefore scaled in the same proportion.

Figure 2 shows the network infrastructure used in our experiment. In the graph nomenclature, each node is labelled with a first number giving its tier class and, after the dot, an identification number; below each node is its storage size in TB. Each network link is labelled with two numbers: the latency in ms and the bandwidth in MB/s.

Fig. 2. Simulated Grid Stage

The experiment workloads are log files created with a specific application of the SiCoGrid toolkit, and they depend on the following: the number of Grid clients N on each site of Figure 2, and the number of jobs M that a client executes in a simulation (the experiment length). Every job issues many block file requests. We use a Gaussian random walk file access pattern, which is the best performing pattern for the state-of-the-art economic OptorSim approach[13]. We have designed an experiment with four levels for the N and M factors: 4x4, 4x5, 4x6, 4x7, 5x4, 5x5, ..., 7x6, 7x7. We repeat it for the unconditional, market model and ACO algorithms, and we run three statistical repetitions of each experiment instantiation.
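The workload generation can be pictured with a minimal sketch of the Gaussian random walk access pattern for a single client; the number of files, requests per job and step deviation below are assumed values for illustration, not the SiCoGrid defaults.

import random

def gaussian_walk_trace(num_files=1000, jobs=5, requests_per_job=50, sigma=10.0, seed=1):
    # Block file requests wander from the previous index by a Gaussian step,
    # wrapped into the valid file index range.
    rng = random.Random(seed)
    current = rng.randrange(num_files)
    trace = []
    for _ in range(jobs):
        for _ in range(requests_per_job):
            current = int(current + rng.gauss(0, sigma)) % num_files
            trace.append(current)
    return trace

print(gaussian_walk_trace()[:10])   # first requests of one simulated client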

5 Simulation Results

We now present the Grid simulation results based on the stage described above. Job response times are scaled by a factor of 20 with respect to the usual job durations of hours to several days.

Fig. 3. Experiment Results in the Simulated Grid Stage

Figure 3 shows the performance of the unconditional algorithm at the top of the chart, as a black line, included for comparison as a canonical location and selection algorithm in which all block files are unconditionally requested from the original file producer, with Least Recently Used (LRU) as the deletion algorithm. The bottom lines are the market model in dark gray and ACO in light gray. The market model is the OptorSim optimization and deletion algorithm; our ACO approach uses LRU for deletion. The results are very similar for the two compared state-of-the-art algorithms. ACO improves performance in half of the simulation series, for 4x4, 4x7, 6x5, 6x6, 6x7, 7x5, 7x6, and 7x7, which are the heavy workloads. Thus ACO is intuitively expected to perform better on bigger Grid infrastructures. Furthermore, comparing the canonical line with the others, we can see that ACO follows the unconditional pattern at lower response times. The market model starts by following this pattern, but in the heavy workload experiments, on the right part of the chart, it loses the pattern and quickly tends to join the unconditional algorithm line.

The simulations do not consider fault tolerance issues. In a real infrastructure, Grid failures are important for the final job response time. Considering fault events, ACO may show even better results compared with the market model and other approaches that exploit functional dependencies between Grid sites. The reason for the better ACO behaviour under faults is that our approach does not need any restore time: after a resource failure, the whole system keeps working the same as before the failure. In ACO the control services are distributed flatly over the Grid, and all the sites are functionally independent of each other.

ACO-Grid achieves this performance thanks to its features: no control traffic; distributed optimization, location and selection services; autonomous management of each node, which fits best with user and geographical affinities; and a collaborative strategy, as opposed to the competitive strategy of the economic model, which usually performs better in the long term.

6 Conclusions and Future Work

We have described a relevant contribution to the Data Grid corpus. The specific ACO algorithm has been shown to give better job response times and better scalability than traditional approaches, with no control traffic between Grid sites. It has been evaluated using SiCoGrid, a full network and disk simulation.

We have an open research line on cyclic graph Grid infrastructure simulations. On cyclic graphs ACO has been shown to be the top state-of-the-art algorithm for problems such as the travelling salesman problem, which are very similar to a Grid with a cyclic infrastructure, so we look forward to the results of these experiments. We also have research targets on fault tolerance issues and in-depth variable correlation studies.

7 Acknowledgements

This work has been supported by the Spanish Ministry of Education and Science under the TIN2004-02156 contract.

References

1. Foster, I., Kesselman, C., Nick, J.M., Tuecke, S.: The physiology of the Grid: an open Grid services architecture for distributed system integration. Technical report, Globus Project Draft Overview Paper (2002)
2. Chervenak, A.L., Deelman, E., Foster, I., Iamnitchi, A., Kesselman, C., Hoschek, W., Kunszt, P., Ripeanu, M., Schwartzkopf, B., Stockinger, H., Stockinger, K., Tierney, B.: Giggle: A framework for constructing scalable replica location services. In: Proc. of the IEEE Supercomputing Conference (SC 2002), IEEE Computer Society Press (November 2002)
3. Méndez, V., García, F.: PSO-LRU algorithm for DataGrid replication service. In: Proceedings of the 2006 International Workshop on High-Performance Data Management in Grid Environments, Springer-Verlag (2006)
4. Foster, I., Kesselman, C.: Globus: A metacomputing infrastructure toolkit. IJSA 11(2) (1997) 115-128
5. Touch, J., Hughes, A.S.: LSAM proxy cache: a multicast distributed virtual cache. Computer Networks and ISDN Systems (1998)
6. Bassali, H.S., Kamath, K.M., Hosamani, R.B., Gao, L.: Hierarchy-aware algorithms for CDN proxy placement in the Internet. Computer Communications (2002)
7. Lamehamedi, H., Szymanski, B., Shentu, Z., Deelman, E.: Data replication strategies in Grid environments. In: Proceedings of the Fifth International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2002) (2002)

8. Leiserson, C.E.: Fat-trees: Universal networks for hardware-efficient supercomputing. IEEE Transactions on Computers C-34(10) (1985) 892-901
9. http://www-mash.cs.berkeley.edu/ns: ns network simulator (1989)
10. Bell, W.H., Cameron, D.G., Capozza, L., Millar, A.P., Stockinger, K., Zini, F.: Simulation of dynamic Grid replication strategies in OptorSim. In: Proc. of the ACM/IEEE Workshop on Grid Computing (Grid 2002), Springer-Verlag (November 2002)
11. Cameron, D.G., Carvajal-Schiaffino, R., Millar, A.P., Nicholson, C., Stockinger, K., Zini, F.: Evaluating scheduling and replica optimisation strategies in OptorSim. In: International Workshop on Grid Computing (Grid 2003), IEEE Computer Society Press (November 2003)
12. Capozza, L., Stockinger, K., Zini, F.: Preliminary evaluation of revenue prediction functions for economically-effective file replication. Technical report, DataGrid-02-TED-020724, Geneva, Switzerland (July 2002)
13. Bell, W.H., Cameron, D.G., Capozza, L., Millar, A.P., Stockinger, K., Zini, F.: OptorSim - a Grid simulator for studying dynamic data replication strategies. International Journal of High Performance Computing Applications 17(4) (2003)
14. Cameron, D.G., Carvajal-Schiaffino, R., Millar, A.P., Nicholson, C., Stockinger, K., Zini, F.: Analysis of scheduling and replica optimisation strategies for data grids using OptorSim. International Journal of Grid Computing 2(1) (2004) 57-69
15. Cameron, D.G., Carvajal-Schiaffino, R., Millar, A.P., Nicholson, C., Stockinger, K., Zini, F.: OptorSim: A simulation tool for scheduling and replica optimisation in data grids. In: International Conference for Computing in High Energy and Nuclear Physics (CHEP 2004), Interlaken (September 2004)
16. Cai, M., Chervenak, A., Frank, M.: A peer-to-peer replica location service based on a distributed hash table. In: Proceedings of the High Performance Computing, Networking and Storage Conference, SC Global (2004)
17. Ranganathan, K., Foster, I.: Simulation studies of computation and data scheduling algorithms for data grids. Journal of Grid Computing 1(1) (2003)
18. Phan, T., Ranganathan, K., Sion, R.: Evolving toward the perfect schedule: Co-scheduling job assignments and data replication in wide-area systems using a genetic algorithm. In: JSSPP (2005) 173-193
19. Ripeanu, M., Foster, I.: A decentralized, adaptive replica location mechanism. In: 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11) (2002)
20. Dorigo, M., Maniezzo, V., Colorni, A.: The ant system: An autocatalytic optimizing process. Technical report no. 91-016 revised, Politecnico di Milano (1991)
21. Dorigo, M., Maniezzo, V., Colorni, A.: The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B 26(1) (1996) 1-13
22. Dorigo, M., Gambardella, L.M.: Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation 1(1) (1997)
23. Dorigo, M.: Optimization, learning and natural algorithms. PhD thesis, Politecnico di Milano, Italy (1992)
24. Leijen, D.: Parsec, a fast combinator parser. Technical report, Computer Science Department, University of Utrecht (2002)
25. Ganger, G.R., Worthington, B.L., Patt, Y.N., eds.: The DiskSim Simulation Environment. Version 2.0 Reference Manual. University of Michigan (1999)

26. Ranganathan, K., Foster, I.: Identifying dynamic replication strategies for a high-performance data grid. Technical report, Department of Computer Science, The University of Chicago (2000)