MULTI-OBJECTIVE OPTIMIZATION USING PARALLEL COMPUTATIONS

Ausra Mackute-Varoneckiene, Antanas Zilinskas
Institute of Mathematics and Informatics, Akademijos str. 4, LT-08663 Vilnius, Lithuania, ausra.mackute@gmail.com, antanasz@ktl.mii.lt

Abstract. Applied optimization in process design or optimal control is frequently a multi-criteria problem. There are several approaches to the solution of multi-criteria problems, but the main approach is oriented towards constructing the Pareto set of the considered problem. In the case of nonlinear models this problem is difficult, especially if the objective functions are multimodal. The application of parallel methods to the solution of this problem is therefore of current interest. In the present paper the efficiency of several evolutionary algorithms is compared, taking into account their parallelization using the multi-start model and their implementation perspectives on computational grids.

Keywords: multi-objective optimization, parallel computations, computational grids.

1 Introduction

In many real world applications global optimization methods are applied to optimize multiple criteria. We consider the multi-objective optimization problem defined as follows [11]:

  min F(x) = (f_1(x), ..., f_M(x)), subject to x ∈ D,

where F is a vector of M objective functions and M ≥ 2. Decision vectors x_i = (x_{i1}, ..., x_{in}), i = 1, ..., N, where N is the population size, belong to the feasible region D. A decision vector is Pareto optimal if there is no other solution which would improve some criterion f_m, m = 1, ..., M, without worsening at least one other criterion. A solution x dominates y (x ≻ y) when x is better than y in at least one objective and not worse in the others: f_m(x) ≤ f_m(y) for all m, and f_m(x) < f_m(y) for at least one m. Two solutions are indifferent or incomparable (x ∼ y) if neither x dominates y, nor y dominates x. The set of non-dominated solutions is called the Pareto optimal set. As all the objectives are equally important, the aim of multi-objective optimization methods is not to find a single optimal solution but a set of trade-off solutions that provide a reasonable approximation of the true Pareto set.
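The dominance relation defined above can be sketched in a short Python routine for the minimization case; the function names are illustrative only, not part of any algorithm discussed in the paper:

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]
```

For example, `non_dominated([(1, 2), (2, 1), (2, 2)])` keeps `(1, 2)` and `(2, 1)`, which are mutually incomparable, and discards `(2, 2)`, which is dominated by `(1, 2)`.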
The objective vectors of the Pareto optimal solutions constitute the Pareto front. Multi-criteria problems may be solved by combining the multiple criteria into one scalar objective whose minimizer is a Pareto optimal point for the original multi-objective problem, but such methods do not guarantee a good approximation of the Pareto front for problems with complicated Pareto sets. In the last decade there has been a growing interest in evolutionary multi-objective optimization [11], [14]; however, the computation time needed for multi-objective optimization is usually large in real world problems. In this paper, we suggest a parallel multi-objective evolutionary approach based on two well known multi-objective evolutionary optimization algorithms: the non-dominated sorting genetic algorithm II [6] and the strength Pareto evolutionary algorithm II [15]. The efficiency of the proposed parallel algorithm versions is demonstrated on several multi-objective test problems.

2 Multi-objective optimization methods

2.1 NSGA II

The non-dominated sorting genetic algorithm II (NSGAII) [6] is a popular multi-objective evolutionary algorithm which improves the original NSGA proposed by Srinivas and Deb [12]. The original non-dominated sorting genetic algorithm is similar to a simple genetic algorithm except for the classification into non-dominated fronts and the sharing operation. NSGAII includes several main modules: fast non-dominated sorting and diversity preservation [6]. In fast non-dominated sorting, every solution from the population is checked against a partially filled new set for domination. The first member of the population is placed in the new set, and each subsequent solution is compared with all members of the new set. If a solution dominates any member of the new set, that member is removed from the new set; otherwise, if the solution is dominated by any member of the new set, the solution is ignored.
If a solution is not dominated by any member of the new set, it is entered into the new set. After all solutions of the population have been checked in this way, the members of the new set constitute the non-dominated set. The sharing function approach of the original NSGA causes some difficulties: the obtained solutions depend largely on the chosen sharing parameter value, and the sharing approach makes the overall complexity of the original NSGA O(nk^3), where k is the population size and n the number of objectives. In NSGAII the sharing function approach is therefore replaced by a crowded comparison approach. The crowded comparison operator selects, between two solutions, the one with the better rank; if the solutions belong to the same front, it selects the solution located in the less crowded region. The density of solutions surrounding a particular solution in the population is estimated as the average distance of the two points on either side of that point along each of the objectives. The overall complexity of NSGAII is O(nk^2), where k is the population size and n is the number of objectives.

2.2 SPEA II

The strength Pareto evolutionary algorithm II (SPEA2) [15] is another popular multi-objective evolutionary algorithm, based on its predecessor SPEA [16]. SPEA operates with two sets: a regular population and an archive of best solutions. In every generation the archive is updated by combining all non-dominated solutions of the regular population with the archived solutions; in this updating process all dominated or duplicate solutions are removed from the archive. In each generation new fitness values are assigned [17] to both the population and the archive. Solutions with lower fitness values are given more importance, and solutions from the archive are more likely to be chosen as a result. The final offspring population replaces the original population to form the population for the next generation. SPEA2 improves SPEA in three respects. First, an improved fitness assignment scheme takes into account, for each individual, how many individuals it dominates and by how many it is dominated; this improves decision making between solutions of equal fitness. Second, a nearest neighbour density estimation technique is incorporated, which allows a more precise guidance of the search process; this prevents the situation where a single archived solution causes all population solutions to have the same fitness value.
Third, in SPEA2 only solutions from the archive are used for the selection of parents for reproduction.

3 Parallelization paradigms

The main goals of parallelizing multi-objective evolutionary algorithms are usually the acceleration of the computation, a gain in solution quality with respect to the approximation of the true Pareto front, and the diversity of the solutions. Parallel multi-objective evolutionary algorithm models are classified into main streams using different architectures [4]. The master-slave model keeps the sequentiality of the original algorithm: the master keeps the population and manages the selection and replacement steps; it sends sub-populations to the slaves, which execute the recombination and evaluation tasks and return the new solutions to the master. This approach is efficient when the cost of generating and evaluating new solutions is high. In the island model the population consists of several sub-populations distributed among different processors. Each processor is responsible for the evolution of one sub-population and executes all the steps of its own multi-objective algorithm on a processing node. The goal is to obtain solutions of at least the same quality as the master-slave model, but in a shorter runtime. In some cases the island model improves the quality of the solutions reached by the master-slave model, because the islands help to maintain diversity [5]. The diffusion model works with a single population, with each processor responsible for a single individual or at most a part of the population. The difference with respect to the island paradigm is that the diffusion-based scheme requires a neighbourhood structure of processors to perform the recombination and selection.
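The master-slave model described above can be illustrated by a minimal sketch (an assumption for illustration, not the implementation used in this paper): a master process keeps the population and farms out only the costly objective evaluations to slave processes. The two-objective function `evaluate` is a hypothetical stand-in for an expensive model run.

```python
from multiprocessing import Pool

def evaluate(x):
    """Hypothetical two-objective evaluation, standing in for a costly model run."""
    return (sum(v * v for v in x), sum((v - 1.0) ** 2 for v in x))

def master_slave_evaluate(population, workers=4):
    """Master keeps the population and sends individuals to slave processes,
    which return objective vectors; selection and replacement stay sequential."""
    with Pool(workers) as pool:
        return pool.map(evaluate, population)
```

Calling `master_slave_evaluate(pop)` returns one objective vector per individual, in order, so the sequential selection and replacement steps of the original algorithm are unchanged.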
Two additional architectural streams are distinguished according to whether the model manages a single solution or a population of solutions [3], [2]. The parallel neighbourhood model divides the neighbourhood into partitions that are explored in parallel; it is particularly interesting when the evaluation of each solution is costly and/or when the neighbourhood is large. The multi-start model consists of executing several local searches in parallel, without any information exchange. This model tries to improve the quality of the search by taking advantage of the diversity provided by each independent run (possibly using different parameters). In this work we focus on the parallelization of multi-objective evolutionary algorithms using the multi-start model, in order to achieve better diversity of solutions in acceptable computation time. The experimental investigation was carried out on the infrastructure of LitGrid, which is based on the gLite middleware. The Workload Management System (WMS) in gLite [http://glite.web.cern.ch/glite/] is responsible for distributing and managing jobs across the computational resources available on the grid. Jobs in the gLite WMS are
defined using the Job Description Language (JDL). Parallel versions of the described algorithms were implemented using the collection type of jobs. A collection job describes a set of independent jobs sharing the same requirements. When a collection job is submitted, the WMS takes responsibility for finding appropriate resources matching the job requirements. When all the jobs of the collection are done, the outputs are returned to the user.

4 Performance assessment

In order to understand which of the algorithms is better, and in what respects, the algorithms are compared using performance metrics which evaluate the closeness of the obtained non-dominated set to the Pareto optimal front, the distribution of solutions within the set, and the spread of the obtained non-dominated front [7], [18].

Metrics evaluating closeness to the Pareto optimal front:

ER, the error rate [13], is the proportion of solutions that are not in the Pareto optimal front: ER = (Σ_{i=1}^{N} e_i) / N, where N is the size of the obtained set and e_i = 1 if solution i is not in the Pareto set, e_i = 0 otherwise.

GD, the generational distance [13], is the average distance from the set of solutions to the Pareto set: GD = (Σ_{i=1}^{N} d_i) / N, where d_i is the Euclidean distance from solution i to the nearest solution in the Pareto set.

Metric evaluating diversity among the obtained non-dominated solutions:

The spacing method [9]: S = sqrt( (1/(N-1)) Σ_{i=1}^{N} (d̄ - d_i)^2 ), where d_i = min_{j≠i} Σ_{m=1}^{M} |f_m^i - f_m^j|, f_m is the m-th objective function, N is the population size, M is the number of objectives, and d̄ is the mean of the d_i. The interpretation of this metric is that the smaller the value of S, the better the distribution of the set.

Metric evaluating both closeness and diversity:

The hyper-volume [10] is the volume of the objective space covered by the obtained solutions. Of two sets of solutions, the one with the greater hyper-volume value is the better.

5 Test problems

Usually, two-criteria test problems are used for the comparison of multi-objective evolutionary algorithms.
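The GD and spacing metrics of Section 4 can be sketched as follows (illustrative code under the definitions above, not the implementation used in the experiments):

```python
import math

def generational_distance(front, pareto):
    """Average Euclidean distance from each obtained solution to its
    nearest point in the reference Pareto set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sum(min(dist(p, q) for q in pareto) for p in front) / len(front)

def spacing(front):
    """Schott's spacing: standard deviation of nearest-neighbour distances
    (Manhattan metric) within the obtained set; smaller is better."""
    d = [min(sum(abs(x - y) for x, y in zip(p, q))
             for q in front if q is not p) for p in front]
    mean = sum(d) / len(d)
    return math.sqrt(sum((mean - di) ** 2 for di in d) / (len(d) - 1))
```

A perfectly converged and evenly spread front has GD = 0 against the true Pareto set and S = 0 within itself.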
The main goal of this work is to compare the performance of two parallel multi-objective evolutionary algorithms on several test problems with three criteria. The test functions were taken from the literature.

LZ07_F6 [8]:

  min f_1(x) = cos(0.5 x_1 π) cos(0.5 x_2 π) + (2/|J_1|) Σ_{j ∈ J_1} (x_j − 2 x_2 sin(2π x_1 + jπ/n))^2,
  min f_2(x) = cos(0.5 x_1 π) sin(0.5 x_2 π) + (2/|J_2|) Σ_{j ∈ J_2} (x_j − 2 x_2 sin(2π x_1 + jπ/n))^2,
  min f_3(x) = sin(0.5 x_1 π) + (2/|J_3|) Σ_{j ∈ J_3} (x_j − 2 x_2 sin(2π x_1 + jπ/n))^2,

here J_1 = {j | 3 ≤ j ≤ n, j − 1 is a multiple of 3}, J_2 = {j | 3 ≤ j ≤ n, j − 2 is a multiple of 3}, J_3 = {j | 3 ≤ j ≤ n, j is a multiple of 3}, 0 ≤ x_1, x_2 ≤ 1, −2 ≤ x_j ≤ 2 for j = 3, ..., n, n = 10.

DTLZ2 [1]:

  min f_1(x) = (1 + g(x)) cos(x_1 π/2) cos(x_2 π/2),
  min f_2(x) = (1 + g(x)) cos(x_1 π/2) sin(x_2 π/2),
  min f_3(x) = (1 + g(x)) sin(x_1 π/2),

here g(x) = Σ_{i=3}^{n} (x_i − 0.5)^2, 0 ≤ x_i ≤ 1, i = 1, ..., n, n = 12.
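The three-objective DTLZ2 problem above translates directly into code; the following is an illustrative sketch (the function name is ours):

```python
import math

def dtlz2_3obj(x):
    """Three-objective DTLZ2 as defined above: decision vector x has n
    components in [0, 1]; on the Pareto front f1^2 + f2^2 + f3^2 = 1."""
    g = sum((xi - 0.5) ** 2 for xi in x[2:])
    f1 = (1 + g) * math.cos(x[0] * math.pi / 2) * math.cos(x[1] * math.pi / 2)
    f2 = (1 + g) * math.cos(x[0] * math.pi / 2) * math.sin(x[1] * math.pi / 2)
    f3 = (1 + g) * math.sin(x[0] * math.pi / 2)
    return (f1, f2, f3)
```

Setting all variables to 0.5 makes g = 0, so the objective vector lies exactly on the spherical Pareto front, which is a convenient sanity check.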
6 Empirical results

The efficiency of the parallel evolutionary multi-objective optimization algorithms was evaluated according to four criteria: ER (error rate), S (spacing), HV (hyper-volume), and GD (generational distance). First we evaluated the non-parallel versions of NSGAII and SPEA2. Experiments were performed with various population sizes and numbers of generations. Table 1 presents the results for the multi-criteria test problems. The values of the criteria show that the population size has the greatest influence on the quality of the obtained Pareto front: the bigger the population size, the better the approximated Pareto front. Increasing the number of generations alone does not improve the optimization results. The parallel versions of the algorithms were investigated in two ways. First, we set the population size to 200 and the number of generations to 5000 and ran the parallel algorithms on 5, 10 and 15 processors. The results are presented in Table 2.

Table 1. Performance assessment of non-parallel versions of algorithms

                              LZ07_F6                           DTLZ2
Algorithm            ER      S       HV      GD       ER      S       HV      GD
NSGAII  Pop: 1000    0.7400  0.0269  0.9083  0.1105   0.3870  0.0169  0.4506  0.0185
SPEA2   Gen: 5000    0.5960  0.0936  0.9747  0.0111   0.2850  0.0073  0.4716  0.0174
NSGAII  Pop: 1000    0.8135  0.0262  0.9408  0.0126   0.4220  0.0164  0.4445  0.0180
SPEA2   Gen: 500     0.7980  0.0677  0.9437  0.0098   0.2980  0.0070  0.4610  0.0178
NSGAII  Pop: 500     0.9300  0.1417  0.9577  0.1066   0.4380  0.0229  0.4299  0.0255
SPEA2   Gen: 5000    0.7340  0.1204  0.9581  0.0231   0.3480  0.0098  0.4818  0.0253
NSGAII  Pop: 500     0.8280  0.1617  0.9501  0.8280   0.4680  0.0243  0.4351  0.0257
SPEA2   Gen: 3000    0.7640  0.2073  0.9679  0.0293   0.3620  0.0097  0.4495  0.0252
NSGAII  Pop: 500     0.8380  0.0373  0.8803  0.0200   0.5360  0.0259  0.4463  0.0257
SPEA2   Gen: 1000    0.8900  0.0752  0.8846  0.0164   0.4180  0.0091  0.4601  0.0254
NSGAII  Pop: 500     0.7140  0.0590  0.9051  0.0284   0.4960  0.0250  0.4417  0.0260
SPEA2   Gen: 500     0.8680  0.1031  0.9004  0.0145   0.4020  0.0100  0.5037  0.0249

Table 2.
Performance assessment of parallel versions of algorithms

                                        LZ07_F6                           DTLZ2
Algorithm                      ER      S       HV      GD       ER      S       HV      GD
NSGAII  Pop: 200, Gen: 5000    0.9850  0.0796  0.9674  0.1136   0.5670  0.0170  0.4768  0.0183
SPEA2   Proc: 5                0.9210  0.3666  0.9814  0.0194   0.4670  0.0147  0.5156  0.0178
NSGAII  Pop: 200, Gen: 5000    0.9550  0.0780  0.9872  0.0947   0.5830  0.0122  0.4975  0.0130
SPEA2   Proc: 10               0.9065  0.1614  0.9978  0.0250   0.4930  0.0117  0.5072  0.0127
NSGAII  Pop: 200, Gen: 5000    0.9380  0.0546  0.9960  0.0583   0.5837  0.0100  0.5276  0.0106
SPEA2   Proc: 15               0.9160  0.0933  0.9981  0.0142   0.5020  0.0098  0.5660  0.0104
NSGAII  Pop: 1000, Gen: 500    0.7863  0.0358  0.9800  0.0036   0.4166  0.0073  0.5162  0.0081
SPEA2   Proc: 5                0.7808  0.0353  0.9885  0.0670   0.4190  0.0061  0.5167  0.0076
NSGAII  Pop: 1000, Gen: 500    0.8141  0.0240  0.9957  0.0044   0.4376  0.0056  0.5120  0.0060
SPEA2   Proc: 10               0.8164  0.0284  0.9930  0.0035   0.4483  0.0051  0.5191  0.0057
NSGAII  Pop: 1000, Gen: 500    0.8387  0.0246  0.9933  0.0025   0.4553  0.0045  0.5181  0.0046
SPEA2   Proc: 15               0.8164  0.0287  0.9930  0.0037   0.4497  0.0044  0.5217  0.0045
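The multi-start parallelization used in the experiments can be illustrated by the following sketch, in which `run_ea` is a hypothetical stand-in for one independent evolutionary run: several searches execute in parallel with no information exchange, and their results are merged into one non-dominated set.

```python
import random
from multiprocessing import Pool

def dominates(a, b):
    """Minimization dominance: a no worse everywhere, strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def run_ea(seed):
    """Hypothetical stand-in for one independent EA run: returns the final
    population's objective vectors for the given random seed."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(50)]

def multi_start(seeds, workers=4):
    """Execute independent searches in parallel (no information exchange),
    then merge all outputs and keep only the non-dominated solutions."""
    with Pool(workers) as pool:
        fronts = pool.map(run_ea, seeds)
    merged = [p for front in fronts for p in front]
    return [p for p in merged
            if not any(dominates(q, p) for q in merged if q is not p)]
```

Because the runs never communicate, this maps directly onto independent grid jobs such as the gLite job collections described in Section 3, with the merge performed once all outputs are returned.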
The obtained results are similar to those of the non-parallel versions of the algorithms. However, running the non-parallel algorithms with these parameters is time consuming, and the parallel versions improve performance with respect to computation time. For example, a non-parallel run with population size 1000 and 5000 generations takes 35135 s, while a parallel run with population size 200, 5000 generations and 5 processors takes 5592 s. This is important because similar results are obtained in a shorter time. The second experiment was performed to verify the hypothesis that using a larger population size and a smaller number of generations on several processors improves the Pareto front approximation. For this investigation we used population size 1000, number of generations 500, and 5, 10 and 15 processors. The results are presented in Table 2. Compared with the results obtained with the non-parallel algorithms, the Pareto approximations of the parallel algorithms are better; the best approximations are obtained with the parallel version of SPEA2. This can be observed in Figures 1-8, which present the Pareto front approximations obtained with the non-parallel and parallel versions of the algorithms for various algorithm parameters.

Figure 1. Pareto front approximation for the LZ07_F6 problem, non-parallel SPEA2 version; the true Pareto front is also shown.
Figure 2. Pareto front approximation for the LZ07_F6 problem, parallel SPEA2 version (pop. size 1000); the true Pareto front is also shown.
Figure 3. Pareto front approximation for the LZ07_F6 problem, non-parallel NSGAII version; the true Pareto front is also shown.
Figure 4. Pareto front approximation for the LZ07_F6 problem, parallel NSGAII version (pop. size 1000); the true Pareto front is also shown.
Figure 5. Pareto front approximation for the DTLZ2 problem, non-parallel SPEA2 version; the true Pareto front is marked by points (.).
Figure 6. Pareto front approximation for the DTLZ2 problem, parallel SPEA2 version (pop. size 1000); the true Pareto front is marked by points (.).
Figure 7. Pareto front approximation for the DTLZ2 problem, non-parallel NSGAII version; the true Pareto front is marked by points (.).
Figure 8. Pareto front approximation for the DTLZ2 problem, parallel NSGAII version (pop. size 1000); the true Pareto front is marked by points (.).

7 Conclusions

Multi-objective evolutionary optimization methods are efficient techniques for the construction of the Pareto front for problems with more than two criteria. The best approximation of the Pareto front is obtained when a large initial population is used. The considered methods are amenable to parallelization, including grid technology. The use of the multi-start model for the parallelization of the multi-objective algorithms reduced the computation time and allowed better Pareto front approximations to be obtained.

8 Acknowledgements

The authors acknowledge the support of the Lithuanian Fund for Science and Studies, and of the Agency for International Science and Technology Development Programmes in Lithuania through the COST programme.

References

[1] Abraham A., Jain L. C., Goldberg R. Evolutionary Multiobjective Optimization: Theoretical Advances and Applications. Springer, 2005.
[2] Banos R., Gil C., Paechter B., Ortega J. Parallelization of population-based multi-objective meta-heuristics: An empirical study. Applied Mathematical Modelling, vol. 30, 2006, pp. 578-592.
[3] Cahon S., Melab N., Talbi E.-G. ParadisEO: a framework for the flexible design of parallel and distributed hybrid metaheuristics. Journal of Heuristics, 10(3), 2004, pp. 357-380.
[4] Cantú-Paz E. A survey of parallel genetic algorithms. Calculateurs Paralleles, vol. 10, 1998, pp. 141-171, http://www.illigal.ge.uiuc.edu/~cantupaz/publications/cparalleles98-survey.ps.gz (accessed 2008.09.19).
[5] Cantú-Paz E. Migration policies, selection pressure and parallel evolutionary algorithms. Technical Report IlliGAL TR-99015, University of Illinois at Urbana-Champaign, 1999, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.5813.
[6] Deb K., Pratap A., Agarwal S., Meyarivan T. A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, 2002, pp. 182-197.
[7] Deb K., Thiele L., Laumanns M., Zitzler E. Scalable test problems for evolutionary multi-objective optimization. In Abraham A., Jain L. C., Goldberg R. (eds.): Evolutionary Multiobjective Optimization: Theoretical Advances and Applications. Springer, 2005, pp. 105-145.
[8] Li H., Zhang Q. Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II. IEEE Transactions on Evolutionary Computation, in press, 2008.
[9] Naujoks B., Beume N., Emmerich M. Multi-objective optimisation using S-metric selection: application to three-dimensional solution spaces. In Proceedings of the 2005 IEEE Congress on Evolutionary Computation, IEEE Press, 2005, vol. 2, pp. 1282-1289.
[10] Nebro A. J., Luna F., Alba E., Beham A., Dorronsoro B. AbYSS: Adapting Scatter Search for multiobjective optimization. Technical Report ITI-2006-2, University of Málaga, 2006.
[11] Miettinen K. Nonlinear Multiobjective Optimization. Kluwer Academic Publishers, 1999.
[12] Srinivas N., Deb K. Multi-objective function optimization using non-dominated sorting genetic algorithms. Evolutionary Computation, vol. 2, 1995, pp. 221-248.
[13] Van Veldhuizen D. A. Multiobjective evolutionary algorithms: classifications, analyses, and new innovations. Ph.D.
thesis, Department of Electrical and Computer Engineering, Air Force Institute of Technology, Ohio, 1999, http://citeseer.ist.psu.edu/old/vanveldhuizen99multiobjective.html.
[14] Wolpert D. H., Macready W. G. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, 1997, pp. 67-82, DOI 10.1109/4235.585893.
[15] Zitzler E., Laumanns M., Thiele L. SPEA2: Improving the Strength Pareto Evolutionary Algorithm. In Giannakoglou K., Tsahalis D., Periaux J., Papailou P., Fogarty T. (eds.): Evolutionary Methods for Design, Optimization and Control with Applications to Industrial Problems (EUROGEN 2001), Athens, Greece, 2002, pp. 95-100.
[16] Zitzler E., Thiele L. An evolutionary algorithm for multiobjective optimization: The strength Pareto approach. Technical Report 43, Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH) Zurich, Switzerland, 1998.
[17] Zitzler E., Thiele L. Multiobjective evolutionary algorithms: A comparative case study and the strength Pareto approach. IEEE Transactions on Evolutionary Computation, 3(4), 1999, pp. 257-271.
[18] Zitzler E., Thiele L., Deb K. Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary Computation, 8(1), 2000, pp. 173-195.