Advances in Safety and Reliability Kołowrocki (ed.) 2005 Taylor & Francis Group, London, ISBN 0 415 38340 4 Maintenance scheduling by variable dimension evolutionary algorithms P. Limbourg & H.-D. Kochs Institute of Information Technology, Department of Engineering, University of Duisburg-Essen, Duisburg, Germany ABSTRACT: Lots of research has been done on optimization strategies finding optimal maintenance schedules. Much less attention is turned to how the schedule is represented to the algorithm. Either the search space is represented as a binary string leading to highly complex combinatorial problem or maintenance operations are restricted to regular intervals which may prevent optimal solutions. An adequate representation however is vitally important for result quality. This work presents several nonstandard input representations and compares them to the standard binary representation. An evolutionary algorithm with extensions to handle variable length genomes is used for the comparison. The results demonstrate that two new representations performs equal or better than the binary representation scheme. A second analysis shows that the performance also increases using modified genetic operators. Thus, the choice of alternative representations leads to better results in the same amount of time and without any loss of accuracy. 1 INTRODUCTION Scheduling the maintenance of technical systems is a nontrivial optimization task which is influenced by many different aspects like reliability and cost optimization. Normally, these two aspects are conflicting as short maintenance intervals cause high maintenance costs. The resulting problems are highly complex and therefore the use of black box optimization strategies has been extensively researched. The optimization of a maintenance schedule can be viewed as a three layered model (Figure 1). The system model simulates the maintenance of a technical system. It receives a maintenance schedule containing a list of maintenance events for all components. Its output are values such as the reliability over time R(t) and the maintenance costs CS. Methods for modelling and calculating reliability are reaching from Monte Carlo simulations, Petri nets and Markov models to reliability block diagrams (Billinton and Allan 1992). The topmost layer is the optimization strategy. Optimization strategies are problem solving heuristics which intelligently guess new solutions based on older experiences and some general assumptions. Its output are decision values. Most of the used optimization strategies are vector-based. The decision values are restricted to vectors with binary/integer/real elements of constant length, so called decision vectors x. For each decision value, they need to receive an objective vector y as a feedback. Many applied strategies belong to the field of evolutionary algorithms (Bris et al. 2003; Elegbede and Adjallah 2003) while other Figure 1. Maintenance optimization process as a three layered model. 1267
strategies such as dynamic programming (Barata et al. 2002) and tabu search (Gopalakrishnan et al. 2001) are also used. The middle layer, the input/output coding (I/O coding) is rarely investigated by researchers but simply stated. The output coding maps the system model output to an objective vector. This vector could be of one dimension (single objective optimization) or of more than one dimension (multi-objective optimization). The input coding is responsible for the transformation of the decision vector to a maintenance schedule. In this work we compare different input coding functions and line out the influence of these functions on the optimization results. Section 3 introduces the system model and output coding used. Section 4 introduces the six different input codings investigated. In section 5 the optimization strategy, an evolutionary algorithm is presented and some extensions to cope with variable dimension decision values are shown. Section 6 shows the results of some comparative tests evaluating the proposed input codings and algorithm extensions. Finally, section 7 draws conclusions of this work and proposes some further research directions. 2 NOMENCLATURE R(t): System reliability function I(t): Importance index CS: Maintenance cost C i : Component i C : Number of components a i : Age of component C i λ: Exponential parameter, failure rate α, β: Weibull parameters PF i (a i ): Failure probability of C i with age a i x: Decision vector x x: Decision vector array X : Decision space x min, x max : Borders of the decision space y Y : Objective vector y, decision space Y w 1, w 2 : Weight factors M: Maintenance matrix T: Number of simulation points of time d min, d max : Minimal and maximal dimension of the decision space X dim a : Vector array dimension dim(x): Dimension of vector x U(a, b): Random number from a uniform distribution in interval [a, b] N(m, σ): Random number from a normal distribution with mean m and standard deviation σ p cross : Crossover probability p mut : Point mutation probability p ins : Insert mutation probability p del : Deletion mutation probability BIN: BINary coding Figure 2. System 1, component MTTF is between 500 and 3000 days. Figure 3. System 2, component MTTF is between 750 and 2850 days. BINC: BINary Component coding VA: Variable length Absolute coding VAC: Variable length Absolute Component coding VR: Variable length Relative coding VAC: Variable length Relative Component coding LIC: Length Independent Crossover LDC: Length Dependent Crossover UMUT: Uniform MUTation NMUT: Normally distributed MUTation. 3 SYSTEM MODEL AND OUTPUT CODING Two simplified system models were used as test cases. They were represented as reliability block diagrams. Failure distribution of the components i were either defined as Weibull or exponential distributions. System 1 (Figure 2) represents a series system with 10 components. All components have exponential failure functions with failure rate λ i. System 2 (Figure 3) also consists of 10 components but both failure functions and system structure are much more complex. The failure functions are described by Weibull distributions with characteristic life α i and shape β i. Maintenance durations are assumed as very short and therefore not affecting the working state of the component. If a component fails, the failure remains undetected (hidden fault) and it is only replaced in a preventive maintenance action. Corrective maintenance is not considered. After the replacement, the component 1268
Figure 4. Importance indices I(t) over one year. is in an as-good-as-new state. All hidden failures are removed. Algorithmically this is done by setting its age back to zero. The system reliability is exactly calculated through a minimal cut algorithm. Simulation outputs are the reliability function R(t) which describes the system reliability as a function of simulation time and the maintenance costs CS of the maintenance schedule. In this simple model we assume that each maintenance event has the same costs and no other costs are accrued. The system was simulated for five years with a simulation step size of 30 days leading to T = 60 simulation steps. Each step is a possible point of time for maintenance. R(t) was multiplied with an importance index function I(t) that reflects the costs of a failure depending on the simulation time. The cost of a system failure may vary in reality due to several reasons, e.g. contractual terms, the amount of customers and spare situation. If this is the case, periodic maintenance schedules will not lead to the optimal solution, and codings allowing nonperiodic schedules are necessary. We assumed high failure costs in winter months and low costs in summer months and thus chose I(t) as seen in Figure 4. This is realistic for gas networks or power plants where load increases in colder seasons. The output coding maps these values to the twodimensional objective vector y. y 1 is the average reliability weighted with the importance factor, y 2 represents the maintenance costs: Figure 5. BIN coding. its component centered version. All codings allow nonperiodic maintenance schedules and are therefore adequate for system models with time dependent cost functions. As we use a time discrete system model, a maintenance schedule can be defined as a binary maintenance matrix M ={0, 1} C T with elements m ij where row i belongs to C i and column j belongs to time interval j = 1,..., T. Ifm ij = 1, component C i is scheduled for maintenance at point of time j (see matrix M in Figure 5). We differentiate between two general forms of decision values, the decision vector x and the decision vector array x. If we define a minimal and maximal dimension d min, d max N, then a real-valued decision vector can be stated as: If d min = d max, the decision space is of constant dimension which is the standard restriction by optimization algorithms. A decision vector array is a vector of vectors, where each element is a vector of dimension d min dim(x i ) d max. The dimension of the vector array itself is denoted as dim a and kept constant in our experiments. A real-valued vector array is defined as: 4 INPUT CODING The performance of four input codings allowing variable length encoding (VA, VAC, VR and VRC) are investigated in comparison to two input codings of constant length (BIN, BINC). The binary input coding BIN is applied in (Lapa et al. 2003), BINC is Maintenance schedules can be viewed as vector arrays with dim a = C where each vector depicts a maintenance schedule for one component. This will be referred to as component centered coding. All proposed codings are surjective so that each possible maintenance matrix may be generated. But only BIN and BINC coding are bijective. All other codings map different decision values to one schedule. What seems to be an unnecessary enlargement of the decision space 1269
Figure 7. VA coding. Figure 6. BINC coding. complexity can turn to an advantage if the objective function becomes smoother or easier to optimize. 4.1 BIN and BINC coding The decision vector of the BIN input coding is defined as a binary vector of length C T. It defines for each component and each point of time if the component is scheduled for maintenance. This coding is simply a linear version of the maintenance matrix M. An example is given in Figure 5. If x 3 = 1 and x 16 = 1, then C 1 would be maintained at t = 3 and C 2 at t = 6. Figure 8. VAC coding. BINC coding (Figure 6) is the very similar vector array coding. Each transposed element vector x T 1... C of x BINC defines one row of the maintenance schedule: Each element defines a maintenance event and as the number of events may vary so does the vector dimension (Figure 7). This automatically leads to the necessity of different optimization strategies that are able to handle decision vectors with variable dimension. If e.g. x VA contains the elements 3 and 37, then C 1 would be scheduled for maintenance at t = 3 and C 4 at t = 7. In VAC coding (Figure 8), each element vector defines one row of M. All values directly represent maintenance times of the appertaining component. 4.2 VA coding VA (variable dimension absolute) coding maps a natural valued vector x VA with dimension 0 n C T to a maintenance schedule: If for example x 11 = 3 and x 42 = 7, then (C 1, t = 3) and (C 4, t = 7) would be maintenance events. 1270
Figure 9. VR coding. 4.3 VR coding In the VR (variable dimension relative) coding similar to VA coding, one element describes one maintenance action. The difference between these two coding schemes is that in VR coding each maintenance event depends on the preceding event. This relative notation defines the schedule through time intervals between two maintenance events: Given the example in Figure 9, we denote that 5 l=1 x 1 = 37 and therefore (C 4, t = 7) is a maintenance event. The VA row shows the associated VA coding. VRC coding works similar to VAC coding (see Figure 8) but using the relative notation. 5 OPTIMIZATION STRATEGY To evaluate the performance of the different coding strategies, a simple genetic algorithm as proposed in (Dorsey and Mayer 1995) was chosen. According to Occam s Razor we do not use much more complex algorithms like SPEA2 (Zitzler et al. 2001) or NSGA2 (Deb et al. 2000) in the hope to achieve more general results with simpler algorithms. As genetic algorithms do not support variable dimension chromosomes we extended it with a different initialization method and new crossover and mutation operators inspired by techniques from the field of genetic programming (Banzhaf et al. 1998). Figure 10. VRC coding. 5.1 General overview and terminology The evolutionary algorithms terminology is strongly related to biological evolution. To prevent misunderstandings, we give a short introduction to the used terms. Evolutionary algorithms are optimization algorithms inspired by nature s strategy to create optimal life forms for a given environment. Evolution can be regarded as a blind optimization progress. Individuals that are well adapted to their environment survive and can generate offsprings which transport successful genetic material through the evolutionary process. This well-known principle is called selection. Selection is solely a phenotypic process. Offspring is normally generated through sexual reproduction of two individuals which mix their genomes through crossover. At some point of the evolutionary process random undirected mutations, changes in the genome of a single individual can occur. Mutations introduce new genetic material in the population which then can make its stand. Evolutionary algorithms transfer this scheme to optimization problems Figure 11. The algorithm maintains a population (set) of individuals (possible maintenance schedules) which are the subject of an artificial evolutionary process. An individual consists of a genotype (decision vector) and a phenotype (objective vector).we refer to the decision vector as genome and a single value as gene. If we deal with a decision vector array, the element vectors will be referred as chromosomes in analogy to the natural genome. Starting point of each optimization run is an initial population of randomly created individuals (initialization). The optimization process is divided into generation steps consisting of selection, crossover, mutation and evaluation phases. Each generation, parenting individuals from the old population are selected for crossover and mutation. The newly created individuals which form the new generation are evaluated by the objective function and the next generation step starts. If a stop condition (e.g. number of generations exceeded) 1271
holds then the algorithm ends, else the next generation step starts. 5.2 Initialization A random length strategy as depicted in (Banzhaf et al. 1998) was chosen to initialize the individuals of the first generation. Each chromosome first obtains a uniform random length between the minimal and the maximal dimension of the problem. Then all chromosomes are initialized with uniform random numbers. 5.3 Selection and elitism We decided to apply binary tournament selection as described in (Bäck 1996) and a stochastic elitism operator. The fitness function f sel : R 2 R maps the objective vector to one dimension by a weighted sum instead of using a Pareto approach as the latter leads to strong difficulties in performance assessment of different input codings and algorithms (Okabe et al. 2003). For a better illustration, f sel can be interpreted as the profit. w 1 then constitutes the yield of a working system while w 2 represents the costs of a maintenance event. Each generation, the best individual found is inserted with a probability of 0.3. 5.4 Crossover With a probability of p cross, two individuals are choosen for crossover. Variable dimension crossover is different to standard crossover insofar as the chromosomes x, x of the two chromosomes crossed can be of different length. We therefore have to select one crossover point p/p N per chromosome. Then the crossover takes place Figure 12. Two crossover operators were investigated, length independent crossover (LIC) and length dependent crossover (LDC). The LIC operator is a straightforward extension of the onepoint crossover operator used in most standard genetic algorithms (Goldberg 1989). Both crossover points are drawn from a uniform distribution: The LIC operator doesn t take the length of the chromosomes into account. Huge length differences between the offspring and parent chromosomes can result. This may not necessarily be an advantage because the search cannot focus on length regions with good solutions (see section 6). The LDC operator tackles this problem. One crossover point is uniform randomly chosen. The second point heavily depends on the first: A random integer from a normal distribution with mean m = 0 and standard deviation σ = dim(x )/4 is Figure 11. Overview of a genetic algorithm. Figure 12. length. Crossover between individuals with different 1272
drawn and added to the crossover point. This guarantees a length variation which is necessary to reach new length regions. N(0, dim(x )/4) is limited so that 0 < p < dim(x ) holds. 5.5 Mutation Three different types of mutation are implemented: Point mutation, insert mutation and deletion mutation. Each value of an individual will be chosen with a probability p mut for point mutation. Binary mutation described in (Goldberg 1989) is used by the binary codings. Integer mutation is done either choosing a new random value at the mutation position (UMUT) or adding a normally distributed value (NMUT) as described in (Bäck 1996). The strategy parameter was fixed at σ = (x max x min )/4 to reduce experimental complexity. Insert and deletion mutation operators are introduced to generate variations in the individual length. At each crossover point a new random value is inserted with probability p ins. This results in a length increase of the individual. The deletion mutation works vice versa. With a probability p del, a position will be deleted and the individual shortened. Table 1. Best parameter combinations. System 1 System 2 Coding Crossover Mutation Crossover Mutation BIN 0.9 0.01 0.9 0.01 BINC 0.9 0.01 0.9 0.01 VA 0.9 0.01 0.7 0.01 VAC 0.7 0.01 0.9 0.01 VR 0.7 0.01 0.7 0.01 VRC 0.9 0.01 0.9 0.01 6 RESULTS We compared the different codings and operators simulating the systems over T = 60 simulation steps. Each step, all components could be maintained resulting in a choice of 600 maintenance points for both systems and thus in a large combinatoric decision space with 2 600 possible schedules. The objective weights were set to w 1 = 600 and w 2 = 1. The experimental setup is divided into three stages: 1. Derive good standard parameter values for all test sets. 2. Compare all test sets using the derived parameter values. 3. Analyze additional operators. The test sets compared are the six coding alternatives BIN, BINC, VA, VAC, VR and VRC. For parameter fitting, a full factorial design plan was created. The initial values chosen were p mut = {0.01, 0.1, 0.2} and p cross = {0.7, 0.9}. As we expect more general results by the use of a simple genetic algorithm, we kept it as simple as possible in step 1 and 2. LIC and UMUT were used, insert and deletion mutation switched off. They will be investigated in step three. Each combination was repeated 10 times. The derived parameters are shown in Table 1. In step two we started a comparative test on all six codings. The population size was 20. Figures 13 and 15 show the best individual found (average over 10 runs) in relation to the current generation. Figures 14 and 16 are snapshots of the best solution found at generation 500 (Boxplot). Clearly the worst input coding on both systems is VR. It seems to converge very early to some local minima and shows a very slow optimization progress. VA scores better in both systems but is also worse than BIN coding. BINC coding shows slight advantages Figure 13. System 1. Best individual (average over 100 runs), Figure 14. System 1. Best individual after 500 generations, boxplot, 1273
Figure 15. System 2. Best individual (average over 100 runs), Figure 17. Best individual after 500 generations, additional Operators, VAC coding. Figure 16. System 2. Best individual after 500 generations, boxplot, Figure 18. Best individual after 500 generations, additional Operators, VRC coding. over BIN. The best input codings seem to be VAC and VRC. VAC clearly outperforms BIN and BINC in system 2 while in system 1 it is worse. VRC produces by far the best results in system 1 and is only beaten by VAC in system 2. The investigation of the additional operators was carried out using system 2 and the most successful codings VAC and VRC. Referring to the algorithm of step two (Normal), four test sets were evaluated: LDC: Length dependent crossover instead of LIC. NMUT: NMUT instead of UMUT. INS: Insert mutation turned on with p ins = 0.01. DEL: Deletion mutation turned on with p del = 0.01. The results, represented as boxplots in Figures 17 and 18 show that the operators perform different depending on the coding used. VRC coding results could be slightly but insignificantly improved by deletion mutation, the other operators perform worse. VAC coding leads to clearly better results using length dependent crossover and may also be improved if NMUT is applied. The INS mutation performed worse results, DEL was equal. 7 CONCLUSIONS AND OUTLOOK The results show that the input coding is highly important for effective maintenance optimization. Of the proposed five alternative coding strategies, two performed equal or better than the standard encoding (Lapa et al. 2003) in the test cases. Some of the additionally introduced operators for genetic algorithms also achieve good results. Together, they form a powerful alternative to standard evolutionary maintenance optimization. This work is only a first investigation on variable dimension maintenance scheduling. Further research 1274
going beyond this work could include the development and adaption of new genetic operators such as Explicitly Defined Introns (Banzhaf et al. 1998) to handle maintenance schedules and the evaluation of the proposed methods in a Pareto multi-objective approach. The extension to other system models could also be of interest. ACKNOWLEDGEMENTS The authors would like to thank Dr. Jörg Petersen for his support and inspirational comments andtim Hohm for the great collaboration where some ideas of this work emerged. REFERENCES Bäck,T. (1996). EvolutionaryAlgorithms intheory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. New York: Oxford Univ. Press. Banzhaf, W., P. Nordin, R.E. Keller, and F. Francone (1998). Genetic Programming :An Introduction: On theautomatic Evolution of Computer Programs and Its Applications. San Francisco: Academic Press/Morgan Kaufmann. Barata, J., C.G. Soares, M. Marseguerra, and E. Zio (2002). Simulation modelling of repairable multicomponent deteriorating systems for on condition maintenance optimisation. Reliability Engineering & System Safety 76(3), 255 264. Billinton, R. and R.N. Allan (1992). Reliability evaluation of engineering systems: concepts and techniques. NewYork: Plenum. Bris, R., E. Châtelet, and F. Yalaoui (2003). New method to minimize the preventive maintenance cost of seriesparallel systems. Reliability Engineering and System Safety 82(3), 247 255. Deb, K., S. Agrawal, A. Pratab, and T. Meyarivan (2000). A Fast Elitist Non-Dominated Sorting Genetic Algorithm for Multi-Objective Optimization: NSGA-II. In Proceedings of the Parallel Problem Solving from Nature VI Conference, Paris, France, pp. 849 858. Dorsey, R. E. and W. J. Mayer (1995). Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal of Business and Economic Statistics 13(1), 53 66. Elegbede, C. and K. Adjallah (2003). Availability allocation to repairable systems with genetic algorithms: a multiobjective formulation. Reliability Engineering and System Safety 82(3), 319 330. Goldberg, D. (1989). GeneticAlgorithms in search, optimization, and machine learning. Turin Ney York: Addison- Wesley. Gopalakrishnan, M., S. Mohan, and Z. He (2001). A tabu search heuristic for preventive maintenance scheduling. Computers and Industrial Engineering 40(1 2), 149 160. Lapa, C. M., C.M. Pereira, and P.F. Frutuoso e Melo (2003). Surveillance test policy optimization through genetic algorithms using non-periodic intervention frequencies and considering seasonal constraints. Reliability Engineering and System Safety 81, 103 109. Okabe, T., Y. Jin, and B. Sendhoff (2003). A critical survey of performance indices for multi-objective optimization. In IEEE Congress on Evolutionary Computation, Volume 2, Canberra, Australia, pp. 878 885. Zitzler, E., M. Laumanns, and L. Thiele (2001). SPEA2: Improving the strength pareto evolutionary algorithm for multiobjective optimization. In EUROGEN 2001, Athens, Greece, pp. 95 100. 1275