On the distribution of Traveling Salesman Problem solutions among the ranked Linear Assignment alternatives

The Linear Assignment Problem is considered one of the fundamental combinatorial optimization problems in operations research. Operations researchers are concerned with finding optimal solutions to complex problems, which consist of decision variables, constraints and an objective function. When the constraints and the objective function involve only linear relationships among the decision variables, the problem is called a Linear Programming problem. Often the integrality of the decision variables is not guaranteed by the Linear Programming formulation and, if required, needs to be imposed. These so-called Integer Linear Programming problems are in general more complex than their Linear Programming counterparts. The Linear Assignment Problem is a special case: its decision variables naturally assume only the values 0 and 1. Roughly speaking, the Linear Assignment Problem deals with choosing n entries in an n x n cost matrix, one in each row and each column, in such a way that the total cost is minimized. A closely related problem is the Traveling Salesman Problem: a salesman wants to visit n cities in an optimal way. In this thesis the distribution of the solutions of the Traveling Salesman Problem among the Linear Assignment Problem alternatives is investigated. To this end, the Linear Assignment solutions are ordered by increasing cost.

Jannetje Veerman

The Linear Assignment Problem and Traveling Salesman Problem

The linear assignment problem (LAP) is of practical as well as mathematical interest. In manpower management, the assignment problem is to assign N employees optimally to N different jobs. It is assumed that a numerical performance rating of all N^2 employee-job combinations is known.
The optimal solution consists of the N employee-job combinations that maximize the sum of performances or minimize the sum of costs. The matrix C = (c_ij), i, j = 1,...,n, is the cost matrix. In terms of the manpower assignment, c_ij is the cost incurred when employee i takes care of job j. The goal is to find a matrix X = (x_ij), i, j = 1,...,n, satisfying the constraints. This is a matrix in which exactly one element in each row and each column equals 1 and all other elements are 0. In terms of the personnel assignment: if x_ij = 1, then employee i gets job j. Such an X is therefore called an assignment. In general, there are many feasible assignments: it can easily be seen that there are n! feasible solutions. Although large, this number is finite, and hence there is an assignment that minimizes the total cost.

In 1946 Easterfield presented the first algorithm to solve the linear assignment problem. In 1955 H. W. Kuhn developed an improved method, called the Hungarian Method [2]. This algorithm is specific to the assignment problem and more efficient than solving the LAP as a general LP problem. For more details, see the complete thesis [5].

In some cases the assignment problem needs an extra constraint. For example, it may not be possible to have employee i assigned to job j together with employee k assigned to job l (i ≠ k; j ≠ l). In this case it is helpful to have a list of all assignments of the original problem and their total costs. From this list, the assignment with the lowest cost that satisfies all the constraints is chosen. This approach yields the optimal solution of the constrained problem.

36 AENORM 52 Juni 2006
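The idea of picking the cheapest assignment from a ranked list can be illustrated with a minimal Python sketch. It is not the method used in the thesis: all n! assignments are simply enumerated by brute force, which is only workable for very small n. The cost matrix and the names `rank_assignments` and `cheapest_allowed` are hypothetical and chosen for illustration only.

```python
from itertools import permutations

def rank_assignments(cost):
    """Return all assignments as (total cost, permutation), cheapest first.

    A permutation p encodes the assignment "employee i gets job p[i]".
    Brute-force enumeration, so only suitable for very small matrices.
    """
    n = len(cost)
    ranked = [(sum(cost[i][p[i]] for i in range(n)), p)
              for p in permutations(range(n))]
    ranked.sort(key=lambda item: item[0])  # stable sort, ties keep enumeration order
    return ranked

def cheapest_allowed(cost, forbidden_pair):
    """Cheapest assignment that does not combine (i -> j) with (k -> l)."""
    (i, j), (k, l) = forbidden_pair
    for total, p in rank_assignments(cost):
        if not (p[i] == j and p[k] == l):
            return total, p
    return None

# Hypothetical 3 x 3 cost matrix (rows: employees, columns: jobs).
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]

best = rank_assignments(cost)[0]
# Extra constraint: employee 0 on job 1 may not be combined with employee 1 on job 0.
constrained = cheapest_allowed(cost, ((0, 1), (1, 0)))
print(best, constrained)
```

In this small example the unconstrained optimum uses exactly the forbidden combination, so the constrained optimum is the next assignment in the ranked list that avoids it.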

Katta G. Murty developed an algorithm for ranking all the assignments in nondecreasing order of cost [4]. Murty used the Hungarian Method to identify the optimal solution. Following a few simple steps, the second best solution is found. By repeating these steps, all assignments are identified. This algorithm is described in [5].

For the Traveling Salesman Problem, the matrix of distances between cities is used as the cost matrix, and the objective is to minimize the total distance. The Traveling Salesman Problem (TSP) has a (natural) constraint which prohibits the salesman from visiting a city more than once. The optimal solution is a single loop through the n cities. Again, there is more than one feasible solution for the TSP. Starting with the first city (i.e. the first row), the next city can be chosen from the n-1 other cities. When the second city is chosen, the third city can be chosen from the remaining n-2 cities. Continuing like this shows that there are (n-1)! feasible solutions for the TSP, because the choice of the first city is irrelevant.

Clearly, a solution of the TSP is a feasible solution of the LAP. These are the so-called circular permutations. On the other hand, a solution of the LAP is not necessarily a solution of the TSP. As shown above, if the problem is of dimension n there are n! assignments and (n-1)! solutions of the TSP. Hence (n-1)! of the n! assignments, or 1 in n, are solutions of the TSP. In this thesis the following problem is addressed: How do the TSP solutions distribute among the LAP solutions when these are monotonically ordered by cost?

Computational results

Figure 1: Typical results for a random matrix, a symmetric matrix and a matrix with distances of dimension 6

For the purpose of this thesis, the ranking algorithm of Murty is implemented in MatLab, a high-level programming language for numerical computing. This section presents the results obtained.
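The ranking idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MatLab implementation of the thesis. The partitioning step follows the usual description of Murty's scheme: after the best assignment of a subproblem is reported, the remaining assignments are split into disjoint subproblems by forcing some cells and forbidding one. Each subproblem is solved here by brute force, which stands in for the Hungarian Method or a dedicated LAP code. The function names and the 4 x 4 cost matrix are hypothetical.

```python
import heapq
from itertools import count, permutations

def best_assignment(cost, forced, forbidden):
    """Cheapest assignment using every forced cell and no forbidden cell.

    Brute force replaces the LAP solver here; returns None if infeasible.
    """
    n = len(cost)
    best = None
    for perm in permutations(range(n)):
        if any(perm[i] != j for i, j in forced.items()):
            continue
        if any(perm[i] == j for i, j in forbidden):
            continue
        total = sum(cost[i][perm[i]] for i in range(n))
        if best is None or total < best[0]:
            best = (total, perm)
    return best

def ranked_assignments(cost):
    """Yield (cost, assignment) in nondecreasing order of cost."""
    n = len(cost)
    tie = count()  # tie-breaker so the heap never compares dicts
    total, perm = best_assignment(cost, {}, frozenset())
    heap = [(total, next(tie), perm, {}, frozenset())]
    while heap:
        total, _, perm, forced, forbidden = heapq.heappop(heap)
        yield total, perm
        fixed = dict(forced)
        for i in range(n):
            if i in forced:
                continue
            # Subproblem: keep the cells fixed so far, forbid cell (i, perm[i]).
            banned = forbidden | {(i, perm[i])}
            child = best_assignment(cost, fixed, banned)
            if child is not None:
                heapq.heappush(heap, (child[0], next(tie), child[1], dict(fixed), banned))
            fixed[i] = perm[i]  # later subproblems force this cell

def is_tour(perm):
    """True if city i -> perm[i] forms one loop through all cities."""
    city, steps = perm[0], 1
    while city != 0:
        city, steps = perm[city], steps + 1
    return steps == len(perm)

# Hypothetical 4 x 4 cost matrix.
cost = [[7, 2, 9, 4],
        [3, 8, 1, 6],
        [5, 6, 2, 9],
        [4, 3, 7, 1]]
ranking = list(ranked_assignments(cost))
flags = [int(is_tour(perm)) for _, perm in ranking]  # zero-one vector of TSP solutions
print(sum(flags), "of", len(ranking), "assignments are tours")
```

Because `ranked_assignments` is a generator, the ranking can be stopped as soon as the first tour appears, which matters once n! becomes too large to enumerate.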
Some remarks can be made at this stage. First, more efficient algorithms have been developed that can be used to solve the problem discussed. However, these improvements are beyond the scope of this thesis. Hence, Murty's algorithm is used to rank all the assignments, and the LAP code of Jonker & Volgenant [1] is used as a black box to find a least cost assignment in a given matrix. Secondly, the example in [5] showed that, when the steps of Murty's algorithm are repeated, some linear assignment problems arise more than once. After the u-th best solution is identified, the algorithm creates a new list in which the (u+1)-th best solution has to be found. This list differs only slightly from the previous list. Therefore, it is more efficient to store the results obtained than to solve the same problems again. However, this is not included in our implementation in MatLab. For the size of the problems used in this thesis, the implementation suffices.

Initial efforts

To gain insight into the distribution of Traveling Salesman solutions among linear assignment solutions, the results for different cost matrices are described. Because the number of assignments becomes very large when the dimension of the matrix increases, all matrices have dimension 6 or smaller. For the aim of this thesis, this is not a severe restriction. Furthermore, the matrices contain no fractions, because the code used to create the LAP MatLab extension can only deal with integers.

First, matrices of dimension 4, 5 and 6 with integers 0,...,10 are used to run the program. Secondly, matrices with integers 0,...,10 are multiplied by their transpose. This results in symmetric matrices. Finally, n coordinates in Euclidean space are randomly generated. The distances between these n points, which can be thought of as cities, are calculated and rounded. Typical of a distance matrix are its symmetry and its zero diagonal: the distance between city i and city i is 0.

The program returns all assignments ordered by their costs, together with a zero-one vector. If an element of the vector equals 1, the corresponding assignment is a solution of the TSP. In the figures the zero-one vector is plotted as a bar graph. The x-axes run from 0 to n! and represent the number of the assignment in the ranking. Hence, the distribution of the TSP solutions among the assignments is visualized.

In the first picture the TSP solutions seem randomly distributed among all assignments. This is not surprising, since the elements of the cost matrix were chosen randomly. The second picture shows a more surprising result: the density of TSP solutions is higher among the first, i.e. the best, assignments. There is no obvious cause for this particular distribution; further investigation may give more insight. The TSP solutions in the final picture are more concentrated among the assignments with the highest costs. The reason for this is the zeros on the diagonal. The best assignments will include such a diagonal element, because this results in low costs. Since choosing element (i, i) means the salesman has to travel from city i to city i, this prevents the solution from being a solution of the TSP.
So after a number of assignments with a diagonal element, other assignments will show up in the list. These assignments are more likely to be a solution of the TSP.

Adapted diagonal

From now on, the diagonal elements in all matrices are replaced by enormous costs. The expectation is that the TSP solutions are then among the cheapest assignments.

Figure 2: Typical result for asymmetric, symmetric and distance matrix with huge costs on the diagonal, dimension 6

In all graphics in figure 2 the TSP solutions are situated among the assignments with the lowest costs. It is interesting to visualize the costs of the assignments. The program is asked to plot the costs of the first 50 assignments and to mark the TSP solutions with a circle. The graphics do not show surprising results. The costs grow slowly in all matrices. There are more assignments with the same costs in the random and distances matrix than in the symmetric matrix. There is no obvious reason for this. In these examples the first TSP solution is the fourth, ninth and ninth assignment in the ranked assignments. These solutions have costs which are about the same as those of the better LAP alternatives.
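The effect of the adapted diagonal can be checked on a small example. The following Python sketch is an illustration only: it ranks all assignments by brute force instead of using Murty's algorithm, and the symmetric 5 x 5 matrix and the helper names are hypothetical. It compares the position of the first TSP solution in the ranking with a zero diagonal and with a diagonal that is larger than the cost of any complete assignment.

```python
from itertools import permutations

def is_tour(perm):
    """True if city i -> perm[i] forms one loop through all cities."""
    city, steps = perm[0], 1
    while city != 0:
        city, steps = perm[city], steps + 1
    return steps == len(perm)

def first_tour_rank(cost):
    """1-based position of the first TSP solution among the ranked assignments."""
    n = len(cost)
    ranked = sorted(permutations(range(n)),
                    key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    for rank, perm in enumerate(ranked, start=1):
        if is_tour(perm):
            return rank
    return None

def with_huge_diagonal(cost):
    """Copy of cost whose diagonal exceeds the cost of any assignment."""
    big = 1 + sum(map(sum, cost))
    return [[big if i == j else c for j, c in enumerate(row)]
            for i, row in enumerate(cost)]

# Hypothetical symmetric matrix with a zero diagonal, standing in for distances.
dist = [[0, 3, 4, 6, 5],
        [3, 0, 5, 4, 6],
        [4, 5, 0, 3, 4],
        [6, 4, 3, 0, 3],
        [5, 6, 4, 3, 0]]

rank_zero_diag = first_tour_rank(dist)
rank_huge_diag = first_tour_rank(with_huge_diagonal(dist))
print(rank_zero_diag, rank_huge_diag)
```

With a zero diagonal the assignment that sends every city to itself costs nothing and is always ranked first, so the first tour can never be at position 1. With the huge diagonal every assignment containing a diagonal element is pushed behind all assignments without one, so the first tour cannot appear later than before.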

Figure 3: Costs of the first 50 assignments and an indication when the assignment is a TSP solution, for an asymmetric, symmetric and distance matrix with huge diagonal elements

As stated above, the total number of assignments explodes when the dimension of the cost matrix increases. However, ranking all assignments until the first TSP solution has been found should be possible for most matrices, because this solution can arise at an early stage. In table 1 the mean number of the first TSP solution, i.e. its position in the ranking, is given for different matrices. Because of the huge computation time, only 5 matrices have been created. For each of these matrices the number of the first TSP solution has been identified, and the average of these numbers has been taken. This has been done for dimension n = 10 and n = 20.

Matrix        Dimension n   Mean number of first TSP solution
Asymmetric    10            5.5
Asymmetric    20            10.4
Symmetric     10            120.6
Symmetric     20            Took too long
Distances     10            204.6
Distances     20            Took too long

Table 1: Number of first TSP solution

The matrices used to obtain the results in the table all had huge diagonal elements. By doing so, the TSP solutions are more likely to arise among the first (i.e. best) ranked assignments, see figure 2. Nevertheless, the first TSP solution in a 20 x 20 matrix could not be found when the matrix is symmetric, as is the case for the symmetric matrix and the distances matrix. After 30 minutes only the 450 best solutions of one matrix had been found. In this respect, it should be mentioned that the inefficiency of our program plays an important role. This result causes some trouble for the Traveling Salesman Problem, since in this problem the corresponding matrix will often be symmetric. For dimension n = 10, the results are comparable. It can be (carefully) concluded that in the ranked assignments of an asymmetric matrix a TSP solution can be found at an early stage.

Conclusion
The problem addressed in this article was: How do the TSP solutions distribute among the LAP solutions when these are monotonically ordered by cost? In the first place, it can readily be concluded that 1 out of n assignments is a solution of the Traveling Salesman Problem. By implementing K. G. Murty's algorithm for ranking linear assignments on the computer, this question can now be answered. For matrices of low dimension, all assignments can be found quickly. Because the total number of assignments explodes when the size of the matrix increases, matrices of higher dimension are difficult to handle. Among the assignments of random, asymmetric cost matrices the TSP solutions are randomly distributed. In these matrices the first TSP solution is found among the assignments with the lowest costs. This is why the computer can deal with larger matrices if they are asymmetric.

The TSP solutions of symmetric matrices, such as a matrix of distances between cities, are among the assignments with the highest costs. This causes trouble when the first TSP solution has to be found in a matrix of dimension greater than 10. A remedy is to replace the diagonal elements of the cost matrix by huge costs. The TSP solutions are then among the linear assignment solutions with low costs. Hence, the program is better able to return the first TSP solution in a reasonable time. However, in a symmetric matrix of dimension 20, the first TSP solution cannot be found even after running the program for more than 30 minutes.

Further research can be done by improving the MatLab code. As stated above, storing the results obtained at each stage increases the efficiency and the speed of the program. The first TSP solution (or even all of them) can then probably be identified for larger matrices. In addition, the code can be improved for the case in which the cost matrix is symmetric. In that case the two assignments with rows and columns interchanged succeed each other.

References

[1] Jonker, R. and Volgenant, A. (1987). A shortest path algorithm for dense and sparse linear assignment problems. Computing, 38, 325-340.
[2] Kuhn, H. W. (1955). The Hungarian Method for the Assignment Problem. Naval Research Logistics Quarterly, 2, 83-97.
[3] Munkres, J. (1957). Algorithms for the Assignment and Transportation Problems. Journal of the Society for Industrial and Applied Mathematics, 5(1), 32-38.
[4] Murty, K. G. (1968). An Algorithm for Ranking all the Assignments in Order of Increasing Cost. Operations Research, 16(3), 682-687.
[5] Veerman, J. M. A. (2008). On the distribution of Traveling Salesman Problem solutions among the ranked Linear Assignment alternatives. Bachelor thesis Econometrics and Operations Research, VU Amsterdam.