Load Balancing and Rebalancing on Web Based Environment. Yu Zhang


Load Balancing and Rebalancing on Web Based Environment

Yu Zhang

This report is submitted as partial fulfilment of the requirements for the Honours Programme of the School of Computer Science and Software Engineering, The University of Western Australia, 2004

Abstract

We investigate two variants of a load distribution problem associated with distributing loads of varying size in a multi-server web-based environment. Solving the classical Load Balancing Problem allows us to distribute static web components across multiple servers so that the loads on the servers are as equally distributed as possible. A typical objective is to minimize the makespan, the load on the most heavily loaded server. In reality, however, loads on servers are often dynamic. As the loads of web components change over time, the Load Rebalancing Problem was introduced by S. Keshav of Ensim Corporation. To solve the Load Rebalancing Problem we redistribute the loads of web components in a fixed number of steps, since moving components across servers can be expensive, so that the loads on the servers are again as equally distributed as possible. Solving these two problems successfully allows us to utilize resources better and achieve better performance. However, both problems have been proven to be NP-hard, so generating exact solutions in a tractable amount of time becomes infeasible as the problems grow large. We therefore adopt four greedy approximation algorithms to solve these two problems in polynomial time, within constant guaranteed error ratios. We give implementations of the algorithms in the Java programming environment. We carry out experiments to show that the error bounds hold for our implementations, and we perform further experiments to test the performance of the algorithms in practical situations. By analyzing our results carefully we identified weaknesses in some of the algorithms and proposed improvements. We conclude that these approximation algorithms do indeed run in polynomial time, that they generate approximate results within the stated error ratios on our test data sets, and that they are valid tools for balancing and rebalancing loads in a multi-server web-based environment.
Keywords: Approximation Algorithms, Load Balancing, Scheduling
CR Categories: F.2.2, G.2.1, C.2.4

Acknowledgements

I would like to thank Ensim Corporation for posing the Load Rebalancing Problem, and my supervisor Gordon Royle for all his help and support.

Contents

Abstract
Acknowledgements

1 Introduction
2 Literature Review
3 Load Balancing Problem
  3.1 The Load Balancing Problem
  3.2 Algorithm GBalance
  3.3 Algorithm SBalance
  3.4 Implementations
  3.5 Test Methodology
  3.6 Test Result
  3.7 Analysis and Comparison
4 Load Rebalancing Problem
  4.1 The Load Rebalancing Problem
  4.2 Algorithm GRebalance
  4.3 Algorithm SRebalance
  4.4 Implementations
  4.5 Test Methodology
  4.6 Test Result
  4.7 Analysis and Comparison
  4.8 Proposed Improvement
    4.8.1 Proposed Improvement for GRebalance
    4.8.2 Proposed Improvement for SRebalance

5 Conclusion

A Original Research Proposal
B Approximated Makespan Generated for the Load Balancing and Load Rebalancing Problem
  B.1 Result Generated Using GBalance and SBalance
  B.2 Result Generated Using GRebalance and SRebalance
C Java Code Used
  C.1 Model of Web Component in Load Balancing and Load Rebalancing Problem
  C.2 Model of Server in the Load Balancing Problem
  C.3 Implementation of GBalance Algorithm
  C.4 Implementations of the SBalance Algorithm
  C.5 Model of Server in the Load Rebalancing Problem
  C.6 Implementations of the GRebalance Algorithm
  C.7 Implementations of the SRebalance Algorithm
  C.8 Implementations of the Random Generators for Test Cases

CHAPTER 1
Introduction

Over the last four years, the number of Internet users increased by 125%, reaching a population of 812,931,592 [1]. The growth in Asian countries is tremendous; for instance in China, the number of Internet users reached a record high of 87 million, according to CNNIC [2]. As a result of this phenomenon, more people are relying on the Internet for educational and recreational purposes. Web sites are a prevalent means by which people receive information and interact with others on the net. The size of web sites has thus grown significantly to accommodate the increasing needs. As the number of surfers, the size of web pages and the number of web pages in web sites increase, using a single server to host all the web components is no longer sufficient, as a single server could very easily become overloaded and unable to serve all requests in time. Distributing web components over multiple servers to allow faster delivery of information thus becomes a beneficial solution. The problem we are trying to solve, then, is how to balance the web components across a number of servers as equally as possible.

First we define the load of a web component, such as a static HTML document, a GIF image, an AVI clip, a Macromedia Flash file or, in some cases, an entire web site, to be the total number of bytes that a server has to send to all surfers. It is the size of the file in bytes multiplied by the number of hits. We wish to discover methods to distribute the web components over a number of servers as equally as possible. A good way to achieve that is to minimize the load on the most loaded server. We are interested in two specific problems in this context, namely the Load Balancing Problem and the Load Rebalancing Problem. The Load Balancing Problem states that given a number of servers and a list of loads, we seek to assign each load to a server such that the loads are distributed as equally as possible.
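As a concrete illustration of this definition of load (all figures and names here are made up for illustration; this is not the thesis code):

```java
// Load of a web component, as defined above: file size in bytes
// multiplied by the number of hits. All figures are hypothetical.
public class ComponentLoad {

    static long load(long sizeBytes, long hits) {
        return sizeBytes * hits;
    }

    public static void main(String[] args) {
        long imageSize = 200 * 1024; // a 200 KB GIF image
        long imageHits = 5_000;      // requested 5,000 times
        System.out.println(load(imageSize, imageHits)); // total bytes served
    }
}
```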
When a web site is first published, the Load Balancing Problem can be used to model the initial distribution of web components: given the sizes of the web components and their estimated numbers of hits, we wish to find a way to optimally distribute the web components over a fixed number of servers such that the load on

the most loaded server is minimized. The second problem we wish to investigate is associated with load rebalancing. After a certain amount of time, the loads of web components are very likely to vary. For instance, the number of hits of an HTML document might increase, or the size of a thread in a forum might grow. As a result, some servers might become much more loaded than others. We would thus like to redistribute the web components by moving some of them from server to server. Because moving web components among servers can be expensive, we would like to minimize the number of moves. The Load Rebalancing Problem is thus: given the servers and associated loads, how do we redistribute the loads such that the loads on the servers are as equal as possible?

By solving these problems, we are able to distribute load evenly among servers when a web site is first published, and to readjust the loads when necessary. However, the Load Balancing Problem has been proved to be NP-hard [3], and Aggarwal et al. [4] have shown that the Load Rebalancing Problem is NP-hard as well. Thus we are unable to deliver an exact solution to either in a reasonable amount of time when the problem gets large. For instance, if we have 5000 web components and 10 servers, we have 10^5000 configurations to enumerate and compare for the Load Balancing Problem. For the Load Rebalancing Problem, if we restrict the number of moves to 12, we would still end up with an astronomically large number of configurations. While solving these problems exactly is not feasible in such a situation, approximation algorithms can be implemented to achieve a reasonable estimate, within a fixed error ratio, in a reasonable amount of time. Motivated by the above, we adopted four greedy algorithms to address these problems: GBalance and SBalance [3] to approximate the Load Balancing Problem, and GRebalance and SRebalance [4] to approximate the Load Rebalancing Problem.
To gain a better understanding of the algorithms and to test the correctness of the error bounds, we implemented and tested them in the Java environment [9]. Our goal is to test how well the algorithms perform in practice, and whether their run times are realistic. For each problem, we first state it formally in mathematical form, present a listing of the algorithm, describe our implementation and testing procedure, and lastly present and compare the performance of the algorithms. In the process of analyzing the results, we also learned the weaknesses of GRebalance and SRebalance, and propose improved versions of these two algorithms.

CHAPTER 2
Literature Review

Currently there are many web sites with huge volumes of content. These web sites adopt different methods to distribute load over a number of servers to ensure faster server response time. Server response time is largely determined by the underlying hardware of the servers. The performance of many of these web sites, such as ebay.com, amazon.com and expedia.com, is heavily dependent on how fast their servers respond to requests, as the interaction between users and these web sites is real-time in nature. An overloaded server would jeopardize the performance of these web sites significantly. Many sources [10, 11, 12] give detailed introductions to the common techniques used in practice. Load balancing can currently be done through hardware- or software-based techniques.

One technique, called DNS load balancing, involves maintaining identical copies of the site on physically separate servers. The DNS entry for the site is then set to return multiple IP addresses, each corresponding to a different copy of the site. The DNS server returns a different IP address for each request it receives, cycling through the multiple IP addresses. This gives a very basic implementation of load balancing. However, since DNS entries are cached by clients and other DNS servers, a client continues to use the same copy during a session. This can be a serious drawback: heavy website users may get the particular IP address that is cached on their client or DNS server, while less-frequent users get another. So heavy users could experience a performance slowdown, even though the server's resources may be available in abundance.

Another load-balancing technique involves mapping the site name to a single IP address, which belongs to a machine that is set up to intercept HTTP requests and distribute them among multiple copies of the Web server. This can be done using both hardware and software.
Hardware solutions, though expensive, are preferred for their stability. This method is preferred over the DNS approach, as better load balancing can be achieved. Also, these load balancers can detect whether a particular machine is down and divert traffic to another address dynamically. This is in contrast to the DNS method, where a client is stuck with the address of the dead machine until it can request a new one.
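Concretely, the round-robin DNS technique discussed above amounts to publishing several A records for the same host name. A hypothetical BIND-style zone fragment might look like the following (the name and addresses are illustrative, drawn from the reserved documentation ranges):

```
; Three identical copies of the site on three separate servers.
www.example.com.  300  IN  A  192.0.2.10
www.example.com.  300  IN  A  192.0.2.11
www.example.com.  300  IN  A  192.0.2.12
```

The 300-second TTL also illustrates the caching drawback discussed above: a client keeps reusing whichever address it has cached until the TTL expires.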

Another technique, reverse proxying, involves setting up a reverse proxy that receives requests from the clients, proxies them to the Web server, and caches the response on its way back to the client. This means that the proxy server can serve static content from its cache when a request is repeated. This in turn ensures that the server itself can focus its energies on delivering dynamic content. Dynamic content cannot generally be cached, as it is generated in real time. Reverse proxying can be used in conjunction with the simple load-balancing techniques discussed earlier: static and dynamic content can be split across different servers, with reverse proxying used for the static-content Web server only.

All the above approaches require duplication of content. While under most circumstances this is not a huge problem for corporations, situations arise in which one wishes to use an alternative approach. For instance, a web hosting company would not wish to duplicate the web sites it hosts on all servers, as doing so introduces excessive cost. Instead of making duplicates, our approach balances and rebalances web components by treating every web component as unique: only one copy of each component exists across all servers. Besides the cost savings mentioned above, this approach also eliminates the need for extra hardware and software to keep duplicated copies consistent.

Implementations of most of the algorithms we adopted already exist. For instance, the GRebalance algorithm has been implemented by Linder and Shah [13] to rebalance loads in practice. We chose to construct our own implementations to gain a better understanding of the algorithms. This proved fruitful, as we learned the weaknesses of the algorithms and were able to propose improved versions of the GRebalance and SRebalance algorithms.

CHAPTER 3
Load Balancing Problem

3.1 The Load Balancing Problem

The Load Balancing Problem is defined as follows. We are given a set of m servers M_1, ..., M_m, and a set of n components; each component j has a load of t_j. We seek to assign each component to one of the servers so that the loads placed on all servers are as balanced as possible. Mathematically, in any assignment of components to servers, we let A(i) denote the set of components assigned to server M_i; then server M_i needs to work for a total time of

    T_i = sum of t_j over all j in A(i)    (3.1)

We define this as the load on server M_i. In distributing the load evenly we wish to minimize a quantity known as the makespan, the maximum load on any server, T = max_i T_i.

This classical problem does not only model traditional load balancing in a multi-job, multi-machine situation, but can also help us distribute web components to various servers. Solving it allows us to distribute load across servers evenly, such that the load on the most heavily loaded server is minimized. This is useful when a web site is first published, when servers are upgraded, or when major updates take place such that all the web components need to be reassigned. Two algorithms are adopted for this purpose, namely GBalance and SBalance. Both algorithms run in polynomial time and generate approximations that are guaranteed to be within a constant factor of the optimal solution [3]. More specifically, GBalance achieves a guaranteed ratio of 2, meaning the makespan of the approximate solution is at most twice that of the optimal solution, while SBalance achieves a better guaranteed ratio of 3/2.
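As a small worked example of the makespan definition (3.1), with made-up loads:

```latex
\text{Let } m = 2,\quad t = (4, 3, 2, 2),\quad A(1) = \{1, 4\},\quad A(2) = \{2, 3\}.
\]
\[
T_1 = t_1 + t_4 = 4 + 2 = 6, \qquad T_2 = t_2 + t_3 = 3 + 2 = 5, \qquad
T = \max(T_1, T_2) = 6.
```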

3.2 Algorithm GBalance

The first algorithm is called GBalance. It is a simple algorithm that passes through the entire web component list in an arbitrary order. The web component being processed is assigned to the currently least loaded server. This process is repeated until no web components are left to be assigned.

Algorithm GBalance (input: component list, number of servers; output: servers with new assignment)

    Start with no web components assigned: T_i = 0, A(i) = empty for all servers
    FOR j = 1, ..., n
        Let M_i be a server that achieves the minimum min_k T_k
        Assign component j to server M_i
        A(i) = A(i) ∪ {j}
        T_i = T_i + t_j
    END FOR

The GBalance algorithm achieves a constant error bound of 2, with a run time of O(nm). The mathematical proof of the error bound and run time can be found in Kleinberg and Tardos [3].

3.3 Algorithm SBalance

An improvement over the previous algorithm is made through a simple sorting routine. The improved algorithm, called SBalance, still runs in polynomial time and achieves a better error bound of 3/2 of the optimal solution. The algorithm first sorts the list of web components in decreasing order of load, then goes through the sorted list, assigning each web component to the currently least loaded server. The algorithm is described below.

Algorithm SBalance (input: component list, number of servers; output: servers with new assignment)

    Start with no web components assigned: T_i = 0, A(i) = empty for all servers

    Sort all components in descending order of load, so that t_1 ≥ t_2 ≥ ... ≥ t_n
    FOR j = 1, ..., n
        Let M_i be a server that achieves the minimum min_k T_k
        Assign component j to server M_i
        A(i) = A(i) ∪ {j}
        T_i = T_i + t_j
    END FOR

The SBalance algorithm achieves a constant ratio of 3/2, meaning that the load on the heaviest loaded server is at most 150% of the optimal value. The algorithm's run time is dominated by the sorting procedure, as sorting takes time of higher order than the balancing procedure. A merge sort or quicksort algorithm gives SBalance a running time of O(n log n). The mathematical proof of the error bound and run time can be found in Kleinberg and Tardos [3].

3.4 Implementations

We would like to implement the algorithms in such a way that, given the size and (estimated) hits of each web component, we obtain an approximate optimal assignment distributing these web components as evenly as possible across our web servers. We used an object-oriented approach to model web components, servers and the balancers, and chose the Java programming language [9] to implement both algorithms.

Appendix C.1 shows the data structure we used to model web components. A component is modeled using a single double value, representing the load that it will contribute to a server. Appendix C.2 shows the construct of servers and the operations they can perform in our load balancing process. Appendix C.3 is the heart of the GBalance algorithm: it is the balancer and does the balancing process.
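For illustration, the greedy rule at the core of GBalance can be sketched as follows. This is a simplified stand-in for the Appendix C.3 code, with class and method names of our own choosing; it tracks only server loads (in a min-heap) rather than full assignments.

```java
import java.util.PriorityQueue;

// Sketch of the GBalance greedy rule: each component is assigned to the
// currently least loaded server, kept at the head of a min-heap.
public class GBalanceSketch {

    // Greedily assigns the given component loads to m servers and
    // returns the resulting makespan (load on the heaviest server).
    static double balance(double[] componentLoads, int m) {
        PriorityQueue<Double> serverLoads = new PriorityQueue<>();
        for (int i = 0; i < m; i++) serverLoads.add(0.0);
        for (double t : componentLoads) {
            double least = serverLoads.poll(); // least loaded server
            serverLoads.add(least + t);        // assign the component to it
        }
        double makespan = 0.0;
        for (double load : serverLoads) makespan = Math.max(makespan, load);
        return makespan;
    }

    public static void main(String[] args) {
        // Hypothetical component loads; SBalance would first sort these
        // in descending order before running the same loop.
        double[] loads = {7, 5, 4, 3, 2, 2, 1};
        System.out.println(balance(loads, 3)); // prints 9.0 (optimal is 8.0)
    }
}
```

Note that with a priority queue the assignment loop runs in O(n log m) time; the O(nm) bound quoted above corresponds to a linear scan for the least loaded server.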

Appendix C.4 is the heart of the SBalance algorithm, with sorting and balancing. The sorting algorithm we used is the modified merge sort implemented by Java [8]. Appendix C.8 is our random test case generator.

The inputs of both GBalance and SBalance are m, the number of servers to balance the load on, and S, the name of the input data file. The web components are represented by the data type double, and are line-separated in the data file. The output of the programs is a printed list of the web components assigned to each server. An ArrayList is used to store the web components read in from the data file. The web components in the data file are first read and inserted into the ArrayList. For the SBalance algorithm, we then sort the web components using a modified merge sort. Each web component in the ArrayList is then distributed sequentially to the currently least loaded server.

3.5 Test Methodology

We would like to find out how well the algorithms perform in practical situations. Another aim is to see how variables such as the number of servers and the average number of web components on each server affect the performance of the algorithms. We use the makespan to measure the performance of the algorithms; the smaller the makespan, the better the result.

Ideally, we would like to have a large volume of real-life data with known optimal values against which to compare the results of our implementations. Due to various limitations this was not practical, so we decided to construct our own test cases. Clearly, if we want to test the accuracy of the algorithms, we first need to know the optimal values of the test cases we generate. This is difficult due to the NP-hardness of the problems we are trying to solve. Our first attempt was to enumerate all possible cases of assigning n web components to m servers, and pick the assignment in which the heaviest loaded server has the least load.
We started by assigning one web component to each server at random; in any optimal assignment each server must have at least one web component, given n ≥ m. The remaining web components we partition into m parts. This is done by numbering the remaining n − m components in sequence and using a partition function to partition the integers 1 to n − m into m parts. We then look up the corresponding component for each integer and replace the integer with the component. For each partition assignment we record the maximum load on any server. We then attempted to identify the partition with the least maximum load, and thus the optimal value. This attempt was unsuccessful due

to the extremely long computation time owing to the sheer number of partitions; the program simply crashed.

Our second attempt was to first fix an optimal value and work backwards, loading every server with web components whose loads sum to this value. We then scramble the arrangement and let our programs assign the web components. Doing so guarantees that we know the optimal solution: since every server carries exactly the chosen load, the total load is m times that value, and no assignment can achieve a smaller makespan. To work backwards and decide which web components are on each server, we generate a series of random numbers between 0 and the optimal load and take the differences between successive numbers. For instance, if the optimal load is 10 and the random numbers representing slices are 2, 7, 3, 4, we first add the numbers 0 and 10 and sort the series in descending order, giving 10, 7, 4, 3, 2, 0. We then take the differences of successive numbers to obtain the sizes of the web components: 10 - 7 = 3, 7 - 4 = 3, 4 - 3 = 1, 3 - 2 = 1 and 2 - 0 = 2. The sizes of the web components assigned to that server are then 3, 3, 1, 1, 2. Finally we remove all web components from all servers and write them to the data file in arbitrary order. The test file is then read by our programs, and the load on the heaviest loaded server is recorded and compared with the optimal solution we pre-generated. We present the results obtained in the following section.

3.6 Test Result

We first introduce the parameters used to obtain the results. For each test we pre-generate O, the optimal value; for simplicity we set it to 100 for all test cases. m is the number of servers, and n_cap is the maximum number of web components a server can have during the generation of the optimal result.
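The slicing step of the backwards generator described in Section 3.5 can be sketched as follows (a simplified illustration with names of our own choosing, not the Appendix C.8 code). Sorting ascending rather than descending yields the same multiset of sizes as the worked example in the text.

```java
import java.util.Arrays;

// Sketch of the backwards test-case generator: random "cut points" between
// 0 and the chosen optimal load are sorted, and successive differences give
// component sizes for one server that sum exactly to the optimal load.
public class SliceGenerator {

    static double[] slice(double optimalLoad, double[] cuts) {
        double[] points = new double[cuts.length + 2];
        points[0] = 0.0;           // add the endpoints 0 and O
        points[1] = optimalLoad;
        System.arraycopy(cuts, 0, points, 2, cuts.length);
        Arrays.sort(points);
        double[] sizes = new double[points.length - 1];
        for (int i = 0; i < sizes.length; i++)
            sizes[i] = points[i + 1] - points[i]; // successive differences
        return sizes;
    }

    public static void main(String[] args) {
        // The worked example from the text: O = 10, slices 2, 7, 3, 4.
        double[] sizes = slice(10.0, new double[]{2, 7, 3, 4});
        System.out.println(Arrays.toString(sizes)); // [2.0, 1.0, 1.0, 3.0, 3.0]
    }
}
```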
The reason we introduced n_cap is that we wish to deliberately introduce some large web components onto servers to test the correctness of the algorithms, and to examine how they perform when loads are not well balanced. n_cap essentially controls how close the web component sizes are: when n_cap is small, the chance of having web components with large loads is high, and vice versa. For each test run, we run 200 tests using the same O, m and n_cap. We record T, the approximate solution generated by GBalance and SBalance. We then compute A_R, the average load on the heaviest loaded server, and M_R, the maximum load on the heaviest loaded server, obtained using the same m and n_cap over the 200 tests. Lastly we compute the average error ratio A_E and the maximum error ratio M_E, using separate sets of tests. We start by setting m to 4 and n_cap to 4, then increase m and n_cap and repeat the experiment. The results are presented in

the tables in Appendix B.1.

[Figure 3.1: Average Output Obtained Using m = 64]

Figure 3.1 is a plot of the makespan generated by these two algorithms against log n_cap, obtained using m = 64.

3.7 Analysis and Comparison

Our implementations of both algorithms achieved the stated ratios on all our test cases: the error bound of 2 is valid for GBalance, and the error bound of 3/2 is valid for SBalance.

SBalance takes longer to run in practice; however, even with the largest data set in our test cases (256 servers, 256 slices), the practical run-time difference between the two algorithms is not very significant, as SBalance on average takes only about 1 to 2 seconds more to execute. Theoretically, GBalance could produce better results in certain cases, but such situations seem rare, as we did not observe this behavior in any test. In our test cases, the results produced by SBalance are consistently superior to those produced by GBalance.

Initially we thought that as the number of slices goes up, the web components would become smaller and more even in size, helping GBalance to obtain better results. However, the analyzed results show this is not the case. Neither the number of slices nor the number of servers seems to affect the performance of SBalance significantly: SBalance consistently produced near-perfect results, while GBalance's performance is crippled by the presence of any large web component. Our results showed that as n_cap or m goes up, the total number of web components increases, and with it the chance of generating a large web component. Since the makespan is a single measurement of the load on the heaviest loaded server, GBalance's performance deteriorates as m and n_cap increase; Figure 3.1 illustrates this claim. The results of both algorithms tend to be better when m << n_cap.

CHAPTER 4
Load Rebalancing Problem

4.1 The Load Rebalancing Problem

Although the greedy algorithms introduced in the previous chapter help us balance static loads on multiple servers, the assumptions we made are too simple for realistic situations. There are several important issues not addressed in the previous model, and we would like to discuss two of them.

First of all, the loads on servers are generally dynamic rather than static. There are several ways in which a load might change: the size of a web component might change over time (for example, a dynamic page in a web forum is likely to grow); the component might get more or fewer hits over time; and a component could be created or destroyed, for instance when a user uploads or deletes a file stored on a server. As the loads on servers change over time, the loads become unbalanced again, some servers might become overloaded, and we are faced again with the problem of not utilizing resources efficiently.

Secondly, moving web components across servers can be an expensive procedure. For example, moving an HTML page to another server may require downtime for server maintenance, changing the HTML links on other pages that reference it, or changing the address of the component in a mapping.

The Load Rebalancing Problem was introduced by S. Keshav of Ensim Corporation [5] precisely because corporations face the problem of server loads changing over time. We borrow the concept of makespan to help us visualize the problem; the makespan again is the load on the heaviest loaded server. The problem we try to solve is the following: given the loads on different servers, what is the minimal makespan that can be achieved with at most k moves? A more formal definition is the following.
Given an assignment of the n web components to m servers, and a positive integer k, relocate no more than k web components so as to minimize the maximum load on any server.

Again, we define the load of a web component to be its size multiplied by its number of hits, and the load of a server to be the total load of the web components it hosts. If we can approximate this problem, then we can give a near-optimal move strategy to redistribute load onto servers more evenly, achieving a faster overall delivery time. Two algorithms are implemented for this purpose, namely GRebalance and SRebalance. Both algorithms run in polynomial time; GRebalance achieves a guaranteed ratio of 2 - 1/m, while SRebalance achieves a better guaranteed ratio of 3/2.

4.2 Algorithm GRebalance

This simple algorithm is very similar to the SBalance algorithm. It is a simple variant of Graham's greedy heuristic [7] and yields a 2-approximation, as described by Shmoys and Tardos [6].

Algorithm GRebalance (input: k - number of moves, P - list of servers with loads; output: BP - list of servers with loads)

    FOR n in 1 to k
        Remove the largest single web component from the currently most loaded server (Step 1)
    END FOR
    FOR n in 1 to k
        Place one of the k removed web components onto the currently least loaded server (Step 2)
    END FOR
    Output BP

Sorting the web components in decreasing order of load takes O(n log n) time, and reinserting the removed web components takes O(k log m) time. Since we are interested in the non-trivial case in which m ≤ n, we have a total running time of O(n log n). The mathematical proof of the error bound and run time can be found in Aggarwal et al. [4].

4.3 Algorithm SRebalance

This algorithm was originally presented by Aggarwal et al. [4], and takes a more complicated approach than the GRebalance algorithm. To formalize

it better, we begin by presenting some definitions used by its original authors.

Definition 1. Web components of size strictly greater than (1/2) OPT are called large; the rest are called small. Let L_t denote the total number of large web components, and let m_l be the number of servers with at least one large web component; then L_e = L_t - m_l denotes the number of extra large web components on this set of servers. A server is large-free if it currently has no large web component assigned to it.

We first look at an algorithm, called Partition, that does the rebalancing given the optimal value. It has an error bound of 1.5 and makes at most the same number of moves as the optimal algorithm. At this stage it does not enforce the restriction on the number of moves. Later we describe a method to do away with having to input an optimal value, and to enforce the restriction on the number of moves.

Algorithm Partition

1. From each of the m_l servers which has a large web component, remove all large web components except the smallest large web component therein.

2. Calculate for each server i, with respect to its current configuration:
   a_i: the minimum number of small web components to be removed so that the total size of the remaining small web components is at most (1/2) OPT
   b_i: the minimum number of web components (including any large web components) to be removed so that the total size of the remaining web components is at most OPT
   c_i = a_i - b_i

3. Select the L_t servers with the smallest values of c_i, breaking ties by giving preference to servers containing large web components. Remove the a_i small web components from the selected servers, thereby ensuring that the total size of the remaining small web components on these servers is at most (1/2) OPT.

4. From the remaining m - L_t servers, remove the b_i web components.
Any large web components removed in this step need to be reassigned: assign each of them (arbitrarily) to distinct large-free servers created in step 3.

5. Arbitrarily assign the large web components removed in step 1 to the remaining large-free servers.

6. Assign the small web components removed in steps 3 and 4 one by one to the currently least loaded server.

To do away with the optimal value, one key observation is that only when OPT crosses certain threshold values does it affect L_t, a_i or b_i. For instance, only when the value of (1/2) OPT crosses some web component's load p_j does the value of L_t change. Similarly, we can obtain the threshold values for a_i and b_i. The set of thresholds of a_i and b_i over all servers, combined with 2 p_j for all web components, gives the threshold values of OPT. Given that, it is sufficient to implement SRebalance as follows.

LEMMA 1. Enumerate all threshold values with respect to L_t, a_i and b_i for each server i in increasing order; then L_t, a_i and b_i remain unchanged as OPT varies between two consecutive threshold values.

Algorithm SRebalance

1. Use the average load as the starting guess for OPT.

2. Calculate the corresponding L_t, L_e, a_i, b_i, c_i values using Partition. Let k_b be the total number of moves needed by this algorithm.

3. While k_b > k do
   Increase the guessed value of OPT to the next threshold value
   Recalculate the corresponding values of L_t, L_e, a_i, b_i, c_i
   End while

4. Return the result produced by the last execution of Partition.

The error bound for this algorithm is 3/2, and the run time is O(n log n). The mathematical proof of the error bound and run time can be found in Aggarwal et al. [4].

4.4 Implementations

We again program in the Java programming language [9], as we would like to model web components and servers using an object-oriented approach. We would like to implement the algorithms in such a way that, given the list of servers with the current size and hits of each web component it is hosting, we would

obtain an approximately optimal solution that redistributes these web components as evenly as possible across our web servers, within the number of moves we allow. Appendix C.1 shows the data structures we used to model web components. Appendix C.5 shows the construction of servers and the operations they can perform in our load balancing process. Appendix C.6 is the heart of the GRebalance algorithm: it is the balancer and carries out the balancing process. Appendix C.7 describes the Partition algorithm, and Listing 2.5 is the SRebalance algorithm. Appendix C.8 is our random test case generator. The input for both algorithms is again m, the number of servers, and S, the name of the data file containing the web components. We use the data type double to represent all our web components. Each line in the data file describes the current load on one particular server; the loads are separated with the string "+". We used Java's own implementation of a modified merge sort [8] to do all sorting needed by the algorithms. For the SRebalance algorithm, we used two ArrayLists to hold the different types of web components, namely large components and small components, instead of one ArrayList and a reference to the first small web component.

4.5 Test Methodology

We would like to test the theoretical error bounds of these algorithms against our data, to see whether the error bounds are indeed valid. We would also like to find out how well the algorithms perform in practical situations. We are also interested in how variables such as the number of moves permitted and the average number of web components on each server affect the performance of the algorithms. Again, we use makespan to measure the performance of the algorithms; the smaller the makespan, the better the result. As the Load Rebalancing Problem is NP-hard as well, we run into a similar situation as before: to test the accuracy of the algorithms we would first need to know the optimal value.
Our first attempt was to enumerate all possible ways of moving k web components among m servers: we generate every way of picking k web components and assigning them to the m servers. We then record the maximum load on any server for each assignment and pick

the assignment in which the heaviest loaded server has the least load; this least maximum load is the optimal value. This attempt was not successful, as the time required to enumerate all ways of moving and reassigning k web components becomes impractical once n and m get large.

Our second attempt was to use a similar strategy to the one we used for the Load Balancing Problem. We first decide the optimal solution and the number of moves permitted, k. To guarantee the generation of an optimal solution, we give every server the same total load, i.e., the optimal load. We then work backwards to figure out which web components are on each server. This is again achieved by generating a series of random numbers from 0 to the optimal load and taking the differences between successive numbers (in sorted order). Lastly, we perform the moving step k times. In each step, we first pick a random server as the source server and another as the target server; we then pick a random web component on the source server and move it to the target server. This process is reversible, so the pre-generated optimal solution can be reached again in no more than k steps. The test file is then read in by our programs, and the loads on the heaviest loaded servers are recorded and compared with the optimal solution that we pre-generated. We present the results obtained in the following section.

4.6 Test Result

We would first like to introduce the parameters used to obtain the results. For each test, we pre-generate O, the optimal value; we set it to 100 for all test cases for simplicity. m is the number of servers, which we keep constant at 128, again for simplicity. n_cap is the maximum number of web components a server can have during the generation of the optimal result; n_cap again gives us some control over how close the web component sizes are.
For instance, when n_cap is small, the chance of having web components with large loads is high, and vice versa. Finally, we decide k, the maximum number of moves permitted during the rebalancing process. For each test run, we run 200 tests using the same O, m, n_cap and k. We record T, the approximated solution generated by GRebalance and SRebalance. We then compute A_R, the average load on the heaviest loaded server, and M_R, the maximum load on the heaviest loaded server, obtained using the same m and n_cap over the 200 tests. Lastly, we compute the average error ratio A_E and the maximum error ratio M_E, using separate sets of tests. We start our tests by setting n_cap and k to 4. We then increase k and n_cap and repeat the experiment. The results are presented in the tables in

Appendix B.2.

Figure 4.1: Average Output Obtained Using m = 128 and n_cap = 64

Figure 2 is a plot of the makespan generated using these two algorithms against log n_cap, obtained using m = 128 and n_cap = 64.

4.7 Analysis and Comparison

The first thing we noticed is that we experienced machine-precision issues while running the SRebalance algorithm. For instance, if O is 100, we would sometimes get an average load fractionally below 100 as our initial guess. As a result, all servers with a load of exactly 100 suffer a miscalculation of a_i, b_i and c_i. The

number of moves required would usually be much greater than if we used the correct value, 100, as the initial guess. This is mainly because servers with web components summing to exactly 100 suffer a miscalculation of b_i: the program then believes there are excess web components on these servers to be removed. For example, servers with loads summing to 100 will have a b_i value of at least 1. However, the machine-precision issue does not seem to affect the result significantly; in all our cases, the next threshold value resulted in an approximated solution within the stated error bound.

We also observed that all the stated bounds are valid for our implementations. In particular, GRebalance achieved an error bound of 2 - 1/m, while SRebalance achieved a ratio of 3/2.

SRebalance takes longer to run in practice; however, even with the largest data set in our test cases (256 moves, 256 slices), the run-time difference between the two algorithms is not significant. On average SRebalance takes only about one second more to execute.

In our test cases, the results produced by SRebalance are generally better than those produced by GRebalance. However, this does not hold all the time: for a number of individual cases we noticed that GRebalance produces better results. The result of GRebalance is also affected by the number of moves. Interestingly, we discovered that this algorithm sometimes produces better results when it makes fewer than the permitted number of moves. For instance, if a test case is generated by making 64 moves, the result generated by GRebalance is sometimes better if we only allow it to make 16 moves. This is easier to understand if we consider the extreme case where we allow it to make as many moves as the total number of web components: it would then remove all the web components from the servers and perform exactly the same operation as GBalance.
This operation is not ideal, as any large web component inserted near the end would significantly worsen the approximated result.

We also discovered that the performance of SRebalance is greatly affected by its bottleneck case: a server with one large web component of size close to OPT and several small web components whose sizes sum to close to (1/2)OPT. In this case, SRebalance will not attempt to move any web components from this server, thus leaving an error ratio close to 3/2. In fact, this behavior accounts for the vast majority of cases where the program produces approximations close to the theoretical bound. It appears to take place more often as k gets larger, as illustrated in Figure 2.

We observed that these two algorithms have apparent weaknesses. The algorithms seem more interested in keeping the approximated result within a constant error bound than in balancing the load as well as they could. In particular, GRebalance can produce an approximation worse than the configuration it is given to start with, and its results tend to worsen when it uses all the allowed moves; SRebalance does not utilize all the moves it is given and is only interested in servers with a load greater than the guessed OPT.

4.8 Proposed Improvements

Having learned the limitations and weaknesses of GRebalance and SRebalance, we constructed two improved versions of the original algorithms.

4.8.1 Proposed Improvement for GRebalance

There are two changes that can be implemented to improve the performance of GRebalance. We first noted that after the removal of k web components from the currently most loaded server, the problem essentially becomes the Load Balancing Problem. More specifically, since we can no longer move the remaining web components, we can treat the sum of the web components on any server as a single web component. In the original GRebalance algorithm, the k removed jobs are assigned to the currently least loaded server in the sequence in which they were removed. That implies that if a large web component is removed last, it will be assigned last. Inserting a large job last is likely to worsen the performance, as we discovered earlier for the GBalance algorithm. The first improvement we propose is therefore to sort the k removed jobs into descending order of size, and assign them to the currently least loaded server in that sequence.

During the testing phase, we accidentally discovered that GRebalance sometimes produced better results when it made fewer than the k permitted moves. In other words, its performance worsens beyond a certain number of moves y. This is easier to visualize if we start from a perfectly balanced set of servers and ask GRebalance to make k moves: the result returned is very unlikely to be what we started with, i.e. the optimal case.
Knowing that, the other improvement we propose is to keep track of the makespan while increasing the number of moves permitted, until the limit of k moves is reached, and to return the configuration with the least makespan. The original GRebalance always makes k moves, while the improved version is likely to make fewer than k moves. This does not contradict the objective of the problem, however, since we are trying to rebalance web components with at most k moves.
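The benefit of reinserting removed components largest-first can be seen on a tiny made-up example: with servers at loads 0.9 and 1.0, greedily reinserting removed components in the order [0.2, 1.0] gives makespan 2.0, while reinserting them in descending order gives about 1.9. The sketch below uses plain arrays and illustrative names, not the Appendix C classes:

```java
import java.util.Arrays;

// Illustrates the first proposed improvement: greedily reinserting removed
// components in descending order of size can give a smaller makespan than
// reinserting them in removal order. All numbers here are made up.
public class ReinsertDemo {
    // Place each job on the currently least-loaded server (greedy) and
    // return the resulting makespan.
    static double greedyMakespan(double[] serverLoads, double[] jobs) {
        double[] loads = Arrays.copyOf(serverLoads, serverLoads.length);
        for (double job : jobs) {
            int min = 0;
            for (int i = 1; i < loads.length; i++)
                if (loads[i] < loads[min]) min = i;
            loads[min] += job;
        }
        double max = 0;
        for (double l : loads) max = Math.max(max, l);
        return max;
    }

    public static void main(String[] args) {
        double[] servers = {0.9, 1.0};
        double[] removalOrder = {0.2, 1.0};  // order the components were removed in
        double[] descending = {1.0, 0.2};    // same components, sorted largest-first
        System.out.println(greedyMakespan(servers, removalOrder)); // makespan 2.0
        System.out.println(greedyMakespan(servers, descending));   // makespan ~1.9
    }
}
```

Reinserting the 0.2 component first commits the lightest server too early, so the 1.0 component lands on a server already holding 1.0; largest-first avoids this, mirroring the SBalance-over-GBalance effect described earlier.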

Algorithm GRebalance2 (input: k, the number of moves; P, the list of servers with loads; output: BP, the list of servers with loads)
  Calculate the current makespan
  Lspan = current makespan
  For i in 1 to k
    For n in 1 to i
      Remove the largest single web component from the currently most loaded server (Step 1)
    End For
    Sort the removed i web components into descending order of size
    For n in 1 to i
      Place the next removed web component onto the currently minimum-loaded server (Step 2)
    End For
    If current makespan < Lspan, Lspan = current makespan
  End For

Clearly this algorithm cannot produce a worse result than its input or than the original GRebalance algorithm; the only drawback is that the run time increases from O(n log n) to O(nk log n). However, this is still polynomial.

4.8.2 Proposed Improvement for SRebalance

There are two improvements we would like to propose for the SRebalance algorithm. Firstly, we noticed that in the Partition algorithm, Steps 4, 5 and 6 involve the reassignment of jobs. In Steps 4 and 5, large jobs are reassigned to the remaining large-free servers in an arbitrary manner; in Step 6, small jobs are reassigned to the currently least loaded server. Again, a simple sorting procedure on the large jobs before reassigning them seems beneficial, as the largest jobs would be taken care of first; this does not affect the Step 6 reassignment of small jobs, since every small job is smaller than any large job. This is easiest to illustrate with an example with no small jobs to be reassigned: at this step the situation is then similar to a load balancing problem. As illustrated

earlier, SBalance produced better results because it takes care of the large web components first. We could also sort the small jobs in Step 6 before reassigning them, for the same reason.

The second improvement comes from the observation that not all k moves are utilized; this under-utilization can be taken advantage of. Let b denote the number of unused moves. Since we have already developed GRebalance2, which is guaranteed to produce a result no worse than its input, we propose running GRebalance2 on the configuration produced by our improved algorithm, with b permitted moves. For instance, suppose we have two servers, with loads 1 and 0.5 on the first server and load 0.5 on the second, and we allow 2 moves. The original SRebalance program would not make any actual movement and would finish with an approximated makespan of 1.5 and 2 unused moves; utilizing these 2 moves with GRebalance2 would give the optimal solution of makespan 1.

Algorithm Partition2
1. From each of the m_L servers which has a large web component, remove all large web components except the smallest large web component therein.
2. Calculate for each server i the following values with respect to its current configuration:
   a_i: the minimum number of small web components to be removed so that the total size of the remaining small web components is at most (1/2)OPT
   b_i: the minimum number of web components (including any large web components) to be removed so that the total size of the remaining web components (including any large web components) is at most OPT
   c_i = a_i - b_i
3. Select the L_T servers with the smallest values of c_i, breaking ties by giving preference to servers containing large web components. Remove the a_i small web components from the selected servers, thereby ensuring that the total size of the remaining small web components on these servers is at most (1/2)OPT.
4.
From each of the remaining m - L_T servers, remove the b_i web components. Large web components, if any, need to be reassigned; move these large web components.

5. Sort the large jobs removed in steps 1 and 4 in decreasing order of size, and assign them to the remaining large-free servers with the least load.
6. For the small web components removed in steps 3 and 4, sort them in decreasing order of size and assign them one by one to the current minimum-load server.

Algorithm SRebalance2
1. Use the average load as the starting guess for OPT.
2. Calculate the corresponding L_T, L_E, a_i, b_i, c_i values using Partition2. Let k_b be the total number of moves needed by this algorithm.
3. While k_b > k do
     Increase the guessed value of OPT to the next threshold value
     Recalculate the corresponding values of L_T, L_E, a_i, b_i, c_i
   End While
4. Return the result produced by the last execution of Partition2.
5. Run GRebalance2(current configuration, k - k_b).

Clearly, combining these two algorithms in sequence does not increase the order of the run time, so the run time remains polynomial.
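For concreteness, the two proposed improvements to GRebalance combine into the following Java sketch of GRebalance2. This is not the Appendix C implementation: servers are modeled simply as lists of component loads, and all names are illustrative. On the two-server example given earlier (loads 1 and 0.5 on the first server, 0.5 on the second, k = 2) it returns the optimal makespan of 1.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Sketch of GRebalance2: for every move budget i = 1..k, remove the largest
// component from the currently heaviest server i times, reinsert the removed
// components in descending order onto the lightest server, and keep the
// configuration with the smallest makespan seen over all budgets.
public class GRebalance2 {
    static double load(List<Double> server) {
        double total = 0;
        for (double x : server) total += x;
        return total;
    }

    static double makespan(List<List<Double>> servers) {
        double max = 0;
        for (List<Double> s : servers) max = Math.max(max, load(s));
        return max;
    }

    static List<List<Double>> copy(List<List<Double>> servers) {
        List<List<Double>> c = new ArrayList<>();
        for (List<Double> s : servers) c.add(new ArrayList<>(s));
        return c;
    }

    // Index of the heaviest (or lightest) server.
    static int extreme(List<List<Double>> servers, boolean heaviest) {
        int best = 0;
        for (int i = 1; i < servers.size(); i++) {
            double a = load(servers.get(i)), b = load(servers.get(best));
            if (heaviest ? a > b : a < b) best = i;
        }
        return best;
    }

    public static List<List<Double>> rebalance(List<List<Double>> input, int k) {
        List<List<Double>> best = copy(input);
        double bestSpan = makespan(input);
        for (int i = 1; i <= k; i++) {
            List<List<Double>> work = copy(input);
            List<Double> removed = new ArrayList<>();
            for (int n = 0; n < i; n++) {                // Step 1: strip i components
                List<Double> src = work.get(extreme(work, true));
                if (src.isEmpty()) break;
                removed.add(src.remove(src.indexOf(Collections.max(src))));
            }
            removed.sort(Comparator.reverseOrder());     // improvement: biggest first
            for (double job : removed)                   // Step 2: greedy reinsertion
                work.get(extreme(work, false)).add(job);
            if (makespan(work) < bestSpan) {
                bestSpan = makespan(work);
                best = work;
            }
        }
        return best;                                     // never worse than the input
    }
}
```

Restarting from the input configuration for each budget i keeps the sketch simple at the cost of recomputing from scratch; the best-makespan bookkeeping mirrors the Lspan variable in the pseudocode.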

CHAPTER 5

Conclusion

Even though the Load Balancing and Load Rebalancing Problems are NP-hard, approximation algorithms can be used effectively to address them. By using very simple models and developing a simple implementation of the algorithms, we gained a better understanding of the algorithms and their effectiveness in performing balancing and rebalancing tasks.

Our test results show that all implementations of the algorithms achieve an error ratio within the stated bound. More specifically, GBalance and GRebalance produce approximations within a constant factor of 2 of the optimal value, while SBalance and SRebalance produce approximations within a constant factor of 3/2. The four algorithms have polynomial run time and produce results in a realistic amount of time: GBalance runs in O(mn) time, while SBalance, GRebalance and SRebalance run in O(n log n) time. With these results, we conclude that these algorithms are efficient and helpful for obtaining close approximations to the load balancing and rebalancing problems in a web-based environment.

Our tests also give us some idea of how well the algorithms perform in real-life situations. More specifically, we found that for the Load Balancing Problem, SBalance consistently achieved better results than GBalance in all our test cases, regardless of the number of web components and the number of servers. For the Load Rebalancing Problem, SRebalance achieved better results than GRebalance in most cases. Although SBalance and SRebalance take longer to compute than GBalance and GRebalance respectively, the clear advantage of producing better results outweighs the slight computational disadvantage, if we assume an additional computation time of a few minutes to be of negligible cost. We therefore conclude that SBalance and SRebalance are the better greedy algorithms.
Since, in practice, hosting companies or companies hosting large web sites have a good idea of the physical limitations of their servers, they can run these algorithms to understand whether the current number of servers is sufficient

based on the approximated makespan generated. If the approximated makespan is greater than the physical limitation of the servers, more servers are needed to ensure that no server is overloaded.

Some of these algorithms have weaknesses that we were able to identify. Since in practical situations we are interested in making the load as balanced as possible across servers, we proposed possible improvements to the original GRebalance and SRebalance algorithms. Although these are not implemented and empirically tested, we have explained why we believe such improvements would produce equal or better results in practice while still running in polynomial time.

Lastly, there are many limitations to the model and testing methodology we used. The models we developed are not sophisticated enough to capture real-life situations. For instance, for the Load Rebalancing Problem, re-hosting different web components may incur different costs; as such, a cost function should be associated with each web component. Aggarwal et al. [4] discuss the modifications needed for this in great depth in their paper. Ideally we would like a large set of data on real web servers, together with the optimal arrangements, to test our algorithms against; due to the lack of such resources, we had to construct our own test cases. There are two significant problems with that. First, the data generated might not be very realistic. Second, although generating every test case from an optimal solution guarantees a reversible procedure to reach that solution, it is very limiting for testing the effectiveness of the algorithms, as such cases are extremely rare in real life.

APPENDIX A

Original Research Proposal

An Analysis of Methods Used to Approximate the Computationally Intractable Problems

Introduction:

NP-complete problems are well known in the field of Computer Science; the study of NP-completeness dates back to the early 1970s. No polynomial-time algorithm is currently known for these problems. Despite tremendous effort, to this day no one has been able to prove or disprove that NP-complete problems can be solved in polynomial time. This suggests that solving these problems exactly takes exponential time, becoming computationally intractable when the problem instances get large. On the other hand, NP-complete problems have tremendous economic and academic value, and they appear everywhere. For instance, a freight company might try to dispatch goods by airplane to a set of locations while minimizing the total distance traveled, so as to reduce fuel cost; a school might want to schedule exams with a minimal number of conflicts, so as to reduce the human cost of rescheduling exams; a map-making company might wish to color a map with a minimal number of colors so that no neighbors share the same color; the military might want to establish a minimum number of patrol points so that every road in the country is connected to at least one patrol point. The list goes on.

Research Aim:

Even though reducing such an intractable problem to a polynomial-time-solvable one is not feasible at this point in time, many approximation algorithms that generate solutions within a well-defined range of the optimum have been introduced over the years. Although it is known that NP-complete problems can be reduced to one another, the effectiveness and time complexity required by these

approximation algorithms to solve different NP-complete problems vary significantly. For instance, an approximation algorithm for the Knapsack problem generally takes less time than an approximation algorithm for the map-coloring problem. This makes approximately solving some of the NP-complete problems feasible, while leaving others to be explored. This paper aims to examine, compare and contrast the effectiveness and time complexity of these approximation algorithms; in other words, it aims to examine which algorithms are feasible for approximating solutions of problems of a known size within a given error range.

Research Methods and Time Line:

The research will mainly be done through the gathering and reading of previous work in this field by other professionals, from journals, articles or published papers. The tasks are outlined below; each task has an associated time frame.

Examination of the motivation for solving NP-complete problems: materials will be gathered from the UWA library, ACM (Association for Computing Machinery) journals and other resources on the Internet. Time required: 2 weeks.

Gathering articles, journals and published papers: materials used in this research paper will be carefully selected within a time frame of 3 weeks. From ACM: 1 week. Internet: 1 week. University of Western Australia libraries: 1 week.

Reading of journals and papers: this time is allocated for a general reading of all materials gathered; at this point, irrelevant papers will be discarded from the list. Time required: 2 weeks.

Selection of problems to examine: there are many NP-complete problems; due to technical and time limitations, about 4-5 approximation algorithms will be selected. Time required: 3 weeks.

In-depth reading of the materials gathered: during this period the selected articles will be examined in depth.
Time required: 4 weeks.

Experiments/finding appropriate articles on time complexity: to test the actual time complexity, some experiments might be needed; if so, the MATLAB programming environment will be used. This period is mainly for obtaining results and comparing them. Time required: 4 weeks.

Final write-up: 2 weeks.

Estimated total time needed: 20 weeks.

Hardware/Software required: no special software or hardware is needed beyond access to the UWA library, computer labs and ACM resources, and the MATLAB programming environment.


APPENDIX B

Approximated Makespan Generated for the Load Balancing and Load Rebalancing Problem

B.1 Result Generated Using GBalance and SBalance

Results Obtained using m = 4 and n_cap = 4
Algorithm A_R M_R A_E M_E
GBalance % 77.8%
SBalance % 10.5%

Results Obtained using m = 16 and n_cap = 4
Algorithm A_R M_R A_E M_E
GBalance % 72.9%
SBalance % 11.3%

Results Obtained using m = 64 and n_cap = 4
Algorithm A_R M_R A_E M_E
GBalance % 71.2%
SBalance % 8.9%

Results Obtained using m = 256 and n_cap = 4
Algorithm A_R M_R A_E M_E
GBalance % 69.7%
SBalance % 7.3%

Results Obtained using m = 4 and n_cap = 16
Algorithm A_R M_R A_E M_E
GBalance % 68.4%
SBalance % 5.8%

Results Obtained using m = 16 and n_cap = 16
Algorithm A_R M_R A_E M_E
GBalance % 84.3%
SBalance % 0.8%

Results Obtained using m = 64 and n_cap = 16
Algorithm A_R M_R A_E M_E
GBalance % 81.1%
SBalance % 0.5%

Results Obtained using m = 256 and n_cap = 16
Algorithm A_R M_R A_E M_E
GBalance % 82.7%

Results Obtained using m = 16 and n_cap = 64
Algorithm A_R M_R A_E M_E
GBalance % 88.1%
SBalance % 0.1%

Results Obtained using m = 64 and n_cap = 64
Algorithm A_R M_R A_E M_E
GBalance % 79.6%
SBalance % 0.1%

Results Obtained using m = 256 and n_cap = 64
Algorithm A_R M_R A_E M_E
GBalance % 87.3%
SBalance % 0.1%

Results Obtained using m = 4 and n_cap = 256
Algorithm A_R M_R A_E M_E
GBalance % 22.6%
SBalance % 0.2%

Results Obtained using m = 16 and n_cap = 256
Algorithm A_R M_R A_E M_E
GBalance % 51.1%
SBalance % 0.1%

Results Obtained using m = 64 and n_cap = 256
Algorithm A_R M_R A_E M_E
GBalance % 93.2%
SBalance % 0.1%

Results Obtained using m = 256 and n_cap = 256
Algorithm A_R M_R A_E M_E
GBalance % 97.6%
SBalance % 0.1%

B.2 Result Generated Using GRebalance and SRebalance

Results Obtained using k = 4 and n_cap = 4
Algorithm A_R M_R A_E M_E
GRebalance % 70.4%
SRebalance % 49.4%

Results Obtained using k = 4 and n_cap = 16
Algorithm A_R M_R A_E M_E
GRebalance % 60.2%
SRebalance % 32.2%

Results Obtained using k = 4 and n_cap = 64
Algorithm A_R M_R A_E M_E
GRebalance % 25.5%
SRebalance % 21.5%

Results Obtained using k = 4 and n_cap = 256
Algorithm A_R M_R A_E M_E
GRebalance % 20.9%
SRebalance % 10.0%

Results Obtained using k = 16 and n_cap = 4
Algorithm A_R M_R A_E M_E
GRebalance % 76.3%
SRebalance % 49.2%

Results Obtained using k = 16 and n_cap = 16
Algorithm A_R M_R A_E M_E
GRebalance % 73.8%
SRebalance % 44.7%

Results Obtained using k = 16 and n_cap = 64
Algorithm A_R M_R A_E M_E
GRebalance % 69.9%
SRebalance % 24.6%

Results Obtained using k = 16 and n_cap = 256
Algorithm A_R M_R A_E M_E
GRebalance % 21.4%
SRebalance % 9.0%

Results Obtained using k = 64 and n_cap = 4
Algorithm A_R M_R A_E M_E
GRebalance % 73.9%
SRebalance % 49.9%

Results Obtained using k = 64 and n_cap = 16
Algorithm A_R M_R A_E M_E
GRebalance % 71.6%
SRebalance % 45.7%

Results Obtained using k = 64 and n_cap = 64
Algorithm A_R M_R A_E M_E
GRebalance % 71.6%
SRebalance % 45.7%

Results Obtained using k = 64 and n_cap = 256
Algorithm A_R M_R A_E M_E
GRebalance % 93.1%
SRebalance % 129.0%

Results Obtained using k = 256 and n_cap = 4
Algorithm A_R M_R A_E M_E
GRebalance % 84.6%
SRebalance % 49.8%

Results Obtained using k = 256 and n_cap = 16
Algorithm A_R M_R A_E M_E
GRebalance % 72.8%
SRebalance % 49.2%

Results Obtained using k = 256 and n_cap = 64
Algorithm A_R M_R A_E M_E
GRebalance % 74.5%
SRebalance % 48.2%

Results Obtained using k = 256 and n_cap = 256
Algorithm A_R M_R A_E M_E
GRebalance % 88.96%
SRebalance % 41.7%

APPENDIX C

Java Code Used

C.1 Model of Web Component in Load Balancing and Load Rebalancing Problem

public class Gjob {
    double gjob;

    public Gjob(double jobsize) {
        gjob = jobsize;
    }

    public double thisjob() {
        return gjob;
    }
}

C.2 Model of Server in the Load Balancing Problem

import java.io.*;
import java.lang.*;
import java.util.*;

public class SimpleMachine {
    double TotalLoad;
    ArrayList JobList;

    public SimpleMachine() {
        this.TotalLoad = 0;
        JobList = new ArrayList();
    }

    public void Add(double job) {
        TotalLoad += job;
        JobList.add(new Gjob(job));
    }

    public double currentload() {
        return this.TotalLoad;
    }

    public void printjobs() {
        for (int i = 0; i < JobList.size(); i++) {
            System.out.println("Job" + i + "=" + ((Gjob) (JobList.get(i))).thisjob());
        }
        System.out.println("Total=" + TotalLoad);
    }
}

C.3 Implementation of GBalance Algorithm

import java.io.*;
import java.util.*;
import java.lang.*;

public class GBalance {

    // returns the index of the currently least-loaded machine
    public static int update(SimpleMachine[] macs) {
        int Mindex = -1;
        double Mload = 1.0 / 0.0;
        for (int j = 0; j < macs.length; j++) {
            if (macs[j].currentload() < Mload) {
                Mload = macs[j].currentload();
                Mindex = j;
            }
        }
        return Mindex;
    }

    public static void main(String args[]) {
        SimpleMachine[] machines;
        ArrayList Jobs = new ArrayList();
        int MLoadIndex = 0;
        machines = new SimpleMachine[Integer.valueOf(args[0]).intValue()];
        for (int k = 0; k < (Integer.valueOf(args[0])).intValue(); k++) {
            machines[k] = new SimpleMachine();
        }
        try {
            int n = 0;
            FileInputStream fstream = new FileInputStream(args[1]);
            DataInputStream in = new DataInputStream(fstream);
            while (in.available() != 0) {
                Jobs.add(new Gjob(Double.valueOf(in.readLine()).doubleValue()));
                n++;
            }
            in.close();
        } catch (Exception e) {
            System.err.println("File input error.." + args[1]);
        }
        for (int p = 0; p < Jobs.size(); p++) {
            machines[MLoadIndex].Add(((Gjob) (Jobs.get(p))).thisjob());
            MLoadIndex = update(machines);
        }
        int MMindex = -1;
        double MMload = -1;
        for (int m = 0; m < machines.length; m++) {

            if (machines[m].currentload() > MMload) {
                MMload = machines[m].currentload();
                MMindex = m;
            }
        }
        System.out.println("max load is " + machines[MMindex].currentload());
    }
}

C.4 Implementation of the SBalance Algorithm

import java.io.*;
import java.util.*;
import java.lang.*;

public class SBalance {

    // returns the index of the currently least-loaded machine
    public static int update(SimpleMachine[] macs) {
        int Mindex = -1;
        double Mload = 1.0 / 0.0;
        for (int j = 0; j < macs.length; j++) {
            if (macs[j].currentload() < Mload) {
                Mload = macs[j].currentload();
                Mindex = j;
            }
        }
        return Mindex;
    }

    public static void main(String args[]) {
        SimpleMachine[] machines;
        ArrayList Jobs = new ArrayList();

        int MLoadIndex = 0;
        machines = new SimpleMachine[Integer.valueOf(args[0]).intValue()];
        for (int k = 0; k < (Integer.valueOf(args[0])).intValue(); k++) {
            machines[k] = new SimpleMachine();
        }
        try {
            int n = 0;
            FileInputStream fstream = new FileInputStream(args[1]);
            DataInputStream in = new DataInputStream(fstream);
            while (in.available() != 0) {
                Jobs.add(new Gjob(Double.valueOf(in.readLine()).doubleValue()));
                n++;
            }
            in.close();
        } catch (Exception e) {
            System.err.println("File input error.." + args[1]);
        }
        Collections.sort(Jobs, new Comparator() {
            public int compare(Object a, Object b) {
                double k, t;
                k = ((Gjob) (a)).thisjob();
                t = ((Gjob) (b)).thisjob();
                if (k > t) {
                    return -1;
                } else if (k < t) {
                    return 1;
                } else {
                    return 0;
                }
            }

        });
        for (int p = 0; p < Jobs.size(); p++) {
            machines[MLoadIndex].Add(((Gjob) (Jobs.get(p))).thisjob());
            MLoadIndex = update(machines);
        }
        int MMindex = -1;
        double MMload = -1;
        for (int m = 0; m < machines.length; m++) {
            if (machines[m].currentload() > MMload) {
                MMload = machines[m].currentload();
                MMindex = m;
            }
        }
        System.out.println("max load is " + machines[MMindex].currentload());
    }
}

C.5 Model of Server in the Load Rebalancing Problem

import java.io.*;
import java.lang.*;
import java.util.*;

public class Processor {
    ArrayList SmallJobs;
    ArrayList LargeJobs;
    double sum;
    int Ai, Bi, Ci;

    public double firstlarge() {
        if (LargeJobs.size() > 0) {
            return ((Gjob) (LargeJobs.get(0))).thisjob();
        } else {
            return -1;
        }
    }

    public double[] gett() {
        double[] result = new double[LargeJobs.size()];
        for (int i = 0; i < LargeJobs.size(); i++) {
            result[i] = ((Gjob) (LargeJobs.get(i))).thisjob();
        }
        return result;
    }

    public static String[] explode(String s, String delimiter) {
        int delimiterlength;
        int stringlength = s.length();
        if (delimiter == null || (delimiterlength = delimiter.length()) == 0) {
            return new String[] {s};
        }
        int count = 0;
        int start = 0;
        int end;
        while ((end = s.indexOf(delimiter, start)) != -1) {
            count++;
            start = end + delimiterlength;
        }
        count++;
        String[] result = new String[count];
        count = 0;
        start = 0;
        while ((end = s.indexOf(delimiter, start)) != -1) {
            result[count] = s.substring(start, end);
            count++;
            start = end + delimiterlength;
        }
        end = stringlength;
        result[count] = s.substring(start, end);
        return result;
    }

    public void report() {
        System.out.println("List of large jobs:" + LargeJobs.size());
        for (int i = 0; i < LargeJobs.size(); i++) {
            System.out.println(((Gjob) (LargeJobs.get(i))).thisjob());
        }
        System.out.println("List of small jobs:" + SmallJobs.size());
        for (int i = 0; i < SmallJobs.size(); i++) {
            System.out.println(((Gjob) (SmallJobs.get(i))).thisjob());
        }
        System.out.println("Ai is " + Ai);
        System.out.println("Bi is " + Bi);
        System.out.println("Ci is " + Ci);
    }

    // sorts the SmallJobs and LargeJobs lists into descending order of size
    public void sortall() {
        Collections.sort(SmallJobs, new Comparator() {
            public int compare(Object a, Object b) {
                double k, t;
                k = ((Gjob) (a)).thisjob();
                t = ((Gjob) (b)).thisjob();
                if (k > t) {
                    return -1;
                } else if (k < t) {
                    return 1;
                } else {
                    return 0;
                }
            }
        });
        Collections.sort(LargeJobs, new Comparator() {
            public int compare(Object a, Object b) {
                double k, t;
                k = ((Gjob) (a)).thisjob();
                t = ((Gjob) (b)).thisjob();
                if (k > t) {

46 return -1; else if (k<t){ return 1; else { return 0; ); public double RemoveLarge(){ double res = ((Gjob)(LargeJobs.remove(0))).thisjob(); sum-=res; return res; public double RemoveSmall(){ double res = ((Gjob)(SmallJobs.remove(0))).thisjob(); sum-=res; return res; //remove excess large jobs and returns result public double[] RemoveLargeJobs(){ int index = 0; double [] extralarge; if (LargeJobs.size()>1){ extralarge = new double[largejobs.size()-1]; else {//else do nothing extralarge=new double[1]; extralarge[0]=-1; return extralarge; for (int i = LargeJobs.size();i>1;i--){ extralarge[index] = RemoveLarge(); index ++; 41

47 return extralarge; //remove ai small jobs and return in an array public double[] RemoveAi(){ int index = 0; double [] extrasmall; //System.out.println("Ai is "+Ai); if (Ai>0){ extrasmall = new double[ai]; else { extrasmall = new double[1]; extrasmall[0]=-1; return extrasmall; for (int i=0; i<ai;i++){ //System.out.println(i+" :number of small jobs "+SmallJobs.size()); extrasmall[index] = RemoveSmall(); System.out.println(" removing small job "+extrasmall[index]); index ++; return extrasmall; public double[] RemoveBi(){ int index = 0; double [] extrab; if (Bi>0){ extrab = new double[bi]; for (int i=0;i<bi;i++){ if (LargeJobs.size()>0){ extrab[index] = RemoveLarge(); System.out.println(" removing large (bi) job "+extrab[index]); else { extrab[index] = RemoveSmall(); 42

48 System.out.println(" removing small (bi) job "+extrab[index]); index ++; else { extrab = new double[1]; extrab[0]=-1; return extrab; return extrab; public double SizeSmall(){ return SmallJobs.size(); public double SizeLarge(){t return LargeJobs.size(); public int getci(){ return this.ci; public double getsum(){ return sum; public Processor() { this.smalljobs = new ArrayList(); this.largejobs = new ArrayList(); this.ai=0;this.bi=0;this.ci=0; this.sum=0; public int AddJob(double job, double Opt){ if (job>opt/2) { LargeJobs.add(new Gjob(job)); sum+=job; return 1; 43

49 else { SmallJobs.add(new Gjob(job)); sum+=job; return 0; //sum of the last to ith element private double sumsmall(int i){ double tempsum = 0; for (int b=0;b<=i;b++){ Gjob c = (Gjob) SmallJobs.get(b); tempsum+=c.thisjob(); return tempsum; private double sumlarge(int i){ double tempsum = 0; for (int b=0;b<=i;b++){ Gjob c = (Gjob) LargeJobs.get(b); tempsum+=c.thisjob(); return tempsum; public void count(double Opt){ if (sumsmall(smalljobs.size()-1)>opt*0.5){ double fakesum = sumsmall(smalljobs.size()-1); //System.out.println("fake sum is "+fakesum); Ai++; for (int a=0;a<smalljobs.size();a++){ fakesum-=((gjob)(smalljobs.get(a))).thisjob(); //System.out.println("fake sum is "+fakesum); if (fakesum <= Opt/2){ //System.out.ptrintln("breaking"); break; Ai++; 44

50 if (sum>opt){ double fakesum = sum; for (int b=0;b<largejobs.size();b++){ Bi++; fakesum-=((gjob)(largejobs.get(b))).thisjob(); //System.out.println("here famkesum is "+fakesum); if (fakesum <= Opt){ break; //System.out.println("Bi if (fakesum > Opt){ Bi++; for (int c=0;c<smalljobs.size();c++){ fakesum-=((gjob)(smalljobs.get(c))).thisjob(); if (fakesum <=Opt){ break; Bi++; Ci=Ai-Bi ; C.6 Implementations of the GRebalance Algorithm import java.io.*; import java.lang.*; import java.util.*; public class GRebalancer { public static void main (String args[]) { int MMindex = -1; double MMload = -1; 45

51 ArrayList machines = new ArrayList(); ArrayList removedjobs = new ArrayList(); //making new machines for (int i=0;i<(integer.valueof(args[0]).intvalue());i++){ machines.add(new Processor()); /** Read All Jobs and assign them to processors **/ try { FileInputStream fstream = new FileInputStream(args[1]); DataInputStream in = new DataInputStream(fstream); int currentm = 0; while (in.available()!=0) { // Print file line to screen String CJobs=in.readLine(); String [] TJobs; TJobs = Processor.explode(CJobs,"+"); Processor P = (Processor) machines.get(currentm); for (int j=0;j<tjobs.length;j++){ P.AddJob(Double.valueOf(TJobs[j]).doubleValue(),0); P.sortAll(); machines.set(currentm,p); currentm++; catch (Exception e) { System.err.println("File input error"); for (int ii=0;ii<(integer.valueof(args[2]).intvalue());ii++){ int Mindex = -1; double LJsize = -1; for (int jj=0;jj<machines.size();jj++){ Processor V = (Processor) machines.get(jj); if (V.getSum()>LJsize){ Mindex = jj; 46

52 LJsize = V.getSum(); Processor VV = (Processor) machines.get(mindex); removedjobs.add(new Gjob(VV.RemoveLarge())); machines.set(mindex,vv); //end for (int ii=0;... while (removedjobs.size()>0){ Gjob here = (Gjob) removedjobs.remove(0); //find the least loaded machine int Lindex = -1; double Lload = 99999; for (int kk=0;kk<machines.size();kk++){ Processor VVV = (Processor) machines.get(kk); if (VVV.getSum()<Lload){ Lindex = kk; Lload = VVV.getSum(); Processor VVVV = (Processor) machines.get(lindex); VVVV.AddJob(here.thisjob(),0); machines.set(lindex,vvvv); for (int i=0;i<machines.size();i++){ Processor p4 = (Processor) machines.get(i); //System.out.println("final report.."+i); //p4.report(); if (p4.getsum()>mmload){ MMindex = i; MMload = p4.getsum(); System.out.println ("Maximum Loaded Processor is: "+MMload); //end main //end class 47
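The GRebalance listing interleaves its greedy step with file parsing. As a minimal, self-contained sketch of the same idea — remove the largest job from the heaviest machine a fixed number of times, then greedily reassign each removed job to the currently lightest machine — the following uses plain lists of doubles in place of the Processor/Gjob classes (class and method names here are illustrative, not from the thesis code):

```java
import java.util.*;

public class GreedyRebalanceSketch {
    // total load of one machine's job list
    static double sum(List<Double> jobs) {
        double s = 0;
        for (double j : jobs) s += j;
        return s;
    }

    // removes the largest job from the heaviest machine, k times,
    // then reassigns each removed job to the currently lightest machine
    static void rebalance(List<List<Double>> loads, int k) {
        List<Double> removed = new ArrayList<>();
        for (int step = 0; step < k; step++) {
            int heaviest = 0;
            for (int i = 1; i < loads.size(); i++)
                if (sum(loads.get(i)) > sum(loads.get(heaviest))) heaviest = i;
            List<Double> jobs = loads.get(heaviest);
            jobs.sort(Collections.reverseOrder()); // largest job first
            removed.add(jobs.remove(0));
        }
        for (double job : removed) {
            int lightest = 0;
            for (int i = 1; i < loads.size(); i++)
                if (sum(loads.get(i)) < sum(loads.get(lightest))) lightest = i;
            loads.get(lightest).add(job);
        }
    }

    public static void main(String[] args) {
        List<List<Double>> loads = new ArrayList<>();
        loads.add(new ArrayList<>(Arrays.asList(4.0, 4.0, 1.0))); // machine 0, load 9
        loads.add(new ArrayList<>(Arrays.asList(2.0)));           // machine 1, load 2
        rebalance(loads, 1);
        // one 4.0 job moves to machine 1; the makespan drops from 9.0 to 6.0
        System.out.println(sum(loads.get(0)) + " " + sum(loads.get(1)));
        // prints: 5.0 6.0
    }
}
```

Note that this sketch uses generic collections from a current JDK, whereas the thesis listings target J2SE 1.4.2 and therefore use raw ArrayLists and casts.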

C.7 Implementations of the SRebalance Algorithm

import java.io.*;
import java.lang.*;
import java.util.*;

// args[0] = number of machines, args[1] = file name, args[2] = moves
public class Rebalancer {
    public static void main(String[] args) {
        ArrayList finalmachines = new ArrayList();
        int MMindex = -1;
        double MMload = -1;
        ArrayList Tholds = new ArrayList();
        ArrayList TempMachines = new ArrayList();
        int moves = 999999;
        double sumjobs = 0;
        for (int i = 0; i < (Integer.valueOf(args[0]).intValue()); i++) {
            TempMachines.add(new Processor());
        }
        /* code to read in jobs to the TempMachines array      */
        /* and sum up all jobs to give the initial guess       */
        try {
            FileInputStream fstream1 = new FileInputStream(args[1]);
            DataInputStream in1 = new DataInputStream(fstream1);
            int currentM1 = 0;
            while (in1.available() != 0) {
                String CJobs1 = in1.readLine();
                String[] TJobs1;
                TJobs1 = Processor.explode(CJobs1, "+");
                Processor P1 = (Processor) TempMachines.get(currentM1);
                for (int j = 0; j < TJobs1.length; j++) {
                    P1.AddJob(Double.valueOf(TJobs1[j]).doubleValue(), 0);
                    sumjobs += Double.valueOf(TJobs1[j]).doubleValue();
                }
                P1.sortAll();
                currentM1++;
            } // end while
        } catch (Exception e) {
            System.err.println("File input error1");
        }
        /* end code to read in all jobs and give the initial guess */
        double init_guess = sumjobs / Integer.valueOf(args[0]).intValue();

        /* code to generate all threshold values */
        while (TempMachines.size() > 0) {
            Processor B = (Processor) TempMachines.remove(0);
            double[] kjobs = B.getT();
            double localsum = 0;
            for (int i = 0; i < kjobs.length; i++) {
                if (kjobs[i] * 2 > init_guess) {
                    Tholds.add(new Gjob(kjobs[i] * 2));
                }
                if (localsum + kjobs[i] > init_guess) {
                    Tholds.add(new Gjob(localsum + kjobs[i]));
                }
                if (localsum * 2 + kjobs[i] * 2 > init_guess) {
                    Tholds.add(new Gjob(localsum * 2 + kjobs[i] * 2));
                }
                localsum += kjobs[i];
            }
        } // end while (TempMachines.size() > 0)
        /* end code to generate all threshold values */

        /* add init_guess to Tholds and sort Tholds into descending order */
        Tholds.add(new Gjob(init_guess));
        Collections.sort(Tholds, new Comparator() {
            public int compare(Object a, Object b) {
                double k, t;
                k = ((Gjob) (a)).thisjob();
                t = ((Gjob) (b)).thisjob();
                if (k > t) {
                    return -1;
                } else if (k < t) {
                    return 1;
                } else {
                    return 0;
                }
            }
        });
        /* end finalization of Tholds */

        // repeat while thresholds remain and the move budget is still exceeded
        while (Tholds.size() != 0 && moves > Integer.valueOf(args[2]).intValue()) {
            double c_t = ((Gjob) (Tholds.remove(Tholds.size() - 1))).thisjob();
            moves = 0;
            System.out.print("C_T is: " + c_t);
            ArrayList machines = new ArrayList();
            ArrayList Ltmachines, Ltbarmachines, Largefree;
            ArrayList ELJobs; // extra large jobs
            ArrayList ESJobs; // extra small jobs
            int Lt = 0;
            Ltmachines = new ArrayList();
            Ltbarmachines = new ArrayList();
            Largefree = new ArrayList();
            for (int i = 0; i < (Integer.valueOf(args[0]).intValue()); i++) {
                machines.add(new Processor());
            }
            ESJobs = new ArrayList();
            ELJobs = new ArrayList();
            /** Read all jobs and assign them to processors **/
            try {
                FileInputStream fstream = new FileInputStream(args[1]);
                DataInputStream in = new DataInputStream(fstream);
                // continue to read lines while there are still some left to read
                int currentM = 0;
                while (in.available() != 0) {
                    String CJobs = in.readLine();
                    String[] TJobs;
                    TJobs = Processor.explode(CJobs, "+");
                    Processor P = (Processor) machines.get(currentM);
                    for (int j = 0; j < TJobs.length; j++) {
                        Lt += P.AddJob(Double.valueOf(TJobs[j]).doubleValue(), c_t);
                    }
                    P.sortAll();
                    /** remove all but one large job **/
                    double[] tempLarge = P.RemoveLargeJobs();
                    /** add extra large jobs to ELJobs **/
                    if (tempLarge[0] != -1) {
                        for (int l = 0; l < tempLarge.length; l++) {
                            ELJobs.add(new Gjob(tempLarge[l]));
                            moves++;
                        }
                    }
                    /** set Ai, Bi, Ci accordingly **/
                    P.count(c_t);
                    machines.set(currentM, P);
                    currentM++;
                }
                // in.close();
            } catch (Exception e) {
                System.err.println("File input error");
            }
            /** end code to load jobs **/
            System.out.println("Lt is " + Lt);

            /* step III */
            Collections.sort(machines, new Comparator() {
                public int compare(Object a, Object b) {
                    double k, t;
                    k = ((Processor) (a)).getCi();
                    t = ((Processor) (b)).getCi();
                    if (k > t) {
                        return -1;
                    } else if (k < t) {
                        return 1;
                    } else {
                        if (((Processor) (a)).SizeLarge() > 0) {
                            return 1;
                        } else {
                            return -1;
                        }
                    }
                }
            });
            for (int k = 1; k <= Lt; k++) {
                double[] SmallJ;
                Processor G = new Processor();
                G = (Processor) machines.remove(machines.size() - 1);
                SmallJ = G.RemoveAi();
                if (SmallJ[0] != -1) {
                    for (int m = 0; m < SmallJ.length; m++) {
                        ESJobs.add(new Gjob(SmallJ[m]));
                        moves++;
                    }
                }
                // add the processed machine back:
                // if it has no large jobs it is "large free"
                if (G.SizeLarge() == 0) {
                    Largefree.add(G);
                    System.out.println("adding to largefree");
                } else {
                    Ltbarmachines.add(G);
                }
            }
            /* step III end */

            /* step IV */
            while (machines.size() > 0) {
                Processor G = new Processor();
                double[] BJobs;
                G = (Processor) machines.remove(0);
                BJobs = G.RemoveBi();
                if (BJobs[0] != -1) {
                    for (int ii = 0; ii < BJobs.length; ii++) {
                        // if small, put it on the queue
                        if (BJobs[ii] <= c_t / 2) {
                            ESJobs.add(new Gjob(BJobs[ii]));
                            moves++;
                        } else {
                            Processor P1 = new Processor();
                            P1 = (Processor) Largefree.remove(0);
                            P1.AddJob(BJobs[ii], c_t);
                            Ltmachines.add(P1);
                            moves++;
                        }
                    }
                }
                Ltbarmachines.add(G);
            }
            /* end step IV */

            /* step V */
            while (ELJobs.size() > 0) {
                Processor P2 = new Processor();
                P2 = (Processor) Largefree.remove(0);
                P2.AddJob(((Gjob) ELJobs.remove(0)).thisjob(), c_t);
                Ltmachines.add(P2);
            }
            /* end step V */

            /* step VI */
            // merge all processors into one single array
            while (Ltmachines.size() > 0) {
                machines.add(Ltmachines.remove(0));
            }
            while (Ltbarmachines.size() > 0) {
                machines.add(Ltbarmachines.remove(0));
            }
            while (Largefree.size() > 0) {
                machines.add(Largefree.remove(0));
            }
            while (ESJobs.size() > 0) {
                double LeastLoad = 99999;
                int LeastLoadIndex = -1;
                // find the least loaded machine
                for (int j = 0; j < machines.size(); j++) {
                    if (((Processor) machines.get(j)).getSum() < LeastLoad) {
                        LeastLoadIndex = j;
                        LeastLoad = ((Processor) machines.get(j)).getSum();
                    }
                }
                Processor P3 = (Processor) machines.remove(LeastLoadIndex);
                P3.AddJob(((Gjob) ESJobs.remove(0)).thisjob(), c_t);
                machines.add(P3);
            }
            MMindex = -1;
            MMload = -1;
            for (int i = 0; i < machines.size(); i++) {
                Processor p4 = (Processor) machines.get(i);
                if (p4.getSum() > MMload) {
                    MMindex = i;
                    MMload = p4.getSum();
                }
            }

            /* end step VI */
            System.out.println("moves this round is.." + moves);
            finalmachines = machines;
        } // end while over thresholds
        System.out.println("Maximum load is " + MMload);
        Processor p44 = (Processor) finalmachines.get(MMindex);
        p44.report();
    } // end main
} // end class

C.8 Implementations of the Random Generators for test cases

import java.util.*;
import java.io.*;
import java.lang.*;

// generator for balancing
public class RandomGen {
    public static void main(String[] args) {
        int m = Integer.valueOf(args[0]).intValue();        // number of machines
        double opt = Double.valueOf(args[1]).doubleValue(); // opt value
        int n = Integer.valueOf(args[2]).intValue();        // maximum number of slices
        ArrayList jobs = new ArrayList();
        try {
            FileOutputStream out1 = new FileOutputStream("result.txt");
            PrintStream p = new PrintStream(out1);
            // generate a case for each machine
            for (int i = 0; i < m; i++) {
                ArrayList slices = new ArrayList();
                double x = Math.random() * n;
                int num_slice = (int) Math.round(x);
                if (num_slice != 0) num_slice--;
                for (int j = 0; j < num_slice; j++) {
                    slices.add(new Gjob(Math.random() * opt));
                }
                slices.add(new Gjob(0));
                slices.add(new Gjob(opt));
                Collections.sort(slices, new Comparator() {
                    public int compare(Object a, Object b) {
                        double k, t;
                        k = ((Gjob) (a)).thisjob();
                        t = ((Gjob) (b)).thisjob();
                        if (k > t) {
                            return -1;
                        } else if (k < t) {
                            return 1;
                        } else {
                            return 0;
                        }
                    }
                });
                // each job is the gap between two adjacent slice points
                for (int k = 0; k < slices.size() - 1; k++) {
                    double gap = ((Gjob) (slices.get(k))).thisjob()
                               - ((Gjob) (slices.get(k + 1))).thisjob();
                    if (gap > 0) {
                        jobs.add(new Gjob(gap));
                    }
                } // end for (int k = 0 ...)
            } // end for (int i = 0 ...)
            double sum = 0;
            /** print result to file **/
            for (int l = 0; l < jobs.size(); l++) {
                try {
                    p.println(((Gjob) (jobs.get(l))).thisjob());
                    sum += ((Gjob) (jobs.get(l))).thisjob();
                } catch (Exception e) {
                    System.err.println("error3");
                }
            } // end for (int l = 0 ...)
            System.out.println(sum);
        } catch (Exception e) { // main catch
            System.err.println("error4");
        }
    } // end main
} // end class

import java.util.*;
import java.io.*;
import java.lang.*;

// generator for rebalancing
public class RandomGenRB {
    public static void main(String[] args) {
        int m = Integer.valueOf(args[0]).intValue();        // number of machines
        double opt = Double.valueOf(args[1]).doubleValue(); // opt value
        int n = Integer.valueOf(args[2]).intValue();        // maximum number of slices
        int shuf = Integer.valueOf(args[3]).intValue();     // number of shuffle steps
        ArrayList[] Alljobs = new ArrayList[m];
        try {
            FileOutputStream out1 = new FileOutputStream("result.txt");
            PrintStream p = new PrintStream(out1);
            // generate a case for each machine
            for (int i = 0; i < m; i++) {
                ArrayList slices = new ArrayList();
                ArrayList jobs = new ArrayList();
                double x = Math.random() * n;
                int num_slice = (int) Math.round(x);
                if (num_slice != 0) num_slice--;
                for (int j = 0; j < num_slice; j++) {
                    slices.add(new Gjob(Math.random() * opt));
                }
                slices.add(new Gjob(0));
                slices.add(new Gjob(opt));
                Collections.sort(slices, new Comparator() {
                    public int compare(Object a, Object b) {
                        double k, t;
                        k = ((Gjob) (a)).thisjob();
                        t = ((Gjob) (b)).thisjob();
                        if (k > t) {
                            return -1;
                        } else if (k < t) {
                            return 1;
                        } else {
                            return 0;
                        }
                    }
                });
                // each job is the gap between two adjacent slice points
                for (int k = 0; k < slices.size() - 1; k++) {
                    double gap = ((Gjob) (slices.get(k))).thisjob()
                               - ((Gjob) (slices.get(k + 1))).thisjob();
                    if (gap > 0) {
                        jobs.add(new Gjob(gap));
                    }
                } // end for (int k = 0 ...)
                Alljobs[i] = jobs;
            } // end for (int i = 0 ...)

            /* shuffling step */
            for (int o = 0; o < shuf; o++) {
                // pick a machine to take a job from
                int source_m = (int) Math.round(Math.random() * m);
                if (source_m != 0) source_m--;
                // make sure the source machine has at least 2 jobs
                while (Alljobs[source_m].size() < 2) {
                    source_m = (int) Math.round(Math.random() * m);
                    if (source_m != 0) source_m--; // keep the index in range
                }
                int target_m = (int) Math.round(Math.random() * m);
                if (target_m != 0) target_m--;
                ArrayList s_m = Alljobs[source_m];
                ArrayList t_m = Alljobs[target_m];
                int jmove = (int) Math.round(Math.random() * s_m.size());
                if (jmove != 0) jmove--;
                t_m.add(s_m.remove(jmove));
                Alljobs[source_m] = s_m;
                Alljobs[target_m] = t_m;
            }
            double sum = 0;
            /** print result to file **/
            for (int l = 0; l < Alljobs.length; l++) {
                try {
                    ArrayList g = Alljobs[l];
                    while (g.size() > 0) {
                        double pp = ((Gjob) (g.remove(0))).thisjob();
                        p.print(pp);
                        sum += pp;
                        if (g.size() > 0) p.print("+");
                    }
                } catch (Exception e) {
                    System.err.println("error3");
                }
                if (l != Alljobs.length - 1) p.println("");
            } // end for (int l = 0)
            System.out.println(sum);
        } catch (Exception e) { // main catch
            System.err.println("error4");
        }
    } // end main
} // end class
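The generators write each machine's jobs on one line, joined by "+", which is exactly the format Processor.explode later splits on. The same parse can be sketched with the standard String.split, remembering that '+' is a regex metacharacter (class and method names here are illustrative):

```java
import java.util.*;

public class JobLineParseSketch {
    // parses one generator output line, e.g. "3.5+1.25+0.75",
    // into job sizes, as Processor.explode does with delimiter "+"
    static double[] parseJobs(String line) {
        String[] parts = line.split("\\+"); // '+' must be escaped in a regex
        double[] jobs = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            jobs[i] = Double.parseDouble(parts[i]);
        }
        return jobs;
    }

    public static void main(String[] args) {
        double[] jobs = parseJobs("3.5+1.25+0.75");
        double sum = 0;
        for (double j : jobs) sum += j;
        System.out.println(jobs.length + " jobs, total load " + sum);
        // prints: 3 jobs, total load 5.5
    }
}
```

The thesis code instead uses its own explode helper, presumably because String.split treats the delimiter as a regular expression and the listings target J2SE 1.4.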

Bibliography

[1] World Internet Users and Population Stats, retrieved from
[2] China Internet Network Information Center, retrieved from
[3] Kleinberg J. and Tardos E. Algorithm Design, Cornell University, Spring 2004
[4] Aggarwal G., Motwani R., and Zhu A. The Load Rebalancing Problem. ACM Symposium on Parallel Algorithms and Architectures, ACM Press, 2003
[5] Ensim Corporation, Sunnyvale, CA, USA
[6] Shmoys D. and Tardos E. An approximation algorithm for the generalized assignment problem. Mathematical Programming, 62 (1993)
[7] Graham R.L. Bounds for certain multiprocessing anomalies. Bell System Technical Journal, 45 (1966)
[8] Class Collections,
[9] Java 2 Platform, Standard Edition (J2SE 1.4.2),
[10] Kunal Dua. Balance your Web Server's Load, retrieved from
[11] Ralf S. Engelschall. Load Balancing Your Web Site, retrieved from
[12] Server Load Balancing Methods, retrieved from balance methods.htm
[13] Linder P.B. and Shah A. Website Migration Load Balancing of Web Servers. Manuscript.


More information

Competitive Analysis of QoS Networks

Competitive Analysis of QoS Networks Competitive Analysis of QoS Networks What is QoS? The art of performance analysis What is competitive analysis? Example: Scheduling with deadlines Example: Smoothing real-time streams Example: Overflow

More information

Job Reference Guide. SLAMD Distributed Load Generation Engine. Version 1.8.2

Job Reference Guide. SLAMD Distributed Load Generation Engine. Version 1.8.2 Job Reference Guide SLAMD Distributed Load Generation Engine Version 1.8.2 June 2004 Contents 1. Introduction...3 2. The Utility Jobs...4 3. The LDAP Search Jobs...11 4. The LDAP Authentication Jobs...22

More information

Online and Offline Selling in Limit Order Markets

Online and Offline Selling in Limit Order Markets Online and Offline Selling in Limit Order Markets Kevin L. Chang 1 and Aaron Johnson 2 1 Yahoo Inc. [email protected] 2 Yale University [email protected] Abstract. Completely automated electronic

More information

Online Scheduling with Bounded Migration

Online Scheduling with Bounded Migration Online Scheduling with Bounded Migration Peter Sanders, Naveen Sivadasan, and Martin Skutella Max-Planck-Institut für Informatik, Saarbrücken, Germany, {sanders,ns,skutella}@mpi-sb.mpg.de Abstract. Consider

More information

Factoring & Primality

Factoring & Primality Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

More information

Strategic planning in LTL logistics increasing the capacity utilization of trucks

Strategic planning in LTL logistics increasing the capacity utilization of trucks Strategic planning in LTL logistics increasing the capacity utilization of trucks J. Fabian Meier 1,2 Institute of Transport Logistics TU Dortmund, Germany Uwe Clausen 3 Fraunhofer Institute for Material

More information

Analysis of Approximation Algorithms for k-set Cover using Factor-Revealing Linear Programs

Analysis of Approximation Algorithms for k-set Cover using Factor-Revealing Linear Programs Analysis of Approximation Algorithms for k-set Cover using Factor-Revealing Linear Programs Stavros Athanassopoulos, Ioannis Caragiannis, and Christos Kaklamanis Research Academic Computer Technology Institute

More information

CommuniGate Pro White Paper. Dynamic Clustering Solution. For Reliable and Scalable. Messaging

CommuniGate Pro White Paper. Dynamic Clustering Solution. For Reliable and Scalable. Messaging CommuniGate Pro White Paper Dynamic Clustering Solution For Reliable and Scalable Messaging Date April 2002 Modern E-Mail Systems: Achieving Speed, Stability and Growth E-mail becomes more important each

More information

Lecture 4 Online and streaming algorithms for clustering

Lecture 4 Online and streaming algorithms for clustering CSE 291: Geometric algorithms Spring 2013 Lecture 4 Online and streaming algorithms for clustering 4.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

More information

CSE 4351/5351 Notes 7: Task Scheduling & Load Balancing

CSE 4351/5351 Notes 7: Task Scheduling & Load Balancing CSE / Notes : Task Scheduling & Load Balancing Task Scheduling A task is a (sequential) activity that uses a set of inputs to produce a set of outputs. A task (precedence) graph is an acyclic, directed

More information

A New Nature-inspired Algorithm for Load Balancing

A New Nature-inspired Algorithm for Load Balancing A New Nature-inspired Algorithm for Load Balancing Xiang Feng East China University of Science and Technology Shanghai, China 200237 Email: xfeng{@ecusteducn, @cshkuhk} Francis CM Lau The University of

More information

The Relative Worst Order Ratio for On-Line Algorithms

The Relative Worst Order Ratio for On-Line Algorithms The Relative Worst Order Ratio for On-Line Algorithms Joan Boyar 1 and Lene M. Favrholdt 2 1 Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark, [email protected]

More information

Distributed Load Balancing for Machines Fully Heterogeneous

Distributed Load Balancing for Machines Fully Heterogeneous Internship Report 2 nd of June - 22 th of August 2014 Distributed Load Balancing for Machines Fully Heterogeneous Nathanaël Cheriere [email protected] ENS Rennes Academic Year 2013-2014

More information

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania Moral Hazard Itay Goldstein Wharton School, University of Pennsylvania 1 Principal-Agent Problem Basic problem in corporate finance: separation of ownership and control: o The owners of the firm are typically

More information

New Hash Function Construction for Textual and Geometric Data Retrieval

New Hash Function Construction for Textual and Geometric Data Retrieval Latest Trends on Computers, Vol., pp.483-489, ISBN 978-96-474-3-4, ISSN 79-45, CSCC conference, Corfu, Greece, New Hash Function Construction for Textual and Geometric Data Retrieval Václav Skala, Jan

More information

Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem

Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem International Journal of Applied Science and Technology Vol. 3 No. 8; December 2013 Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem D. R. Aremu O. A. Gbadamosi

More information

Improved Algorithms for Data Migration

Improved Algorithms for Data Migration Improved Algorithms for Data Migration Samir Khuller 1, Yoo-Ah Kim, and Azarakhsh Malekian 1 Department of Computer Science, University of Maryland, College Park, MD 20742. Research supported by NSF Award

More information

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral

Introduction. What is RAID? The Array and RAID Controller Concept. Click here to print this article. Re-Printed From SLCentral Click here to print this article. Re-Printed From SLCentral RAID: An In-Depth Guide To RAID Technology Author: Tom Solinap Date Posted: January 24th, 2001 URL: http://www.slcentral.com/articles/01/1/raid

More information

The Top 20 VMware Performance Metrics You Should Care About

The Top 20 VMware Performance Metrics You Should Care About The Top 20 VMware Performance Metrics You Should Care About Why you can t ignore them and how they can help you find and avoid problems. WHITEPAPER BY ALEX ROSEMBLAT Table of Contents Introduction... 3

More information

Guessing Game: NP-Complete?

Guessing Game: NP-Complete? Guessing Game: NP-Complete? 1. LONGEST-PATH: Given a graph G = (V, E), does there exists a simple path of length at least k edges? YES 2. SHORTEST-PATH: Given a graph G = (V, E), does there exists a simple

More information

THE SCHEDULING OF MAINTENANCE SERVICE

THE SCHEDULING OF MAINTENANCE SERVICE THE SCHEDULING OF MAINTENANCE SERVICE Shoshana Anily Celia A. Glass Refael Hassin Abstract We study a discrete problem of scheduling activities of several types under the constraint that at most a single

More information

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Complexity Theory IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Outline Goals Computation of Problems Concepts and Definitions Complexity Classes and Problems Polynomial Time Reductions Examples

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

More information

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents

More information

Dynamic programming formulation

Dynamic programming formulation 1.24 Lecture 14 Dynamic programming: Job scheduling Dynamic programming formulation To formulate a problem as a dynamic program: Sort by a criterion that will allow infeasible combinations to be eli minated

More information

Stiffie's On Line Scheduling Algorithm

Stiffie's On Line Scheduling Algorithm A class of on-line scheduling algorithms to minimize total completion time X. Lu R.A. Sitters L. Stougie Abstract We consider the problem of scheduling jobs on-line on a single machine and on identical

More information

Joint Optimization of Overlapping Phases in MapReduce

Joint Optimization of Overlapping Phases in MapReduce Joint Optimization of Overlapping Phases in MapReduce Minghong Lin, Li Zhang, Adam Wierman, Jian Tan Abstract MapReduce is a scalable parallel computing framework for big data processing. It exhibits multiple

More information

Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

More information

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing A Study on Workload Imbalance Issues in Data Intensive Distributed Computing Sven Groot 1, Kazuo Goda 1, and Masaru Kitsuregawa 1 University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Abstract.

More information

Week 7 - Game Theory and Industrial Organisation

Week 7 - Game Theory and Industrial Organisation Week 7 - Game Theory and Industrial Organisation The Cournot and Bertrand models are the two basic templates for models of oligopoly; industry structures with a small number of firms. There are a number

More information

Analysis of Micromouse Maze Solving Algorithms

Analysis of Micromouse Maze Solving Algorithms 1 Analysis of Micromouse Maze Solving Algorithms David M. Willardson ECE 557: Learning from Data, Spring 2001 Abstract This project involves a simulation of a mouse that is to find its way through a maze.

More information

Complexity Classes P and NP

Complexity Classes P and NP Complexity Classes P and NP MATH 3220 Supplemental Presentation by John Aleshunas The cure for boredom is curiosity. There is no cure for curiosity Dorothy Parker Computational Complexity Theory In computer

More information

School Timetabling in Theory and Practice

School Timetabling in Theory and Practice School Timetabling in Theory and Practice Irving van Heuven van Staereling VU University, Amsterdam Faculty of Sciences December 24, 2012 Preface At almost every secondary school and university, some

More information

Load balancing of temporary tasks in the l p norm

Load balancing of temporary tasks in the l p norm Load balancing of temporary tasks in the l p norm Yossi Azar a,1, Amir Epstein a,2, Leah Epstein b,3 a School of Computer Science, Tel Aviv University, Tel Aviv, Israel. b School of Computer Science, The

More information

Establishing a Mobile Conference Call Under Delay and Bandwidth Constraints

Establishing a Mobile Conference Call Under Delay and Bandwidth Constraints Establishing a Mobile Conference Call Under Delay and Bandwidth Constraints Amotz Bar-Noy Computer and Information Science Department Brooklyn College, CUNY, New York Email: [email protected]

More information

Approximability of Two-Machine No-Wait Flowshop Scheduling with Availability Constraints

Approximability of Two-Machine No-Wait Flowshop Scheduling with Availability Constraints Approximability of Two-Machine No-Wait Flowshop Scheduling with Availability Constraints T.C. Edwin Cheng 1, and Zhaohui Liu 1,2 1 Department of Management, The Hong Kong Polytechnic University Kowloon,

More information

A Tool for Evaluation and Optimization of Web Application Performance

A Tool for Evaluation and Optimization of Web Application Performance A Tool for Evaluation and Optimization of Web Application Performance Tomáš Černý 1 [email protected] Michael J. Donahoo 2 [email protected] Abstract: One of the main goals of web application

More information

Today. Intro to real-time scheduling Cyclic executives. Scheduling tables Frames Frame size constraints. Non-independent tasks Pros and cons

Today. Intro to real-time scheduling Cyclic executives. Scheduling tables Frames Frame size constraints. Non-independent tasks Pros and cons Today Intro to real-time scheduling Cyclic executives Scheduling tables Frames Frame size constraints Generating schedules Non-independent tasks Pros and cons Real-Time Systems The correctness of a real-time

More information

Algorithm Design for Performance Aware VM Consolidation

Algorithm Design for Performance Aware VM Consolidation Algorithm Design for Performance Aware VM Consolidation Alan Roytman University of California, Los Angeles Sriram Govindan Microsoft Corporation Jie Liu Microsoft Research Aman Kansal Microsoft Research

More information

20 Selfish Load Balancing

20 Selfish Load Balancing 20 Selfish Load Balancing Berthold Vöcking Abstract Suppose that a set of weighted tasks shall be assigned to a set of machines with possibly different speeds such that the load is distributed evenly among

More information

Definition of RAID Levels

Definition of RAID Levels RAID The basic idea of RAID (Redundant Array of Independent Disks) is to combine multiple inexpensive disk drives into an array of disk drives to obtain performance, capacity and reliability that exceeds

More information

Mechanisms for Fair Attribution

Mechanisms for Fair Attribution Mechanisms for Fair Attribution Eric Balkanski Yaron Singer Abstract We propose a new framework for optimization under fairness constraints. The problems we consider model procurement where the goal is

More information

NP-complete? NP-hard? Some Foundations of Complexity. Prof. Sven Hartmann Clausthal University of Technology Department of Informatics

NP-complete? NP-hard? Some Foundations of Complexity. Prof. Sven Hartmann Clausthal University of Technology Department of Informatics NP-complete? NP-hard? Some Foundations of Complexity Prof. Sven Hartmann Clausthal University of Technology Department of Informatics Tractability of Problems Some problems are undecidable: no computer

More information