Load Balancing and Rebalancing on Web Based Environment. Yu Zhang


Load Balancing and Rebalancing on Web Based Environment

Yu Zhang

This report is submitted as partial fulfilment of the requirements for the Honours Programme of the School of Computer Science and Software Engineering, The University of Western Australia, 2004

Abstract

We investigate two variants of a load distribution problem that arises when distributing loads of varying size in a multi-server web-based environment. Solving the classical Load Balancing Problem allows us to distribute static web components to multiple servers so that the loads on the servers are as equal as possible. A typical objective is to minimize the makespan, the load on the most heavily loaded server. In reality, however, loads on servers are often dynamic. Because the loads of web components change over time, the Load Rebalancing Problem was introduced by S. Keshav of Ensim Corporation. To solve the Load Rebalancing Problem we try to redistribute the loads of web components in a fixed number of steps, since moving components across servers can be expensive, so that the loads on the servers are as equal as possible. Solving these two problems successfully would allow us to utilize resources better and achieve better performance. However, both problems have been proven to be NP-hard, so generating exact solutions in a tractable amount of time becomes infeasible when the problems become large. We therefore adopt four greedy approximation algorithms that solve these two problems in polynomial time within a constant guaranteed error ratio. We give implementations of the algorithms in the Java programming environment. We carry out experiments to show that the error bounds hold for our implementations, and we perform further experiments to test the performance of the algorithms in practical situations. By analyzing our results carefully we identified weaknesses in some of the algorithms and proposed improvements. We conclude that these approximation algorithms do indeed run in polynomial time, that they generate approximate results within the stated error ratios on our test data sets, and that they are valid tools for balancing and rebalancing loads in a multi-server web-based environment.
Keywords: Approximation Algorithms, Load Balancing, Scheduling
CR Categories: F.2.2, G.2.1, C.2.4

Acknowledgements

I would like to thank Ensim Corporation for posting the Load Rebalancing Problem, and my supervisor Gordon Royle for all his help and support.

Contents

Abstract
Acknowledgements
1 Introduction
2 Literature Review
3 Load Balancing Problem
    The Load Balancing Problem
    Algorithm GBalance
    Algorithm SBalance
    Implementations
    Test Methodology
    Test Result
    Analysis and Comparison
4 Load Rebalancing Problem
    The Load Rebalancing Problem
    Algorithm GRebalance
    Algorithm SRebalance
    Implementations
    Test Methodology
    Test Result
    Analysis and Comparison
    Proposed Improvement
        Proposed Improvement for GRebalance
        Proposed Improvement for SRebalance
5 Conclusion
A Original Research Proposal
B Approximated Makespan Generated for the Load Balancing and Load Rebalancing Problem
    B.1 Result Generated Using GBalance and SBalance
    B.2 Result Generated Using GRebalance and SRebalance
C Java Code Used
    C.1 Model of Web Component in Load Balancing and Load Rebalancing Problem
    C.2 Model of Server in the Load Balancing Problem
    C.3 Implementation of the GBalance Algorithm
    C.4 Implementation of the SBalance Algorithm
    C.5 Model of Server in the Load Rebalancing Problem
    C.6 Implementation of the GRebalance Algorithm
    C.7 Implementation of the SRebalance Algorithm
    C.8 Implementation of the Random Generators for Test Cases

CHAPTER 1

Introduction

Over the last four years, the number of Internet users increased by 125%, reaching a population of 812,931,592 [1]. The growth in Asian countries is tremendous; in China, for instance, the number of Internet users reached a record high of 87 million, according to CNNIC [2]. As a result, more people are relying on the Internet for educational and recreational purposes. Web sites are a prevalent means by which people receive information and interact with others on the net, and web sites have grown significantly in size to accommodate these increasing needs. As the number of surfers, the size of web pages and the number of web pages in web sites increase, using a single server to host all the web components is no longer sufficient: a single server can very easily become overloaded and unable to serve all requests in time. Distributing web components over multiple servers to allow faster delivery of information thus becomes a beneficial solution. The problem we are trying to solve is how to balance the web components across a number of servers as equally as possible.

We first define the load of a web component (such as a static HTML document, a GIF image, an AVI clip, a Macromedia Flash file or, in some cases, an entire web site) to be the total number of bytes that a server has to send to all surfers, that is, the size of the file in bytes multiplied by the number of hits. We wish to discover methods to distribute the web components to a number of servers as equally as possible. A good way to achieve that is to minimize the load on the most loaded server.

We are interested in two specific problems in this context, namely the Load Balancing Problem and the Load Rebalancing Problem. The Load Balancing Problem states that, given a number of servers and a list of loads, we seek to assign each load to a server such that the loads are distributed as equally as possible.
When a web site is first published, the Load Balancing Problem can be used to model the initial distribution of web components: given the sizes of the web components and their estimated numbers of hits, we wish to find a way to distribute the web components over a fixed number of servers such that the load on the most loaded server is minimized.

The second problem we wish to investigate is associated with load rebalancing. After a certain amount of time, the loads of web components are very likely to vary. For instance, the number of hits of an HTML document might increase, or the size of a thread in a forum might get larger. As a result, some servers might become much more loaded than others. We would thus like to redistribute the web components by moving some of them from server to server. Because moving web components among servers can be expensive, we would like to keep the number of moves small. The Load Rebalancing Problem is thus: given the servers and their associated loads, how do we redistribute the loads, using a limited number of moves, such that the loads on the servers are as equal as possible?

By solving these problems, we are able to distribute load evenly among servers when a web site is first published, and to readjust the loads when necessary. However, the Load Balancing Problem has been proved to be NP-hard [3], and Aggarwal et al. [4] have shown that the Load Rebalancing Problem is also NP-hard. Thus we are unable to deliver an exact solution to either problem in a reasonable amount of time when the problem gets large. For instance, if we have 5000 web components and 10 servers, the number of configurations to enumerate and compare for the Load Balancing Problem is far too large for exhaustive search. For the Load Rebalancing Problem, even if we restrict the number of moves to 12, the number of configurations remains far too large to enumerate. While solving these problems exactly is not feasible in such a situation, approximation algorithms can be implemented to achieve a reasonable estimate within a fixed error ratio in a reasonable amount of time.

Motivated by the above, we adopted four greedy algorithms to help us address the problems. GBalance and SBalance [3] are adopted to approximate the Load Balancing Problem, and GRebalance and SRebalance [4] are adopted to approximate the Load Rebalancing Problem.

To gain a better understanding of the algorithms and to test the correctness of the error bounds, we implemented and tested them in the Java environment [9]. Our goal is to test how well the algorithms perform in practice, and whether their run times are realistic. For each problem, we first state it formally in mathematical form, present the listing of the algorithms, describe our implementation and testing procedure, and lastly present and compare the performance of the algorithms. During the process of analyzing the results, we also learned the weaknesses of GRebalance and SRebalance, and proposed an improved version of these two algorithms.
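As a rough illustration of why enumeration is hopeless at the scale mentioned in the introduction, the following back-of-the-envelope count may help. It is a sketch under our own assumptions, not the calculation used in the report: servers are treated as distinguishable, every assignment of components to servers is counted for balancing, and for rebalancing we count only the ways of picking exactly $k$ components and sending each to one of the other $m-1$ servers.

```latex
\begin{align*}
\text{balancing: }   & m^{n} = 10^{5000}
    && (n = 5000,\ m = 10),\\
\text{rebalancing: } & \binom{n}{k}(m-1)^{k} = \binom{5000}{12}\,9^{12}
    \approx \left(5.0\times 10^{35}\right)\left(2.8\times 10^{11}\right)
    \approx 1.4\times 10^{47}
    && (k = 12).
\end{align*}
```

Even the much smaller rebalancing count is far beyond what exhaustive search can cover, which is the motivation for the approximation algorithms that follow.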

CHAPTER 2

Literature Review

Currently there are many web sites with a huge volume of content. These web sites adopt different methods to distribute loads over a number of servers to ensure faster server response time. Server response time is largely determined by the underlying hardware of the servers. The performance of many of these web sites, such as ebay.com, amazon.com and expedia.com, is heavily dependent on how fast their servers respond to requests, as the interaction between users and these web sites is real-time in nature. An overloaded server would jeopardize the performance of these web sites significantly.

Many sources [10, 11, 12] give detailed introductions to the common techniques used in practice. Currently load balancing can be done through hardware or software based techniques. One technique, called DNS load balancing, involves maintaining identical copies of the site on physically separate servers. The DNS entry for the site is then set to return multiple IP addresses, each corresponding to a different copy of the site. The DNS server returns a different IP address for each request it receives, cycling through the multiple IP addresses. This method gives a very basic implementation of load balancing. However, since DNS entries are cached by clients and other DNS servers, a client continues to use the same copy during a session. This can be a serious drawback, as heavy users of the web site may all get the particular IP address that is cached on their client or DNS server, while less frequent users get another. Heavy users could therefore experience a performance slowdown even though resources on other servers are available in abundance.

Another load balancing technique involves mapping the site name to a single IP address, which belongs to a machine that is set up to intercept HTTP requests and distribute them among multiple copies of the Web server. This can be done using either hardware or software. Hardware solutions, even though expensive, are preferred for their stability. This method is preferred over the DNS approach, as better load balancing can be achieved. Also, these load balancers can detect that a particular machine is down and divert the traffic to another address dynamically. This is in contrast to the DNS method, where a client is stuck with the address of the dead machine until it can request a new one.

Another technique, reverse proxying, involves setting up a reverse proxy that receives requests from the clients, proxies them to the Web server and caches the response on its way back to the client. This means that the proxy server can serve static content from its own cache when the request is repeated. This in turn lets the server itself focus on delivering dynamic content. Dynamic content cannot generally be cached, as it is generated in real time. Reverse proxying can be used in conjunction with the simple load balancing techniques discussed earlier: static and dynamic content can be split across different servers, with reverse proxying used for the static content Web server only.

All the above approaches require duplication of content. While under most circumstances this is not a huge problem for corporations, situations can arise in which one wishes to use an alternative approach. For instance, a web hosting company would not wish to keep multiple copies of the web sites it is hosting on all servers, as this introduces excessive cost. Instead of making duplicates, our approach balances and rebalances web components by treating every web component as unique: only one copy of a component will be found across all servers. Other than the above-mentioned cost saving, this approach also eliminates the need for extra hardware and software to ensure concurrency.

We were able to find implementations of most of the algorithms we adopted. For instance, the GRebalance algorithm has been implemented by Linder and Shah [13] to rebalance loads in real life. We chose to construct our own implementations to gain a better understanding of the algorithms. This proved fruitful, as we learned the weaknesses of the algorithms and were able to propose improved versions of the GRebalance and SRebalance algorithms.

CHAPTER 3

Load Balancing Problem

3.1 The Load Balancing Problem

The Load Balancing Problem is defined as follows. We are given a set of $m$ servers $M_1, \ldots, M_m$ and a set of $n$ components; each component $j$ has a load of $t_j$. We seek to assign each component to one of the servers so that the loads placed on all servers are as balanced as possible. Mathematically, in any assignment of components to servers, we can let $A(i)$ denote the set of components assigned to server $M_i$; then server $M_i$ needs to work for a total time of

$$T_i = \sum_{j \in A(i)} t_j \qquad (3.1)$$

We define this as the load on server $M_i$. In distributing the load evenly we wish to minimize a quantity known as the makespan, the maximum load on any server, $T = \max_i T_i$.

This classical problem not only handles traditional load balancing in a multi-job, multi-machine situation, but can also help us to distribute web components to various servers. Solving this problem would allow us to distribute load across servers evenly, such that the load on the most heavily loaded server is minimized. This is useful when a web site is first published, when servers are upgraded or when major updates take place such that all the web components need to be reassigned.

Two algorithms will be adopted for this purpose, namely GBalance and SBalance. Both algorithms run in polynomial time and generate approximations that are guaranteed to be within a constant factor of the optimal solution [3]. More specifically, GBalance achieves a guaranteed ratio of 2, meaning the makespan of the approximated solution is at most twice that of the optimal solution, while SBalance achieves a better guaranteed ratio of $\frac{3}{2}$.

3.2 Algorithm GBalance

The first algorithm is called GBalance. It is a simple algorithm that passes through the entire web component list in arbitrary order. The web component being processed is assigned to the currently least loaded server. This process is repeated until there are no web components left to be assigned.

Algorithm GBalance (input: component list, number of servers; output: servers with new assignment)
    Start with no web components assigned.
    Set $T_i = 0$, $A(i) = \emptyset$ for all servers.
    FOR j = 1, ..., n
        Let $M_i$ be a server that achieves the minimum $\min_k T_k$
        Assign component j to server $M_i$
        Set $A(i) = A(i) \cup \{j\}$
        Set $T_i = T_i + t_j$
    END FOR

The GBalance algorithm achieves a constant error bound of 2, with a run time of $O(nm)$. The mathematical proof of the error bound and run time can be found in Kleinberg and Tardos [3].

3.3 Algorithm SBalance

An improvement over the previous algorithm is made through a simple sorting routine. The improved algorithm is called SBalance; it still runs in polynomial time and it achieves a better error bound of $\frac{3}{2}$ times the optimal solution. The algorithm first sorts the list of web components in decreasing order of load, then goes through the sorted list and assigns each web component to the currently least loaded server. The algorithm is described below.

Algorithm SBalance (input: component list, number of servers; output: servers with new assignment)
    Start with no web components assigned.
    Set $T_i = 0$, $A(i) = \emptyset$ for all servers.
    Sort all components in descending order of load, so that $t_1 \ge t_2 \ge t_3 \ge \ldots \ge t_n$.
    FOR j = 1, ..., n
        Let $M_i$ be a server that achieves the minimum $\min_k T_k$
        Assign component j to server $M_i$
        Set $A(i) = A(i) \cup \{j\}$
        Set $T_i = T_i + t_j$
    END FOR

The SBalance algorithm achieves a constant ratio of $\frac{3}{2}$, meaning that the load on the most heavily loaded server is at most 150% of the optimal value. The algorithm's run time depends primarily on the sorting procedure, as sorting takes time of a higher order than the balancing procedure. A merge sort, pivot sort or quick sort algorithm would give SBalance a running time of $O(n \log n)$. The mathematical proof of the error bound and run time can be found in Kleinberg and Tardos [3].

3.4 Implementations

We would like to implement the algorithms in such a way that, given the size and (estimated) hits of each web component, we obtain an approximately optimal way to distribute these web components as evenly as possible across our web servers. We used an object-oriented approach to model web components, servers and the balancers. We chose the Java programming language [9] to implement both algorithms.

Appendix C.1 shows the data structure we used to model web components. A component is modeled using a single double value, representing the load that it will contribute to a server.

Appendix C.2 shows the construction of servers and the operations that they can perform in our load balancing process.

Appendix C.3 is the heart of the GBalance algorithm; it is the balancer and carries out the balancing process.
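The two procedures of Sections 3.2 and 3.3 can be sketched in a few lines of Java. This is a minimal illustration and not the code of Appendix C: the class name `BalanceSketch`, the method names and the use of a `PriorityQueue` to find the least loaded server are our own assumptions, and loads are plain `double` values as in the component model described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; names and data layout are assumptions, not the report's Appendix C code.
class BalanceSketch {

    /** GBalance: give each load, in list order, to the currently least loaded server. */
    static double[] gBalance(List<Double> loads, int m) {
        double[] total = new double[m]; // total[i] plays the role of T_i
        // Min-heap of server indices ordered by current load (ties broken by index).
        PriorityQueue<Integer> heap = new PriorityQueue<>((a, b) ->
                total[a] != total[b] ? Double.compare(total[a], total[b])
                                     : Integer.compare(a, b));
        for (int i = 0; i < m; i++) {
            heap.add(i);
        }
        for (double t : loads) {
            int i = heap.poll(); // server achieving min_k T_k
            total[i] += t;       // T_i = T_i + t_j
            heap.add(i);         // re-insert with its updated load
        }
        return total;
    }

    /** SBalance: sort the loads in descending order, then run the same greedy pass. */
    static double[] sBalance(List<Double> loads, int m) {
        List<Double> sorted = new ArrayList<>(loads);
        sorted.sort(Collections.reverseOrder());
        return gBalance(sorted, m);
    }

    /** Makespan: the load on the most heavily loaded server. */
    static double makespan(double[] total) {
        double max = 0.0;
        for (double t : total) {
            max = Math.max(max, t);
        }
        return max;
    }

    public static void main(String[] args) {
        List<Double> loads = Arrays.asList(2.0, 3.0, 4.0, 6.0, 2.0, 2.0);
        System.out.println("GBalance makespan: " + makespan(gBalance(loads, 3)));
        System.out.println("SBalance makespan: " + makespan(sBalance(loads, 3)));
    }
}
```

With a heap the greedy pass costs $O(n \log m)$ rather than the $O(nm)$ of a linear scan over the servers; either choice leaves the sort as the dominant cost of SBalance.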

Appendix C.4 is the heart of the SBalance algorithm, with sorting and balancing. The sorting algorithm we used is the modified merge sort algorithm implemented by Java [8].

Appendix C.8 is our random test case generator.

The inputs of both GBalance and SBalance are $m$, the number of servers to balance the load on, and $S$, the name of the input data file. The web components are represented by the data type double and are line-separated in the data file. The output of the programs is a printed list of the web components assigned to each server.

An ArrayList is used to store the web components read in from the data file. The web components in the data file are first read and then inserted into the ArrayList. For the SBalance algorithm we then sort the web components using the modified merge sort. Each web component in the ArrayList is then assigned in sequence to the currently least loaded server.

3.5 Test Methodology

We would also like to find out how well the algorithms perform in practical situations. Another aim is to see how variables such as the number of servers and the average number of web components on each server affect the performance of the algorithms. We use the makespan to measure the performance of the algorithms; the smaller the makespan, the better the result.

Ideally, we would like to have a large amount of real-life data with calculated optimal values against which to compare the results of our implementations. However, due to various limitations this was not practical, so we decided to construct our own test cases. Clearly, if we want to test the accuracy of the algorithms we first need to know the optimal value of the test cases we generate. This is difficult due to the NP-hardness of the problems that we are trying to solve.

Our first attempt was to enumerate all possible ways of assigning $n$ web components to $m$ servers, and to pick the assignment in which the most heavily loaded server has the least load.
We started by assigning one web component to each server at random. This is because in every optimal assignment each server needs to have at least one web component, given $n \ge m$. We then partition the remaining web components into $m$ parts. This is done by numbering the remaining $n - m$ components in sequence, and using a partition function to partition the integers 1 to $n - m$ into $m$ parts. We then look up the corresponding component for each integer and replace the integer with the component. For each partition assignment, we record the maximum load on any server. We then attempted to identify the partition with the least maximum load, and thus the optimal value. This attempt was not successful: owing to the number of partitions, the computation time was extremely long and the program simply crashed.

Our second attempt was to first generate an optimal case, and work backwards to load all servers with loads that sum to this optimal value. We then scramble the arrangement and let our program assign the web components. Doing so guarantees that we know the optimal value, namely the load on every server. We then work backwards to figure out what web components are on each server. This is achieved by generating a series of random numbers from 0 to the optimal load, and taking the differences between successive numbers. For instance, if we decide that the optimal load is 10, and the random numbers representing slices are 2, 7, 3, 4, we first add the numbers 0 and 10 to the series and sort it in descending order. The series now becomes 10, 7, 4, 3, 2, 0. We then take the differences of successive numbers to obtain the sizes of the web components. In this particular case, we have $10 - 7 = 3$, $7 - 4 = 3$, $4 - 3 = 1$, $3 - 2 = 1$ and $2 - 0 = 2$. The sizes of the web components assigned to that server are then 3, 3, 1, 1, 2. We then remove all web components from all servers and put them in the data file in arbitrary order.

The test file is then read in by our programs, and the loads on the most heavily loaded servers are recorded and compared with the optimal solution that we have pre-generated. We present the results obtained in the following section.

3.6 Test Result

We would first like to introduce the parameters we used to obtain the results. For each test, we pre-generate $O$, the optimal value. For simplicity we set it to 100 for all the test cases. $m$ is the number of servers, and $n_{cap}$ is the maximum number of web components a server can have during the generation of the optimal result.
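The generator just described, with its parameters $O$, $m$ and $n_{cap}$, can be sketched as follows. This is an illustrative reconstruction, not the generator of Appendix C.8: the names `TestCaseSketch`, `slice` and `generate` are hypothetical, and drawing the number of components per server uniformly from 1 to $n_{cap}$ is our own assumption, since the report only states that $n_{cap}$ is the maximum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the "work backwards from a known optimum" generator.
class TestCaseSketch {

    /** Cut the interval [0, optimal] at the given points; the gaps are the component loads. */
    static double[] slice(double optimal, double[] cuts) {
        double[] points = Arrays.copyOf(cuts, cuts.length + 2);
        points[cuts.length] = 0.0;
        points[cuts.length + 1] = optimal;
        Arrays.sort(points); // ascending order
        double[] loads = new double[points.length - 1];
        // Walk down from the largest point so the loads appear in the
        // same order as in the worked example (10-7, 7-4, ...).
        for (int i = 0; i < loads.length; i++) {
            loads[i] = points[points.length - 1 - i] - points[points.length - 2 - i];
        }
        return loads;
    }

    /** Build a shuffled component list whose optimal makespan is `optimal` on m servers. */
    static List<Double> generate(double optimal, int m, int nCap, Random rng) {
        List<Double> all = new ArrayList<>();
        for (int server = 0; server < m; server++) {
            int count = 1 + rng.nextInt(nCap); // assumption: uniform in 1..nCap
            double[] cuts = new double[count - 1];
            for (int c = 0; c < cuts.length; c++) {
                cuts[c] = rng.nextDouble() * optimal;
            }
            for (double load : slice(optimal, cuts)) {
                all.add(load);
            }
        }
        Collections.shuffle(all, rng); // scramble before writing to the data file
        return all;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(slice(10, new double[] {2, 7, 3, 4})));
        System.out.println(generate(100, 4, 4, new Random(1)).size() + " components generated");
    }
}
```

Because every server is filled to exactly $O$ before scrambling, the total load is $mO$ and no assignment can have a makespan below $O$, so $O$ is the optimal value for the generated instance.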
The reason why we introduced $n_{cap}$ is that we wish to introduce some large web components into servers deliberately, to test the correctness of the algorithms and to examine how they perform when loads are not well balanced. $n_{cap}$ essentially controls how close the web component sizes are to one another. For instance, when $n_{cap}$ is small, the chance of having web components with a large load is high, and vice versa.

For each test run, we run 200 tests using the same $O$, $m$ and $n_{cap}$. We record $T$, the approximated solution generated by GBalance and SBalance. We then compute $A_R$, the average value of the load on the most heavily loaded server, and $M_R$, the maximum value of the load on the most heavily loaded server, obtained using the same $m$ and $n_{cap}$ over the 200 tests. Lastly we compute the average error ratio $A_E$ and the maximum error ratio $M_E$, using separate sets of tests.

We start our test by setting $m$ to 4 and $n_{cap}$ to 4. We then increase $m$ and $n_{cap}$ and repeat the experiment. The results are presented in the tables in Appendix B.1. Figure 3.1 is a plot of the makespan generated using these two algorithms against $\log n_{cap}$, obtained using $m = 64$.

Figure 3.1: Average Output Obtained Using m = 64

3.7 Analysis and Comparison

Our implementations of both algorithms achieved the stated ratios on all our test cases. The error bound of 2 is valid for GBalance and the error bound of $\frac{3}{2}$ is valid for SBalance.

SBalance takes longer to run in practice. However, even with the largest set of data in our test cases (256 servers, 256 slices), the practical run time difference between the two algorithms is not very significant, as SBalance on average takes only about 1 to 2 seconds more to execute.

Theoretically, GBalance could produce better results in certain cases; however, such situations seem rare, as we did not observe such behavior in any of our tests. In our test cases, the results produced by SBalance are consistently superior to the results produced by GBalance.

Initially we thought that as the number of slices goes up, the web component sizes would become smaller and more even, thus helping GBalance to obtain better results. However, this turned out not to be the case when the results were analyzed. Neither the number of slices nor the number of servers seems to affect the performance of SBalance significantly. SBalance consistently produced near-perfect results, while the performance of GBalance is crippled by the presence of any large web component. Our results showed that as $n_{cap}$ or $m$ goes up, the total number of web components increases, and the chance of getting a large web component from these setups also increases. Since the makespan measures only the load on the most heavily loaded server, a single badly placed large component is enough to raise it, and we found that the performance of GBalance deteriorates as $m$ and $n_{cap}$ increase. Figure 3.1 is an illustration of this claim. The results of both algorithms tend to be better when $m \ll n_{cap}$.
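A standard worst-case instance makes the sensitivity of GBalance to one large component concrete. It is a textbook-style example and not one of the report's test cases; $T^{*}$ denotes the optimal makespan, a symbol introduced here for illustration.

```latex
\begin{align*}
\text{Input: }    & m(m-1) \text{ components of load } 1,
                    \text{ followed by one component of load } m.\\
\text{GBalance: } & \text{the unit loads fill every server to } m-1,
                    \text{ so } T = (m-1) + m = 2m-1.\\
\text{SBalance: } & \text{the component of load } m \text{ is placed first,
                    and the unit loads fill the other } m-1 \text{ servers to } m,
                    \text{ so } T = m = T^{*}.\\
\text{Ratio for GBalance: } & \frac{T}{T^{*}} = \frac{2m-1}{m} = 2 - \frac{1}{m}.
\end{align*}
```

The ratio approaches the bound of 2 as $m$ grows, which is consistent with the observation above that GBalance degrades when a large component arrives after the servers have already been filled evenly.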

CHAPTER 4

Load Rebalancing Problem

4.1 The Load Rebalancing Problem

Although the greedy algorithms introduced in the previous chapter help us to balance static loads on multiple servers, the assumptions we made were too simple for realistic situations. There are several important issues not addressed in the previous model, and we would like to discuss two of them.

First of all, the loads on servers are generally not static but dynamic. There are several ways in which the load might change: the size of a web component might change over time, for example, a dynamic page in a web forum is likely to increase in size; a component might get more or fewer hits over time, contributing to the change in load; lastly, a component could be created or destroyed, for instance, a user might upload or delete a file stored on a server. As the loads on servers change over time, the load becomes unbalanced again, some servers might become overloaded, and we are faced with the problem of not utilizing resources efficiently.

Secondly, moving web components across servers can be an expensive procedure. For example, removing an HTML page and hosting it on another server would require downtime for server maintenance, changing the HTML links on other pages that refer to it, or changing the address of the component in a mapping.

The Load Rebalancing Problem was introduced by S. Keshav of Ensim Corporation [5], precisely because corporations are facing the problem of changing loads on servers over time. We borrow the concept of makespan to help us visualize the problem better. The makespan is again the load on the most heavily loaded server. The problem we try to solve is the following: given the loads on different servers, what is the minimal makespan that can be achieved with at most $k$ moves? A more formal definition is the following.
Given an assignment of the $n$ web components to $m$ servers, and a positive integer $k$, relocate no more than $k$ web components so as to minimize the maximum load on a server.

Again, we define the load of a web component to be its size multiplied by its number of hits, and the load of a server to be the total load of the web components it hosts.

If we are able to approximate this problem, then we would be able to give a near-optimal move strategy to redistribute load onto servers more evenly, and achieve a faster overall delivery time.

Two algorithms will be implemented for this purpose, namely GRebalance and SRebalance. Both algorithms run in polynomial time; GRebalance achieves a guaranteed ratio of $2 - \frac{1}{m}$, while SRebalance achieves a better guaranteed ratio of $\frac{3}{2}$.

4.2 Algorithm GRebalance

This simple algorithm is very similar to the SBalance algorithm. It is a simple variant of Graham's greedy heuristic [7] and yields a 2-approximation, as described by Shmoys and Tardos [6].

Algorithm GRebalance (input: k - number of moves, P - list of servers with loads; output: BP - list of servers with loads)
    For i in 1 to k
        Remove the largest single web component from the currently most loaded server. (Step 1)
    End for
    For i in 1 to k
        Place the next of the k removed web components onto the currently minimum-loaded server. (Step 2)
    End for
    Output BP

Sorting the web components in order of decreasing load takes $O(n \log n)$ time, and reinserting the removed web components takes $O(k \log m)$ time. Since we are interested in the non-trivial case in which $m < n$, we have a total running time of $O(n \log n)$. The mathematical proof of the error bound and run time can be found in Aggarwal et al. [4].

4.3 Algorithm SRebalance

This algorithm was originally presented by Aggarwal et al. [4], and takes a more complicated approach than the GRebalance algorithm. To formalize it better, we begin by presenting some definitions used by its original authors.

Definition 1. Web components of size strictly greater than $\frac{1}{2} OPT$ are called large; the rest are called small. Let $L_T$ denote the total number of large web components. $m_L$ is the number of servers with at least one large web component; then $L_E = L_T - m_L$ denotes the number of extra large web components on this set of servers. A server is large-free if it does not currently have a large web component assigned to it.

We first look at an algorithm that does the rebalancing given the optimal value. This algorithm is called Partition; it has an error bound of 1.5 and makes at most the number of moves needed by the optimal algorithm. At this stage it does not enforce the restriction on the number of moves. Later on we will describe a method that does away with having to input an optimal value, and that enforces the restriction on the number of moves.

Algorithm Partition

1. From each of the $m_L$ servers which has a large web component, remove all large web components except for the smallest large web component therein.

2. Calculate for each server $i$ the following values with respect to its current configuration:
   $a_i$: the minimum number of small web components to be removed so that the total size of the remaining small web components is at most $\frac{1}{2} OPT$;
   $b_i$: the minimum number of web components (including any large web components) to be removed so that the total size of the remaining web components (including any large web components) is at most $OPT$;
   $c_i = a_i - b_i$.

3. Select the $L_T$ servers with the smallest values of $c_i$, breaking ties by giving preference to the servers containing large web components. Remove the $a_i$ small web components from the selected servers, thereby ensuring that the total size of the remaining small web components on these servers is at most $\frac{1}{2} OPT$.

4. From each of the remaining $m - L_T$ servers, remove the $b_i$ web components.
   Large web components, if any are removed in this step, need to be reassigned. Assign each of the removed large web components (arbitrarily) to distinct large-free servers created in step 3.

5. Arbitrarily assign the large web components removed in step 1 to the remaining large-free servers.

6. For the small web components removed in steps 3 and 4, assign them one by one to the current minimum-load server.

To do without the optimal value, one key observation is that $L_T$, $a_i$ or $b_i$ is affected only when $OPT$ crosses some threshold value. For instance, the value of $L_T$ changes only when the value of $\frac{1}{2} OPT$ crosses the load $p_j$ of some web component. Similarly, we can obtain the threshold values for $a_i$ and $b_i$. The set of thresholds of $a_i$ and $b_i$ over all servers, combined with $2p_j$ for all web components, gives the threshold values of $OPT$. Given that, it is sufficient to implement SRebalance.

LEMMA 1. Enumerate, in increasing order, all threshold values for each server $i$ with respect to $L_T$, $a_i$ and $b_i$. Then $L_T$, $a_i$ and $b_i$ remain unchanged for $OPT$ varying between two consecutive threshold values.

Algorithm SRebalance

1. Use the average load as the starting guess for $OPT$.

2. Calculate the corresponding $L_T$, $L_E$, $a_i$, $b_i$, $c_i$ values using Partition. Let $k_b$ be the total number of moves needed by this algorithm.

3. While $k_b > k$ do
       Increase the guessed value of $OPT$ to the next threshold value
       Recalculate the corresponding values for $L_T$, $L_E$, $a_i$, $b_i$, $c_i$
   End While

4. Return the result produced by the last execution of Partition.

The error bound for this algorithm is $\frac{3}{2}$, and the run time is $O(n \log n)$. The mathematical proof of the error bound and run time can be found in Aggarwal et al. [4].

4.4 Implementations

We again chose to program in the Java programming language [9], as we would like to model web components and servers using an object-oriented approach.

We would like to implement the algorithms in such a way that, given the list of servers with the current size and hits of the web components each is hosting, we would
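
The two steps of GRebalance from Section 4.2 can be sketched as below. This is a minimal illustration under our own assumptions and not the code of Appendix C.6: the class and method names are hypothetical, each server is simply a mutable list of component loads, the most and least loaded servers are found by linear scans rather than with the sorted structures that give the $O(n \log n)$ bound, and the removed components are reinserted largest first, an ordering that the pseudocode does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of GRebalance; representation and names are assumptions.
class GRebalanceSketch {

    /** Load of a server: the sum of the loads of the components it hosts. */
    static double load(List<Double> server) {
        double sum = 0.0;
        for (double c : server) {
            sum += c;
        }
        return sum;
    }

    /** Index of the most loaded server (heaviest = true) or the least loaded one. */
    static int pick(List<List<Double>> servers, boolean heaviest) {
        int best = 0;
        for (int i = 1; i < servers.size(); i++) {
            double cur = load(servers.get(i));
            double ref = load(servers.get(best));
            if (heaviest ? cur > ref : cur < ref) {
                best = i;
            }
        }
        return best;
    }

    /** Rebalance in place using at most k moves. */
    static void gRebalance(List<List<Double>> servers, int k) {
        List<Double> removed = new ArrayList<>();
        // Step 1: k times, take the largest component off the most loaded server.
        for (int move = 0; move < k; move++) {
            List<Double> heaviest = servers.get(pick(servers, true));
            if (heaviest.isEmpty()) {
                break; // nothing left to remove anywhere
            }
            Double largest = Collections.max(heaviest);
            heaviest.remove(largest); // remove(Object): drops one occurrence
            removed.add(largest);
        }
        // Step 2: put each removed component on the currently least loaded server.
        removed.sort(Collections.reverseOrder()); // assumption: largest first
        for (Double c : removed) {
            servers.get(pick(servers, false)).add(c);
        }
    }

    static double makespan(List<List<Double>> servers) {
        return load(servers.get(pick(servers, true)));
    }

    public static void main(String[] args) {
        List<List<Double>> servers = new ArrayList<>();
        servers.add(new ArrayList<>(Arrays.asList(5.0, 4.0, 3.0)));
        servers.add(new ArrayList<>(Arrays.asList(1.0)));
        servers.add(new ArrayList<>(Arrays.asList(2.0)));
        gRebalance(servers, 2);
        System.out.println("Makespan after rebalancing: " + makespan(servers));
    }
}
```

Note that a removed component may be placed back on the server it came from if that server has become the least loaded one; in that case fewer than $k$ components actually change servers.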

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm. Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three

More information

Big Data & Scripting storage networks and distributed file systems

Big Data & Scripting storage networks and distributed file systems Big Data & Scripting storage networks and distributed file systems 1, 2, in the remainder we use networks of computing nodes to enable computations on even larger datasets for a computation, each node

More information

Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

More information

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling

Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

More information

SIMS 255 Foundations of Software Design. Complexity and NP-completeness

SIMS 255 Foundations of Software Design. Complexity and NP-completeness SIMS 255 Foundations of Software Design Complexity and NP-completeness Matt Welsh November 29, 2001 mdw@cs.berkeley.edu 1 Outline Complexity of algorithms Space and time complexity ``Big O'' notation Complexity

More information

Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation: CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm

More information

arxiv:1112.0829v1 [math.pr] 5 Dec 2011

arxiv:1112.0829v1 [math.pr] 5 Dec 2011 How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly

More information

JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS. Received December May 12, 2003; revised February 5, 2004

JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS. Received December May 12, 2003; revised February 5, 2004 Scientiae Mathematicae Japonicae Online, Vol. 10, (2004), 431 437 431 JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS Ondřej Čepeka and Shao Chin Sung b Received December May 12, 2003; revised February

More information

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm. Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

More information

Near Optimal Solutions

Near Optimal Solutions Near Optimal Solutions Many important optimization problems are lacking efficient solutions. NP-Complete problems unlikely to have polynomial time solutions. Good heuristics important for such problems.

More information

Classification - Examples

Classification - Examples Lecture 2 Scheduling 1 Classification - Examples 1 r j C max given: n jobs with processing times p 1,...,p n and release dates r 1,...,r n jobs have to be scheduled without preemption on one machine taking

More information

Compact Representations and Approximations for Compuation in Games

Compact Representations and Approximations for Compuation in Games Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

More information

Fairness in Routing and Load Balancing

Fairness in Routing and Load Balancing Fairness in Routing and Load Balancing Jon Kleinberg Yuval Rabani Éva Tardos Abstract We consider the issue of network routing subject to explicit fairness conditions. The optimization of fairness criteria

More information

Performance Comparison of Server Load Distribution with FTP and HTTP

Performance Comparison of Server Load Distribution with FTP and HTTP Performance Comparison of Server Load Distribution with FTP and HTTP Yogesh Chauhan Assistant Professor HCTM Technical Campus, Kaithal Shilpa Chauhan Research Scholar University Institute of Engg & Tech,

More information

Classification - Examples -1- 1 r j C max given: n jobs with processing times p 1,..., p n and release dates

Classification - Examples -1- 1 r j C max given: n jobs with processing times p 1,..., p n and release dates Lecture 2 Scheduling 1 Classification - Examples -1-1 r j C max given: n jobs with processing times p 1,..., p n and release dates r 1,..., r n jobs have to be scheduled without preemption on one machine

More information

NP-Completeness and Cook s Theorem

NP-Completeness and Cook s Theorem NP-Completeness and Cook s Theorem Lecture notes for COM3412 Logic and Computation 15th January 2002 1 NP decision problems The decision problem D L for a formal language L Σ is the computational task:

More information

Offline sorting buffers on Line

Offline sorting buffers on Line Offline sorting buffers on Line Rohit Khandekar 1 and Vinayaka Pandit 2 1 University of Waterloo, ON, Canada. email: rkhandekar@gmail.com 2 IBM India Research Lab, New Delhi. email: pvinayak@in.ibm.com

More information

Tutorial 8. NP-Complete Problems

Tutorial 8. NP-Complete Problems Tutorial 8 NP-Complete Problems Decision Problem Statement of a decision problem Part 1: instance description defining the input Part 2: question stating the actual yesor-no question A decision problem

More information

14.1 Rent-or-buy problem

14.1 Rent-or-buy problem CS787: Advanced Algorithms Lecture 14: Online algorithms We now shift focus to a different kind of algorithmic problem where we need to perform some optimization without knowing the input in advance. Algorithms

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 27 Approximation Algorithms Load Balancing Weighted Vertex Cover Reminder: Fill out SRTEs online Don t forget to click submit Sofya Raskhodnikova 12/6/2011 S. Raskhodnikova;

More information

The Conference Call Search Problem in Wireless Networks

The Conference Call Search Problem in Wireless Networks The Conference Call Search Problem in Wireless Networks Leah Epstein 1, and Asaf Levin 2 1 Department of Mathematics, University of Haifa, 31905 Haifa, Israel. lea@math.haifa.ac.il 2 Department of Statistics,

More information

Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling. Tim Nieberg Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

More information

Duplicating and its Applications in Batch Scheduling

Duplicating and its Applications in Batch Scheduling Duplicating and its Applications in Batch Scheduling Yuzhong Zhang 1 Chunsong Bai 1 Shouyang Wang 2 1 College of Operations Research and Management Sciences Qufu Normal University, Shandong 276826, China

More information

1 Approximating Set Cover

1 Approximating Set Cover CS 05: Algorithms (Grad) Feb 2-24, 2005 Approximating Set Cover. Definition An Instance (X, F ) of the set-covering problem consists of a finite set X and a family F of subset of X, such that every elemennt

More information

CAD Algorithms. P and NP

CAD Algorithms. P and NP CAD Algorithms The Classes P and NP Mohammad Tehranipoor ECE Department 6 September 2010 1 P and NP P and NP are two families of problems. P is a class which contains all of the problems we solve using

More information

NP-Completeness I. Lecture 19. 19.1 Overview. 19.2 Introduction: Reduction and Expressiveness

NP-Completeness I. Lecture 19. 19.1 Overview. 19.2 Introduction: Reduction and Expressiveness Lecture 19 NP-Completeness I 19.1 Overview In the past few lectures we have looked at increasingly more expressive problems that we were able to solve using efficient algorithms. In this lecture we introduce

More information

A Note on Maximum Independent Sets in Rectangle Intersection Graphs

A Note on Maximum Independent Sets in Rectangle Intersection Graphs A Note on Maximum Independent Sets in Rectangle Intersection Graphs Timothy M. Chan School of Computer Science University of Waterloo Waterloo, Ontario N2L 3G1, Canada tmchan@uwaterloo.ca September 12,

More information

The Trip Scheduling Problem

The Trip Scheduling Problem The Trip Scheduling Problem Claudia Archetti Department of Quantitative Methods, University of Brescia Contrada Santa Chiara 50, 25122 Brescia, Italy Martin Savelsbergh School of Industrial and Systems

More information

Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints

Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints Michael Bauer, Srinivasan Ravichandran University of Wisconsin-Madison Department of Computer Sciences {bauer, srini}@cs.wisc.edu

More information

Load Balancing. Load Balancing 1 / 24

Load Balancing. Load Balancing 1 / 24 Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait

More information

Topic: Greedy Approximations: Set Cover and Min Makespan Date: 1/30/06

Topic: Greedy Approximations: Set Cover and Min Makespan Date: 1/30/06 CS880: Approximations Algorithms Scribe: Matt Elder Lecturer: Shuchi Chawla Topic: Greedy Approximations: Set Cover and Min Makespan Date: 1/30/06 3.1 Set Cover The Set Cover problem is: Given a set of

More information

Load Balancing in Distributed Web Server Systems With Partial Document Replication

Load Balancing in Distributed Web Server Systems With Partial Document Replication Load Balancing in Distributed Web Server Systems With Partial Document Replication Ling Zhuo, Cho-Li Wang and Francis C. M. Lau Department of Computer Science and Information Systems The University of

More information

Minimal Cost Reconfiguration of Data Placement in a Storage Area Network

Minimal Cost Reconfiguration of Data Placement in a Storage Area Network Minimal Cost Reconfiguration of Data Placement in a Storage Area Network Hadas Shachnai Gal Tamir Tami Tamir Abstract Video-on-Demand (VoD) services require frequent updates in file configuration on the

More information

Approximation Algorithms. Scheduling. Approximation algorithms. Scheduling jobs on a single machine

Approximation Algorithms. Scheduling. Approximation algorithms. Scheduling jobs on a single machine Approximation algorithms Approximation Algorithms Fast. Cheap. Reliable. Choose two. NP-hard problems: choose 2 of optimal polynomial time all instances Approximation algorithms. Trade-off between time

More information

Performance evaluation of Web Information Retrieval Systems and its application to e-business

Performance evaluation of Web Information Retrieval Systems and its application to e-business Performance evaluation of Web Information Retrieval Systems and its application to e-business Fidel Cacheda, Angel Viña Departament of Information and Comunications Technologies Facultad de Informática,

More information

The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

More information

Quantum and Non-deterministic computers facing NP-completeness

Quantum and Non-deterministic computers facing NP-completeness Quantum and Non-deterministic computers facing NP-completeness Thibaut University of Vienna Dept. of Business Administration Austria Vienna January 29th, 2013 Some pictures come from Wikipedia Introduction

More information

Single machine parallel batch scheduling with unbounded capacity

Single machine parallel batch scheduling with unbounded capacity Workshop on Combinatorics and Graph Theory 21th, April, 2006 Nankai University Single machine parallel batch scheduling with unbounded capacity Yuan Jinjiang Department of mathematics, Zhengzhou University

More information

R u t c o r Research R e p o r t. A Method to Schedule Both Transportation and Production at the Same Time in a Special FMS.

R u t c o r Research R e p o r t. A Method to Schedule Both Transportation and Production at the Same Time in a Special FMS. R u t c o r Research R e p o r t A Method to Schedule Both Transportation and Production at the Same Time in a Special FMS Navid Hashemian a Béla Vizvári b RRR 3-2011, February 21, 2011 RUTCOR Rutgers

More information

Approximated Distributed Minimum Vertex Cover Algorithms for Bounded Degree Graphs

Approximated Distributed Minimum Vertex Cover Algorithms for Bounded Degree Graphs Approximated Distributed Minimum Vertex Cover Algorithms for Bounded Degree Graphs Yong Zhang 1.2, Francis Y.L. Chin 2, and Hing-Fung Ting 2 1 College of Mathematics and Computer Science, Hebei University,

More information

Security-Aware Beacon Based Network Monitoring

Security-Aware Beacon Based Network Monitoring Security-Aware Beacon Based Network Monitoring Masahiro Sasaki, Liang Zhao, Hiroshi Nagamochi Graduate School of Informatics, Kyoto University, Kyoto, Japan Email: {sasaki, liang, nag}@amp.i.kyoto-u.ac.jp

More information

A binary search algorithm for a special case of minimizing the lateness on a single machine

A binary search algorithm for a special case of minimizing the lateness on a single machine Issue 3, Volume 3, 2009 45 A binary search algorithm for a special case of minimizing the lateness on a single machine Nodari Vakhania Abstract We study the problem of scheduling jobs with release times

More information

Scheduling Single Machine Scheduling. Tim Nieberg

Scheduling Single Machine Scheduling. Tim Nieberg Scheduling Single Machine Scheduling Tim Nieberg Single machine models Observation: for non-preemptive problems and regular objectives, a sequence in which the jobs are processed is sufficient to describe

More information

Optimizing a ëcontent-aware" Load Balancing Strategy for Shared Web Hosting Service Ludmila Cherkasova Hewlett-Packard Laboratories 1501 Page Mill Road, Palo Alto, CA 94303 cherkasova@hpl.hp.com Shankar

More information

1 st year / 2014-2015/ Principles of Industrial Eng. Chapter -3 -/ Dr. May G. Kassir. Chapter Three

1 st year / 2014-2015/ Principles of Industrial Eng. Chapter -3 -/ Dr. May G. Kassir. Chapter Three Chapter Three Scheduling, Sequencing and Dispatching 3-1- SCHEDULING Scheduling can be defined as prescribing of when and where each operation necessary to manufacture the product is to be performed. It

More information

Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations

Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Amara Keller, Martin Kelly, Aaron Todd 4 June 2010 Abstract This research has two components, both involving the

More information

A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems*

A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* Junho Jang, Saeyoung Han, Sungyong Park, and Jihoon Yang Department of Computer Science and Interdisciplinary Program

More information

Online and Offline Selling in Limit Order Markets

Online and Offline Selling in Limit Order Markets Online and Offline Selling in Limit Order Markets Kevin L. Chang 1 and Aaron Johnson 2 1 Yahoo Inc. klchang@yahoo-inc.com 2 Yale University ajohnson@cs.yale.edu Abstract. Completely automated electronic

More information

1. Comments on reviews a. Need to avoid just summarizing web page asks you for:

1. Comments on reviews a. Need to avoid just summarizing web page asks you for: 1. Comments on reviews a. Need to avoid just summarizing web page asks you for: i. A one or two sentence summary of the paper ii. A description of the problem they were trying to solve iii. A summary of

More information

Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

More information

Efficient Fault-Tolerant Infrastructure for Cloud Computing

Efficient Fault-Tolerant Infrastructure for Cloud Computing Efficient Fault-Tolerant Infrastructure for Cloud Computing Xueyuan Su Candidate for Ph.D. in Computer Science, Yale University December 2013 Committee Michael J. Fischer (advisor) Dana Angluin James Aspnes

More information

A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster

A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster , pp.11-20 http://dx.doi.org/10.14257/ ijgdc.2014.7.2.02 A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster Kehe Wu 1, Long Chen 2, Shichao Ye 2 and Yi Li 2 1 Beijing

More information

Lecture 6 Online and streaming algorithms for clustering

Lecture 6 Online and streaming algorithms for clustering CSE 291: Unsupervised learning Spring 2008 Lecture 6 Online and streaming algorithms for clustering 6.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

More information

Competitive Analysis of QoS Networks

Competitive Analysis of QoS Networks Competitive Analysis of QoS Networks What is QoS? The art of performance analysis What is competitive analysis? Example: Scheduling with deadlines Example: Smoothing real-time streams Example: Overflow

More information

11. APPROXIMATION ALGORITHMS

11. APPROXIMATION ALGORITHMS 11. APPROXIMATION ALGORITHMS load balancing center selection pricing method: vertex cover LP rounding: vertex cover generalized load balancing knapsack problem Lecture slides by Kevin Wayne Copyright 2005

More information

Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem

Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem International Journal of Applied Science and Technology Vol. 3 No. 8; December 2013 Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem D. R. Aremu O. A. Gbadamosi

More information

Factoring & Primality

Factoring & Primality Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

More information

Online Scheduling with Bounded Migration

Online Scheduling with Bounded Migration Online Scheduling with Bounded Migration Peter Sanders, Naveen Sivadasan, and Martin Skutella Max-Planck-Institut für Informatik, Saarbrücken, Germany, {sanders,ns,skutella}@mpi-sb.mpg.de Abstract. Consider

More information

Job Reference Guide. SLAMD Distributed Load Generation Engine. Version 1.8.2

Job Reference Guide. SLAMD Distributed Load Generation Engine. Version 1.8.2 Job Reference Guide SLAMD Distributed Load Generation Engine Version 1.8.2 June 2004 Contents 1. Introduction...3 2. The Utility Jobs...4 3. The LDAP Search Jobs...11 4. The LDAP Authentication Jobs...22

More information

Diversity Coloring for Distributed Data Storage in Networks 1

Diversity Coloring for Distributed Data Storage in Networks 1 Diversity Coloring for Distributed Data Storage in Networks 1 Anxiao (Andrew) Jiang and Jehoshua Bruck California Institute of Technology Pasadena, CA 9115, U.S.A. {jax, bruck}@paradise.caltech.edu Abstract

More information

Batch Scheduling of Deteriorating Products

Batch Scheduling of Deteriorating Products Decision Making in Manufacturing and Services Vol. 1 2007 No. 1 2 pp. 25 34 Batch Scheduling of Deteriorating Products Maksim S. Barketau, T.C. Edwin Cheng, Mikhail Y. Kovalyov, C.T. Daniel Ng Abstract.

More information

Strategic planning in LTL logistics increasing the capacity utilization of trucks

Strategic planning in LTL logistics increasing the capacity utilization of trucks Strategic planning in LTL logistics increasing the capacity utilization of trucks J. Fabian Meier 1,2 Institute of Transport Logistics TU Dortmund, Germany Uwe Clausen 3 Fraunhofer Institute for Material

More information

Analysis of Approximation Algorithms for k-set Cover using Factor-Revealing Linear Programs

Analysis of Approximation Algorithms for k-set Cover using Factor-Revealing Linear Programs Analysis of Approximation Algorithms for k-set Cover using Factor-Revealing Linear Programs Stavros Athanassopoulos, Ioannis Caragiannis, and Christos Kaklamanis Research Academic Computer Technology Institute

More information

Lecture 4 Online and streaming algorithms for clustering

Lecture 4 Online and streaming algorithms for clustering CSE 291: Geometric algorithms Spring 2013 Lecture 4 Online and streaming algorithms for clustering 4.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

More information

Week 7 - Game Theory and Industrial Organisation

Week 7 - Game Theory and Industrial Organisation Week 7 - Game Theory and Industrial Organisation The Cournot and Bertrand models are the two basic templates for models of oligopoly; industry structures with a small number of firms. There are a number

More information

CSE 4351/5351 Notes 7: Task Scheduling & Load Balancing

CSE 4351/5351 Notes 7: Task Scheduling & Load Balancing CSE / Notes : Task Scheduling & Load Balancing Task Scheduling A task is a (sequential) activity that uses a set of inputs to produce a set of outputs. A task (precedence) graph is an acyclic, directed

More information

CommuniGate Pro White Paper. Dynamic Clustering Solution. For Reliable and Scalable. Messaging

CommuniGate Pro White Paper. Dynamic Clustering Solution. For Reliable and Scalable. Messaging CommuniGate Pro White Paper Dynamic Clustering Solution For Reliable and Scalable Messaging Date April 2002 Modern E-Mail Systems: Achieving Speed, Stability and Growth E-mail becomes more important each

More information

The Relative Worst Order Ratio for On-Line Algorithms

The Relative Worst Order Ratio for On-Line Algorithms The Relative Worst Order Ratio for On-Line Algorithms Joan Boyar 1 and Lene M. Favrholdt 2 1 Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark, joan@imada.sdu.dk

More information

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania

Moral Hazard. Itay Goldstein. Wharton School, University of Pennsylvania Moral Hazard Itay Goldstein Wharton School, University of Pennsylvania 1 Principal-Agent Problem Basic problem in corporate finance: separation of ownership and control: o The owners of the firm are typically

More information

New Hash Function Construction for Textual and Geometric Data Retrieval

New Hash Function Construction for Textual and Geometric Data Retrieval Latest Trends on Computers, Vol., pp.483-489, ISBN 978-96-474-3-4, ISSN 79-45, CSCC conference, Corfu, Greece, New Hash Function Construction for Textual and Geometric Data Retrieval Václav Skala, Jan

More information

Distributed Load Balancing for Machines Fully Heterogeneous

Distributed Load Balancing for Machines Fully Heterogeneous Internship Report 2 nd of June - 22 th of August 2014 Distributed Load Balancing for Machines Fully Heterogeneous Nathanaël Cheriere nathanael.cheriere@ens-rennes.fr ENS Rennes Academic Year 2013-2014

More information

20 Selfish Load Balancing

20 Selfish Load Balancing 20 Selfish Load Balancing Berthold Vöcking Abstract Suppose that a set of weighted tasks shall be assigned to a set of machines with possibly different speeds such that the load is distributed evenly among

More information

Improved Algorithms for Data Migration

Improved Algorithms for Data Migration Improved Algorithms for Data Migration Samir Khuller 1, Yoo-Ah Kim, and Azarakhsh Malekian 1 Department of Computer Science, University of Maryland, College Park, MD 20742. Research supported by NSF Award

More information

Guessing Game: NP-Complete?

Guessing Game: NP-Complete? Guessing Game: NP-Complete? 1. LONGEST-PATH: Given a graph G = (V, E), does there exists a simple path of length at least k edges? YES 2. SHORTEST-PATH: Given a graph G = (V, E), does there exists a simple

More information

The Top 20 VMware Performance Metrics You Should Care About

The Top 20 VMware Performance Metrics You Should Care About The Top 20 VMware Performance Metrics You Should Care About Why you can t ignore them and how they can help you find and avoid problems. WHITEPAPER BY ALEX ROSEMBLAT Table of Contents Introduction... 3

More information

Big Data: A Geometric Explanation of a Seemingly Counterintuitive Strategy

Big Data: A Geometric Explanation of a Seemingly Counterintuitive Strategy Big Data: A Geometric Explanation of a Seemingly Counterintuitive Strategy Olga Kosheleva and Vladik Kreinovich University of Texas at El Paso 500 W. University El Paso, TX 79968, USA olgak@utep.edu, vladik@utep.edu

More information

THE SCHEDULING OF MAINTENANCE SERVICE

THE SCHEDULING OF MAINTENANCE SERVICE THE SCHEDULING OF MAINTENANCE SERVICE Shoshana Anily Celia A. Glass Refael Hassin Abstract We study a discrete problem of scheduling activities of several types under the constraint that at most a single

More information

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar

Complexity Theory. IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Complexity Theory IE 661: Scheduling Theory Fall 2003 Satyaki Ghosh Dastidar Outline Goals Computation of Problems Concepts and Definitions Complexity Classes and Problems Polynomial Time Reductions Examples

More information

Analysis of Micromouse Maze Solving Algorithms

Analysis of Micromouse Maze Solving Algorithms 1 Analysis of Micromouse Maze Solving Algorithms David M. Willardson ECE 557: Learning from Data, Spring 2001 Abstract This project involves a simulation of a mouse that is to find its way through a maze.

More information

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents

More information

Joint Optimization of Overlapping Phases in MapReduce

Joint Optimization of Overlapping Phases in MapReduce Joint Optimization of Overlapping Phases in MapReduce Minghong Lin, Li Zhang, Adam Wierman, Jian Tan Abstract MapReduce is a scalable parallel computing framework for big data processing. It exhibits multiple

More information

Dynamic programming formulation

Dynamic programming formulation 1.24 Lecture 14 Dynamic programming: Job scheduling Dynamic programming formulation To formulate a problem as a dynamic program: Sort by a criterion that will allow infeasible combinations to be eli minated

More information

A class of on-line scheduling algorithms to minimize total completion time

A class of on-line scheduling algorithms to minimize total completion time A class of on-line scheduling algorithms to minimize total completion time X. Lu R.A. Sitters L. Stougie Abstract We consider the problem of scheduling jobs on-line on a single machine and on identical

More information

Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

More information

Magento & Zend Benchmarks Version 1.2, 1.3 (with & without Flat Catalogs)

Magento & Zend Benchmarks Version 1.2, 1.3 (with & without Flat Catalogs) Magento & Zend Benchmarks Version 1.2, 1.3 (with & without Flat Catalogs) 1. Foreword Magento is a PHP/Zend application which intensively uses the CPU. Since version 1.1.6, each new version includes some

More information

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing A Study on Workload Imbalance Issues in Data Intensive Distributed Computing Sven Groot 1, Kazuo Goda 1, and Masaru Kitsuregawa 1 University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Abstract.

More information

Approximability of Two-Machine No-Wait Flowshop Scheduling with Availability Constraints

Approximability of Two-Machine No-Wait Flowshop Scheduling with Availability Constraints Approximability of Two-Machine No-Wait Flowshop Scheduling with Availability Constraints T.C. Edwin Cheng 1, and Zhaohui Liu 1,2 1 Department of Management, The Hong Kong Polytechnic University Kowloon,

More information

Complexity Classes P and NP

Complexity Classes P and NP Complexity Classes P and NP MATH 3220 Supplemental Presentation by John Aleshunas The cure for boredom is curiosity. There is no cure for curiosity Dorothy Parker Computational Complexity Theory In computer

More information

Today. Intro to real-time scheduling Cyclic executives. Scheduling tables Frames Frame size constraints. Non-independent tasks Pros and cons

Today. Intro to real-time scheduling Cyclic executives. Scheduling tables Frames Frame size constraints. Non-independent tasks Pros and cons Today Intro to real-time scheduling Cyclic executives Scheduling tables Frames Frame size constraints Generating schedules Non-independent tasks Pros and cons Real-Time Systems The correctness of a real-time

More information








