Data Transfer Scheduling for Maximizing Throughput of Big-Data Computing in Cloud Systems

Ruitao Xie and Xiaohua Jia, Fellow, IEEE

Abstract—Many big-data computing applications have been deployed in cloud platforms. These applications normally demand concurrent data transfers among computing nodes for parallel processing. It is important to find the best transfer schedule leading to the least data retrieval time, in other words, the maximum throughput. However, the existing methods cannot achieve this, because they ignore link bandwidths and the diversity of data replicas and paths. In this paper, we aim to develop a max-throughput data transfer scheduling to minimize the data retrieval time of applications. Specifically, the problem is formulated into mixed integer programming, and an approximation algorithm is proposed, with its approximation ratio analyzed. Extensive simulations demonstrate that our algorithm can obtain near-optimal solutions.

Index Terms—data transfer scheduling, big-data computing, throughput maximization, data center.

1 INTRODUCTION

Many big-data computing applications have been deployed in cloud platforms, e.g., Amazon's Elastic Compute Cloud (EC2), Windows Azure, IBM Cloud, etc. In big-data computing under the MapReduce framework [1], tasks run on computing nodes in parallel. However, the data may not be stored on the same nodes that process them, for a variety of reasons: for instance, when those nodes have insufficient computing capacity, or when they are not preferred by other objectives (e.g., load balancing and energy saving). In Data Center Networks (DCN), data are usually replicated for redundancy and robustness; e.g., in HDFS, every data block has two replicas in addition to the original one [2]. Furthermore, from each data node, multiple paths are available for data transfer, sometimes all of which are shortest paths, due to path redundancy in DCN [3], [4].
It is important to select the best node and the best path to retrieve a non-local data object. This is the data retrieval problem. Different selections of nodes and paths may result in different data retrieval times. It is important to find the selection leading to the least data retrieval time, because a long data retrieval time of a computing task may result in a long completion time for the application to which this task belongs. However, the existing method of retrieving data, which is used in current HDFS systems and DCN, cannot obtain the least data retrieval time. In the existing method, when a non-local data object is required, a request is sent to any one of the closest replicas [2]. Then, the data is transferred from the selected node through any one of the shortest paths, determined by routing protocols like Equal-Cost Multipath Routing (ECMP) [5] or per-flow Valiant Load Balancing (VLB) [4]. It is noted that many tasks retrieve data concurrently. This method may result in heavy congestion on some links, leading to long data retrieval times, because it ignores link bandwidths and the overlaps of selected nodes and paths. Some researchers proposed flow scheduling systems to avoid path collisions [6], [7]. However, they only exploited path diversity but not data replica diversity. To minimize the data retrieval time (i.e., to maximize the throughput) of an application consisting of concurrent tasks, we propose a max-throughput data transfer scheduling, utilizing both replica and path diversities. In our method, the problem is formulated into mixed integer programming, and an approximation algorithm is proposed, with its approximation ratio analyzed. We also solve the data retrieval problem for the case of multiple applications. Our simulations demonstrate that the approximation results are almost as good as the optimal solutions.

(Ruitao Xie and Xiaohua Jia are with the Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.)
We also show that the availability of a small number of additional data replicas can be greatly beneficial in many cases, regardless of path diversity. The rest of this paper is organized as follows. Section 2 presents the overview and the motivation of the data retrieval problem. Section 3 presents the problem formulations, for the scenarios of a single application and multiple applications separately. Section 4 presents an approximation algorithm and the analysis of its approximation ratio. Simulations and performance evaluations are presented in Section 5. Section 6 presents related work on data retrieval in the cloud. Section 7 concludes the paper.

2 PROBLEM OVERVIEW AND MOTIVATION

In the cloud, although data is distributed among computing nodes, not all data can be obtained locally, so some nodes may need to retrieve data from distant nodes. A requested data object can be retrieved from one of the nodes where its replica is stored. When a node is chosen for data retrieval, a path from it to the requesting node needs to be specified for

data transfer. A reasonable choice would be the shortest path (in terms of the number of hops). However, there may exist multiple shortest paths, so one of them must be selected. It is noted that we select only one node and one path for each requested data object, because otherwise it would result in multipath TCP, which suffers from high jitter and is not widely deployed in DCN yet. A naive method is to select nodes and paths randomly, but it may result in heavy congestion on some links, leading to long data retrieval times, because it ignores link bandwidths and the overlaps of selected paths and nodes. For example, consider the case of a single application in the topology shown in Fig. 1, where two data objects ($a$ and $b$) are stored with a replication factor of 3, and each link has a bandwidth of 1 data object per second. Note that it takes at least 1 second to transfer a data object between any two nodes. Suppose that, at the same time, node $v_a$ is about to retrieve data $a$ and $v_b$ is about to retrieve data $b$, both belonging to the same application; then the naive method may result in a data retrieval time of 2 seconds, while the optimal solution takes only 1 second. The naive method performs worse because both data transfers pass through some common links, which become bottlenecks (as shown in Fig. 1). The optimal solution takes the least time because, by selecting nodes and paths carefully, both data transfers pass through disjoint sets of links (as shown in Fig. 1). This motivates us to investigate the optimal data retrieval method, where nodes and paths are selected carefully. Our objective is to select nodes and paths such that the data retrieval time of an application is minimized.

Fig. 1. An example to motivate the optimal data retrieval method. Node $v_a$ is retrieving data $a$ (red dashed lines) and $v_b$ is retrieving data $b$ (green solid lines). In (a) both data transfers share common links, which carry more traffic and may lead to longer transmission times, while in (b) they pass through disjoint sets of links, resulting in a shorter data retrieval time.

Fig. 2. A typical topology of a data center network: a fat-tree topology composed of 4-port switches (core switches, aggregation switches, ToR switches, and computing nodes).

The naive method also falls short when multiple applications are running in the cloud, where different applications may have different requirements, i.e., upper bounds on their data retrieval times. As it may be impossible to satisfy all requirements, we minimize the penalties of applications. Thus our problem is to select nodes and paths for each required data object such that the penalties of applications are minimized.

3 PROBLEM FORMULATION

In this section, we formulate the data retrieval problem. We start with the case of a single application, and deal with multiple applications later.

3.1 Single Application

A Data Center Network (DCN) consists of switches and computing nodes, as illustrated in Fig. 2. The DCN is represented as a graph with nodes $V$ and edges $E$, i.e., $G = (V, E)$, where each edge represents a bi-directional link. This is a typical network configuration in DCN. $V$ consists of both computing nodes $V_C$ and switch nodes $V_X$, that is, $V = V_C \cup V_X$. Let $B_e$ denote the bandwidth of link $e$. Suppose an application processes a set of data objects $D = \{d_1, d_2, \ldots, d_k, \ldots, d_m\}$, which have been stored in computing nodes. For simplicity, we assume that all data objects have the same size $S$. A data object is replicated in multiple nodes for redundancy. Let $V_{C_k} \subseteq V_C$ denote the set of nodes that store data object $d_k$. Let $A_{jk}$ denote whether $v_j$ needs data object $d_k$ to run the tasks assigned to it. If $v_j$ requires $d_k$ (i.e., $A_{jk} = 1$), we have to select a node $v_i \in V_{C_k}$ from which $d_k$ can be retrieved. Each selection forms a flow $f^k_{ij}$, which represents the data transfer of $d_k$ from $v_i$ to $v_j$. The set of possible flows of transferring data $d_k$ to node $v_j$ is denoted by $F_{jk}$.
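As a concrete illustration, the flow sets $F_{jk}$ can be enumerated directly from the demand matrix $A_{jk}$ and the replica locations $V_{C_k}$. The sketch below is our own illustration, not from the paper; the dictionary encodings are assumptions:

```python
def enumerate_flows(A, replica_nodes):
    """Build the candidate flow sets F_jk: for each request (j, k) with
    A[(j, k)] = 1, create one flow (i, j, k) per node v_i that holds a
    replica of data object d_k."""
    F = {}
    for (j, k), needed in A.items():
        if needed:
            F[(j, k)] = [(i, j, k) for i in replica_nodes[k]]
    return F
```

With three replicas per data object, each request thus has up to three candidate flows, one per replica holder.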
Hence, the set of all flows is
$$F = \bigcup_{(j,k):\, A_{jk}=1} F_{jk}. \qquad (1)$$
Because one and only one node is selected for each retrieval, exactly one flow in $F_{jk}$ can be used. Let binary variable $x_f$ denote whether or not flow $f$ is used; then
$$\sum_{f \in F_{jk}} x_f = 1 \quad \forall (j,k) \text{ where } A_{jk} = 1. \qquad (2)$$
After selecting a flow, we have to find a path for it. To shorten data retrieval time, we use shortest paths. Let $P(f^k_{ij})$ denote the set of paths that can be used to actuate $f^k_{ij}$, which are all shortest paths from $v_i$ to $v_j$. Let binary variable $y_{fp}$

denote whether path $p$ is selected to actuate flow $f$, where $p \in P(f)$; then
$$\sum_{p \in P(f)} y_{fp} = x_f \quad \forall f \in F. \qquad (3)$$
Only one path will be selected if flow $f$ is used (i.e., $x_f = 1$), and none otherwise (i.e., $x_f = 0$). Since all flows are transferred concurrently, the data retrieval time of an application (which is the total time to complete all data transfers) is dominated by the longest data transfer time among all flows. Let $t_f$ denote the data transfer time of flow $f$; then the data retrieval time $t$ is computed as
$$t = \max\{t_f,\ f \in F\}. \qquad (4)$$
Let $r_f$ denote the sending rate of flow $f$; then the data transfer time of flow $f$ equals its transmission delay:
$$t_f = \frac{S}{r_f}. \qquad (5)$$
The other delays, such as processing delay, propagation delay and queueing delay, are all negligible, because the flows transferring data objects are large in size, e.g., 64 MB in HDFS. Ideally, the least data retrieval time could be obtained simply by setting the sending rates of all flows to the maximum values possible. However, as multiple flows may pass through the same link, which has limited bandwidth, the sending rates of these flows may not be able to reach their maximum values simultaneously. Suppose $F_e$ is the set of all flows passing through link $e$; then the aggregate data rate on link $e$ (i.e., $\sum_{f \in F_e} r_f$) is bounded by its bandwidth:
$$\sum_{f \in F_e} r_f \le B_e. \qquad (6)$$
Let $P_e$ denote the set of all paths passing through link $e$. A flow $f$ passes through link $e$ if its selected path passes through link $e$, i.e., $\exists p \in P(f) \cap P_e$ such that $y_{fp} = 1$. Thus,
$$\sum_{f \in F_e} r_f = \sum_{f \in F} \sum_{p \in P(f) \cap P_e} r_f\, y_{fp} \le B_e. \qquad (7)$$
Replacing $r_f$ in (7) with $S/t_f$, we get
$$\sum_{f \in F} \sum_{p \in P(f) \cap P_e} \frac{S}{t_f}\, y_{fp} \le B_e. \qquad (8)$$
Note that this constraint is not linear; thus, together with $t_f \le t$, we transform it into a linear one,
$$\sum_{f \in F} \sum_{p \in P(f) \cap P_e} \frac{S}{t}\, y_{fp} \le B_e. \qquad (9)$$
This new constraint means that the amount of traffic passing through each link (i.e., $\sum_{f \in F} \sum_{p \in P(f) \cap P_e} S\, y_{fp}$) is bounded by the maximum amount of data that can be transmitted within the data retrieval time (i.e., $B_e t$).
In other words, the data retrieval time is the maximum data transmission time over all links.

Theorem 1. Constraints (8) and (9) are equivalent.

Proof. To prove the equivalence, we have to prove that any feasible $t_f$ to (8) and $t$ to (4) are also feasible to (9), and vice versa. Firstly, since $t \ge t_f, \forall f \in F$, if (8) is satisfied, then (9) is also satisfied. Secondly, for any feasible $t$ to (9), we can simply set all $t_f$ to $t$; then both (4) and (8) are satisfied.

To sum up, the data retrieval problem is to select nodes and paths for all requests such that the data retrieval time is minimized. The data retrieval time is affected by those selections through the resulting amount of traffic on each link. The problem is formulated into an MIP as follows:
$$\begin{aligned} \min \quad & t & \text{(10a)} \\ \text{s.t.} \quad & \sum_{f \in F_{jk}} x_f = 1 \quad \forall (j,k) \text{ where } A_{jk} = 1 & \text{(10b)} \\ & \sum_{p \in P(f)} y_{fp} = x_f \quad \forall f \in F & \text{(10c)} \\ & \sum_{f \in F} \sum_{p \in P(f) \cap P_e} S\, y_{fp} \le B_e\, t \quad \forall e \in E & \text{(10d)} \\ & x_f \in \{0, 1\} \quad \forall f \in F & \text{(10e)} \\ & y_{fp} \in \{0, 1\} \quad \forall f \in F,\ p \in P(f) & \text{(10f)} \\ & t \ge 0. & \text{(10g)} \end{aligned}$$

3.2 Multiple Applications

In the cloud, multiple applications run on shared resources simultaneously to utilize resources more efficiently. To select nodes and paths for the requests of each application, a simple approach is to treat all applications as a single one and to solve the problem using the above model. This is equivalent to minimizing the maximum data retrieval time among all applications. However, different applications may have different requirements on their data retrieval times, and this naive approach ignores the difference in requirements. Thus, instead of minimizing the data retrieval time, we minimize a penalty. Given a set of applications $U$, suppose application $u \in U$ has an upper bound $\bar t_u$ on its data retrieval time $t_u$; penalty $c_u$ is defined as
$$c_u = \max\Big\{\frac{t_u - \bar t_u}{\bar t_u},\ 0\Big\}. \qquad (11)$$
A penalty is induced if $t_u$ exceeds its threshold, and no penalty otherwise.
As in the single-application model, $t_u$ of application $u$ is dominated by the longest data transfer time among all its flows, that is,
$$t_u = \max\{t_f,\ f \in F_u\}, \qquad (12)$$
where $F_u$ is the set of possible flows of application $u$. To maintain fairness among applications, we minimize the maximum penalty, denoted by $c$:
$$c = \max\{c_u,\ u \in U\}. \qquad (13)$$
Thus, our problem is to select nodes and paths for the requests of each application such that $c$ is minimized. As in the single-application model, the selections affect which flows pass through each link and the resulting aggregate data rate, restricted by the link bandwidth. But here the data rate on a link is aggregated over the flows in

all applications rather than a single one. Let $r^u_e$ denote the aggregate data rate of application $u$ on link $e$; then
$$\sum_{u \in U} r^u_e \le B_e. \qquad (14)$$
Following the same rule as (7), $r^u_e$ is computed as
$$r^u_e = \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} r_f\, y_{fp}. \qquad (15)$$
Recall that $r_f = S/t_f$; combining the above two constraints, we obtain
$$\sum_{u \in U} \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} \frac{S}{t_f}\, y_{fp} \le B_e. \qquad (16)$$
Note that this constraint is not linear; thus, together with (11), (12) and (13), we transform it into a linear one,
$$\sum_{u \in U} \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} \frac{S}{(c + 1)\,\bar t_u}\, y_{fp} \le B_e. \qquad (17)$$

Theorem 2. Constraints (16) and (17) are equivalent.

Proof. To prove the equivalence, we have to prove that any feasible $t_f$ and $c$ to the set of constraints (11), (12), (13) and (16) are also feasible to (17), and vice versa. Firstly, for a flow $f$ of application $u$, i.e., $f \in F_u$, its data transfer time $t_f$ must satisfy
$$t_f \le t_u \le (c_u + 1)\,\bar t_u \le (c + 1)\,\bar t_u. \qquad (18)$$
In the above deduction, the first inequality is obtained from (12), the second from (11), and the last from (13). Thus, if all $t_f$ satisfy constraint (16), then $c$ satisfies constraint (17). Secondly, for any maximum penalty $c$ satisfying (17), we can build a set of $t_f$ satisfying (16) by setting each $t_f$ to the maximum possible value $(c + 1)\,\bar t_u$ where $f \in F_u$, as follows:
$$t_f = t_u = (c_u + 1)\,\bar t_u = (c + 1)\,\bar t_u. \qquad (19)$$
That is, all flows of an application have the same data transfer time, proportional to the upper bound of the application, and all applications have the same penalty. All these results satisfy (11), (12), (13) and (16).

Due to the equivalence proved above, (17) possesses the same meaning as (14), that is, the aggregate data rate of all applications on each link is limited by its bandwidth. The difference is that in (17) we set the data transfer time of each flow to the maximum possible value, i.e., $(c + 1)\,\bar t_u$.
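As a small numeric illustration of the penalty definitions (11) and (13), assuming plain lists of per-application retrieval times and upper bounds (an encoding of our own, not from the paper):

```python
def max_penalty(times, bounds):
    """Eqs. (11) and (13): per-application penalty
    c_u = max((t_u - t̄_u) / t̄_u, 0), and the objective c = max_u c_u."""
    return max(max((t - b) / b, 0.0) for t, b in zip(times, bounds))
```

For example, an application finishing in 2 s against a 1 s bound has penalty 1, while an application finishing within its bound contributes penalty 0.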
Besides the bandwidth constraints, the same as in the case of a single application, exactly one flow can be used for each request, and exactly one path can be selected for each used flow. To sum up, for the case of multiple applications the data retrieval problem is to select nodes and paths for the requests of each application such that the maximum penalty among all applications is minimized. It is formulated into an MIP as follows:
$$\begin{aligned} \min \quad & c & \text{(20a)} \\ \text{s.t.} \quad & \sum_{f \in F_{jk}} x_f = 1 \quad \forall (j,k) \text{ where } A_{jk} = 1 & \text{(20b)} \\ & \sum_{p \in P(f)} y_{fp} = x_f \quad \forall f \in F & \text{(20c)} \\ & \sum_{u \in U} \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} \frac{S}{\bar t_u}\, y_{fp} \le B_e\,(c + 1) \quad \forall e \in E & \text{(20d)} \\ & x_f \in \{0, 1\} \quad \forall f \in F & \text{(20e)} \\ & y_{fp} \in \{0, 1\} \quad \forall f \in F,\ p \in P(f) & \text{(20f)} \\ & c \ge 0. & \text{(20g)} \end{aligned}$$

3.3 Discussion on Complexity

The data retrieval problems for both cases are NP-hard, because even when the source nodes are determined, the remaining problem of selecting paths is NP-hard. When a source node is determined for each request, a set of commodities is formed. Here we call a triple consisting of a source, a destination, and a demand (i.e., the amount of data to be routed) a commodity. For the case of a single application, given the set of commodities and a network, our problem is to compute the maximum value $1/t$ for which there is a feasible multicommodity flow in the network with all demands multiplied by $1/t$, which is a concurrent flow problem [8]. Since we have the additional restriction that each commodity must be routed on one path, the problem is an unsplittable concurrent flow problem, which is NP-hard [8]. The same holds for the case of multiple applications.

4 APPROXIMATION ALGORITHM

We propose an approximation algorithm to solve the data retrieval problem.

4.1 Max-throughput Algorithm

Given a data retrieval problem and its MIP formulation, our algorithm has three major steps: 1) solve its LP relaxation; 2) construct an integral solution from the relaxation solution using a rounding procedure; 3) analytically compute the data sending rate of each flow for scheduling.
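Steps 2) and 3) admit a direct implementation. The sketch below is our own illustration for the single-application case; the dictionary encodings of the fractional solution and the selected paths are assumptions, not from the paper:

```python
def round_solution(x_star, y_star, requests):
    """Step 2): for each request keep the flow f with the largest fractional
    x*_f, then for that flow the path p with the largest fractional y*_fp.
    The result satisfies the one-flow and one-path constraints."""
    xA = {f: 0 for f in x_star}
    yA = {fp: 0 for fp in y_star}
    for flows in requests.values():
        f_best = max(flows, key=lambda f: x_star[f])
        xA[f_best] = 1
        p_best = max((fp for fp in y_star if fp[0] == f_best),
                     key=lambda fp: y_star[fp])
        yA[p_best] = 1
    return xA, yA


def sending_rates(chosen_paths, bandwidth, S=1.0):
    """Step 3), single application: t(Y) is the largest per-link data
    transmission time, and every flow sends at the common rate S / t(Y)."""
    load = {}
    for path in chosen_paths:   # one selected path (a set of links) per flow
        for e in path:
            load[e] = load.get(e, 0.0) + S
    t = max(load[e] / bandwidth[e] for e in load)
    return t, S / t
```

For instance, if two unit-size flows share a 1-unit/s link, the bottleneck time is 2 s and each flow is throttled to rate 0.5.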
The LP relaxation can be obtained by relaxing the binary variables $x_f$ and $y_{fp}$; its optimal solution is denoted by $x^*_f$ and $y^*_{fp}$. The rounding procedure constructs an integral solution $x^A_f$ and $y^A_{fp}$ from the fractional solution $x^*_f$ and $y^*_{fp}$, changing the objective value as little as possible while keeping two sets of constraints satisfied: 1) the one-flow constraints, that is, exactly one flow is used for each request, i.e., (10b) or (20b); 2) the one-path constraints, that is, exactly one path is used for each flow, i.e., (10c) or (20c). We first select the flow with the largest $x^*_f$ for each request to construct $x^A_f$ satisfying the one-flow constraints. Then we select the path with the largest $y^*_{fp}$ for each flow in use to construct $y^A_{fp}$ satisfying the one-path constraints. With nodes and paths determined, the objective value of a data retrieval problem can be derived analytically, so

is the data sending rate of each flow. For the case of a single application, with variables $x_f$ and $y_{fp}$ determined, the MIP reduces to
$$\begin{aligned} \min \quad & t & \text{(21a)} \\ \text{s.t.} \quad & \sum_{f \in F} \sum_{p \in P(f) \cap P_e} S\, y_{fp} \le B_e\, t \quad \forall e \in E & \text{(21b)} \\ & t \ge 0. & \text{(21c)} \end{aligned}$$
Let $Y$ denote the set of $y_{fp}$; then we have
$$t(Y) = \max\Big\{\frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_e} S\, y_{fp}}{B_e},\ e \in E\Big\}, \qquad (22)$$
that is, the largest data transmission time among all links. Since $t_f \le t(Y)$, we have $r_f = S/t_f \ge S/t(Y)$. To ensure the bandwidth constraints, the data sending rate of each flow can be set to that of the slowest flow, that is,
$$r_f = \frac{S}{t(Y)}. \qquad (23)$$
For the case of multiple applications, with variables $x_f$ and $y_{fp}$ determined, the amount of traffic of application $u$ on link $e$ (denoted by $\beta^u_e(Y)$) is also determined:
$$\beta^u_e(Y) = \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} S\, y_{fp}. \qquad (24)$$
The MIP becomes
$$\begin{aligned} \min \quad & c & \text{(25a)} \\ \text{s.t.} \quad & \sum_{u \in U} \frac{\beta^u_e(Y)}{\bar t_u} \le B_e\,(c + 1) \quad \forall e \in E & \text{(25b)} \\ & c \ge 0. & \text{(25c)} \end{aligned}$$
Then we have
$$c(Y) = \max\Big\{0,\ \max\Big\{\frac{\sum_{u \in U} \beta^u_e(Y)/\bar t_u}{B_e},\ e \in E\Big\} - 1\Big\}. \qquad (26)$$
To ensure the bandwidth constraints, the data sending rate of each flow can be set according to $t_f = (c(Y) + 1)\,\bar t_u$, where $f \in F_u$, following the analysis in Theorem 2.

Our algorithm has polynomial complexity. First, it takes polynomial time to solve the LP relaxation in the first step, since a data retrieval problem yields a polynomial number of variables and constraints in its LP formulation, and an LP can be solved in polynomial time [9]. Second, the rounding procedure in the second step takes polynomial time, since each $x$ is compared once, and so is each $y$. Finally, the last analytical step takes $O(|E|)$ time to compute (22) or (26).

4.2 Analysis of the Approximation Ratio

Next, we analyze the approximation ratio of the above algorithm.

4.2.1 Single Application

The above approximation algorithm has an approximation ratio of $RL$, where $R$ is the replication factor (the number of replicas) of the data, and $L$ is the largest number of candidate paths that can be used by a flow.
Let $t^A$ denote the objective value of an approximation solution, and $OPT(\mathrm{MIP})$ the optimal value of an MIP instance; then we have
$$t^A \le RL \cdot OPT(\mathrm{MIP}). \qquad (27)$$
In other words,
$$OPT(\mathrm{MIP}) \ge \frac{1}{RL}\, t^A. \qquad (28)$$
The derivation of the approximation ratio is based on two facts: 1) the optimal value of the LP relaxation (denoted by $t^*$) provides a lower bound for the optimal value of the MIP, that is,
$$OPT(\mathrm{MIP}) \ge t^*, \qquad (29)$$
since the solution space of the LP relaxation becomes larger due to relaxation; 2) the optimal fractional solution $y^*_{fp}$ and an approximation solution $y^A_{fp}$ satisfy
$$y^*_{fp} \ge \frac{1}{RL}\, y^A_{fp}. \qquad (30)$$
This can be derived from the process of building approximation solutions. In constructing $y^A_{fp}$ for flow $f$, we have
$$y^*_{fp} \ge \frac{1}{|P(f)|}\, x^*_f\, y^A_{fp}, \quad \forall p \in P(f). \qquad (31)$$
For $y^A_{fp} = 0$, (31) always holds; for $y^A_{fp} = 1$, $y^*_{fp} \ge \frac{1}{|P(f)|} x^*_f$ must hold, as otherwise the one-path constraints would be contradicted. This is because for flow $f$, the path $p$ with the largest $y^*_{fp}$ is selected, so if the selected $y^*_{fp}$ (i.e., with $y^A_{fp} = 1$) were less than $\frac{1}{|P(f)|} x^*_f$, then the $y^*_{fp}$ of every unselected path would also be less than $\frac{1}{|P(f)|} x^*_f$, and then $\sum_{p \in P(f)} y^*_{fp} < \sum_{p \in P(f)} \frac{1}{|P(f)|} x^*_f = x^*_f$, which contradicts the one-path constraint. Furthermore, in constructing $x^A_f$ we have that for a selected flow $f$ (i.e., $x^A_f = 1$),
$$x^*_f \ge \frac{1}{R}. \qquad (32)$$
Otherwise, the one-flow constraints would be contradicted. This is because for a request $q$, the flow $f$ with the largest $x^*_f$ is selected, so if the selected $x^*_f$ were less than $\frac{1}{R}$, then the $x^*_f$ of every unselected flow would also be less than $\frac{1}{R}$, and then $\sum_{f \in F(q)} x^*_f < |F(q)| \cdot \frac{1}{R} \le 1$, which contradicts the one-flow constraint. Recall that if $y^A_{fp}$ is 1, then $x^A_f$ must be 1, so when $y^A_{fp}$ is 1, (32) must be satisfied. Thus, we have
$$y^*_{fp} \ge \frac{1}{|P(f)|}\, x^*_f\, y^A_{fp} \ge \frac{1}{|P(f)|}\cdot\frac{1}{R}\, y^A_{fp} \ge \frac{1}{RL}\, y^A_{fp}. \qquad (33)$$
Based on the above two facts, we can derive the approximation ratio. Let $e^*$ denote the bottleneck link having the maximum data transmission time in the optimal fractional

solution, and $e^A$ the counterpart in an approximation solution; then we have
$$OPT(\mathrm{MIP}) \ge t^* \qquad \text{(34a)}$$
$$= \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_{e^*}} S\, y^*_{fp}}{B_{e^*}} \qquad \text{(34b)}$$
$$\ge \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_{e^A}} S\, y^*_{fp}}{B_{e^A}} \qquad \text{(34c)}$$
$$\ge \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_{e^A}} S\, \frac{1}{RL}\, y^A_{fp}}{B_{e^A}} \qquad \text{(34d)}$$
$$= \frac{1}{RL}\, t^A. \qquad \text{(34e)}$$
In the above derivation, (34b) and (34e) are obtained from (22), (34c) holds because $e^*$ rather than $e^A$ is the bottleneck link in the fractional solution, and (34d) follows from the second fact (30). Therefore, the data retrieval time obtained by the approximation algorithm is at most $RL$ times the optimal value.

4.2.2 Multiple Applications

In the same manner, we can show that the approximation algorithm has the same approximation ratio of $RL$ for the case of multiple applications, but now the approximation ratio applies to the penalty plus one, i.e., $c + 1$, rather than to $c$:
$$c^A + 1 \le RL\,(OPT(\mathrm{MIP}) + 1), \qquad (35)$$
where $c^A$ is the objective value of an approximation solution, and $OPT(\mathrm{MIP})$ is the optimal value of the MIP instance. (35) means that the worst time ratio obtained by the approximation algorithm is at most $RL$ times that of the optimal solution. As $OPT(\mathrm{MIP}) \ge 0$, the RHS of (35) is at least $RL \ge 1$; thus (35) is valid if $c^A = 0$. Next we prove that it is valid if $c^A > 0$. Let $e^*$ denote the bottleneck link having the largest $\sum_{u \in U} \beta^u_e(Y)/\bar t_u$ in the optimal fractional solution, $e^A$ the counterpart in an approximation solution, and $c^*$ the optimal value of the LP relaxation; then we have
$$OPT(\mathrm{MIP}) \ge c^* \qquad \text{(36a)}$$
$$\ge \frac{\sum_{u \in U} \beta^u_{e^*}(Y^*)/\bar t_u}{B_{e^*}} - 1 \qquad \text{(36b)}$$
$$\ge \frac{\sum_{u \in U} \beta^u_{e^A}(Y^*)/\bar t_u}{B_{e^A}} - 1 \qquad \text{(36c)}$$
$$\ge \frac{1}{RL} \cdot \frac{\sum_{u \in U} \beta^u_{e^A}(Y^A)/\bar t_u}{B_{e^A}} - 1 \qquad \text{(36d)}$$
$$= \frac{1}{RL}\,(c^A + 1) - 1. \qquad \text{(36e)}$$
(36a) comes from the fact that the optimal value of the LP relaxation provides a lower bound for the optimal value of the MIP. (36b) is obtained from (26). (36c) holds because $e^*$ rather than $e^A$ is the bottleneck link in the optimal fractional solution. (36d) holds because $\beta^u_e(Y^*) \ge \frac{1}{RL}\,\beta^u_e(Y^A)$, obtained from (24) and the fact that $y^*_{fp} \ge \frac{1}{RL}\, y^A_{fp}$. (36e) follows from the definition of $c(Y)$ when $c^A > 0$.

4.3 Best Approximation Result

As analyzed previously, approximation results are upper bounded by the product of the approximation ratio $RL$ and the optimal value of the MIP instance. The lower this upper bound, the better the approximation results may be. Thus, we may obtain better approximation results by reducing the upper bound. We propose a preprocessing procedure to reduce the approximation ratio $RL$: for each request we randomly select a subset of nodes as source candidates ($R$ in total), and for each flow we randomly select a subset of paths as routing candidates ($L$ in total), as the inputs of the MIP formulation. As such, $RL$ is smaller than in the case without preprocessing. However, since data replicas and routing paths are then selected from subsets of nodes and paths rather than the whole sets, the optimal value of the resulting MIP instance may be worse than in the case without preprocessing. If only one node is selected as a candidate for each request, and only one path as a candidate for each flow, then the solution to the resulting MIP instance is trivial, comparable to the result of the naive random selection method. Thus, with the preprocessing, as $R$ or $L$ decreases, the approximation ratio decreases, but the optimal value of the resulting MIP instance may increase, as illustrated in Fig. 3, where $R$ varies and $L$ is fixed (the changes are similar when $L$ varies and $R$ is fixed). So there exists a pair of $R$ and $L$ for which the upper bound is lowest. With this motivation, we run the preprocessing procedure with various pairs of $R$ and $L$, and run the approximation algorithm on the resulting MIP instances. Then the pair of $R$ and $L$ leading to the best approximation result is selected. This is practical, since the algorithm runs in polynomial time.
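The preprocessing itself is a simple random restriction of the candidate sets before the MIP is formulated. A minimal sketch, under dictionary encodings of our own choosing:

```python
import random

def preprocess(replica_nodes, candidate_paths, R, L, seed=0):
    """Preprocessing of the best-approximation-result procedure: randomly
    keep at most R source candidates per data object and at most L candidate
    paths per flow, shrinking the ratio RL of the MIP built from them."""
    rng = random.Random(seed)
    reps = {k: rng.sample(v, min(R, len(v)))
            for k, v in replica_nodes.items()}
    paths = {f: rng.sample(p, min(L, len(p)))
             for f, p in candidate_paths.items()}
    return reps, paths
```

Running the approximation algorithm once per (R, L) pair and keeping the best result is what makes the trade-off between a smaller ratio and a worse restricted optimum concrete.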
Fig. 3. $OPT(\mathrm{MIP}_{RL})$ and the approximation ratio $RL$ may change as $R$ varies with $L$ fixed, where $\mathrm{MIP}_{RL}$ is an MIP instance formulated after preprocessing with parameters $R$ and $L$.

It is noted that some flows may not have more than one candidate path among the shortest paths; e.g., any two nodes in the same rack have only one shortest path between them. In this case, the preprocessing is unnecessary for those flows, and the value of $L$ is not affected, since it is determined by the largest set of candidate paths.

4.4 Discussion on Deployment

When deploying our scheduling algorithms in real systems, the data retrieval problem can be solved once an application has been distributed to computing nodes, under the bandwidth conditions at that time. Once the application starts running, its data retrieval is scheduled according to the precomputed results. During runtime, bandwidth conditions may change due to the finishing or starting of other applications. To

Fig. 4. A three-tier topology of a data center network (4 core switches, 8 aggregation switches, 32 ToR switches, 128 computing nodes).

Fig. 5. A VL2 topology of a data center network (4 core switches, 4 aggregation switches, 8 ToR switches, 16 computing nodes).

adapt to dynamic bandwidth conditions, the data retrieval problem can be re-solved periodically. The length of the period depends on the tradeoff between the computation overhead and how quickly the scheduling responds to changing bandwidth.

5 PERFORMANCE EVALUATION

In this section, we evaluate the performance extensively and demonstrate that our algorithm can obtain near-optimal solutions, given the availability of additional data replicas and abundant paths. We also show that in many cases even one additional replica can improve performance greatly.

5.1 Methods for Comparison

We compare against two methods: 1) the optimal algorithm, which solves the MIP exactly; 2) the existing method, which selects nodes and paths randomly, where a node is chosen from all available ones. In our simulations, both the MIP and the LP relaxation are solved by Gurobi [10] with default settings.

5.2 Simulation Setups

We first introduce the network topologies and parameters, and then discuss the data retrieval setups. Our simulation testbed has three types of data center topologies: 1) a fat-tree topology built from 8-port switches; 2) a three-tier topology (as shown in Fig. 4), in which every 4 neighboring hosts are connected to a ToR switch, every 4 neighboring ToR switches are connected to an aggregation switch, and all 8 aggregation switches are connected to each core switch (4 in total); 3) a VL2 topology [4] with 8-port aggregation and 4-port core switches, as shown in Fig. 5. Both the fat-tree topology and the three-tier topology have 128 computing nodes, while the VL2 topology has 16.

Fig. 6.
The performance of APX with different $R$ and $L$, where the dashed lines represent the results of the optimal algorithm (red) and the random method (black). APX with $R$ of 2 or 3 performs close to the optimal algorithm.

We set link bandwidths as follows: in the fat-tree and three-tier topologies each link is 1 Gbps; in the VL2 topology each server link is 1 Gbps, while each switch link is 10 Gbps, the same as in [4]. Since many different applications may share the network fabric in data centers, we simulate under two settings: full bandwidth, and partial bandwidth in the presence of background traffic. We generate background traffic by injecting flows between random pairs of computing nodes, each of which travels along a random path between them and consumes 100 Mbps. To ensure that each link retains at least 100 Mbps, we accept a background flow only if the currently available bandwidth of every link along its path is at least 200 Mbps; otherwise we reject it. The amount of remaining bandwidth depends on how many background flows are accepted; in our simulations, we inject 400 flows.

The inputs of a data retrieval problem include the locations of tasks and data objects, as well as the access relationships between them. We generate these inputs synthetically. We place data first, and each data object is stored with a replication factor of 3, as follows: 1) it is first placed on a randomly chosen node; 2) its first replica is stored on a distinct node in the same rack; 3) its second replica is stored in a remote rack. This rule is the same as in HDFS [2]. In real applications, most of the data is accessed locally, not affecting the data retrieval time; thus, in the simulations we only consider the non-local data requiring transfers. We assign tasks randomly to the nodes not holding their required data. There are two types of access relationships between tasks and data objects: 1) one-to-one, where each task accesses a unique data object; 2) many-to-one, where multiple tasks access a common data object.
Note that in both cases each task requires only one data object, because the other relationships (one-to-many and many-to-many) are special cases of these. In the many-to-one setting, when a node has several tasks accessing the same data object, we assume that the node initiates only one data request and the retrieved data can be used by all of those tasks. Both the amount of data and the number of tasks are varied in the simulations. For the one-to-one setting, the number of tasks equals the number of data objects, both varying from 100 to 1000. For the many-to-one setting, the number of data objects is set to 100, and the number of tasks is varied from 100 to 1000. All simulation results are averaged over 20 runs.
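The placement rule 1)-3) above can be sketched as follows (our own illustration; the rack-to-node mapping is an assumed encoding, not from the paper):

```python
import random

def place_replicas(racks, rng=None):
    """HDFS-style replica placement used in the simulations: the original
    block on a random node, the first replica on another node in the same
    rack, and the second replica on a node in a different (remote) rack."""
    rng = rng or random.Random(0)
    rack_ids = list(racks)
    r1 = rng.choice(rack_ids)
    original = rng.choice(racks[r1])
    first = rng.choice([n for n in racks[r1] if n != original])
    r2 = rng.choice([r for r in rack_ids if r != r1])
    second = rng.choice(racks[r2])
    return [original, first, second]
```

This keeps one replica rack-local (cheap to reach) while the remote replica survives a rack failure, which is exactly what gives the scheduler its replica diversity.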

Fig. 7. The reduction ratio of data retrieval time over for a fat-tree topology. (a) is for full bandwidth and one-to-one access relationship; (b) is for full bandwidth and many-to-one access relationship; (c) is for partial bandwidth and one-to-one access relationship; (d) is for partial bandwidth and many-to-one access relationship.

Fig. 8. The percentages of each link-type among the bottleneck links of . (a) is for one-to-one access relationship; (b) is for many-to-one access relationship. represents uplink, and represents downlink.

Fig. 9. The data retrieval time of an application for a fat-tree topology, using the methods discussed in Section 5.3.1. (a) is for full bandwidth and many-to-one access relationship; (b) is for partial bandwidth and one-to-one access relationship.

5.3 Single Application

We first evaluate the data retrieval time of a single application. It is obtained from the objective value for , while for the other algorithms it can be computed as in (22). We use the reduction ratio of data retrieval time over to evaluate the performance improvement of our algorithm, i.e., (t' − t)/t', where t is the data retrieval time of the algorithm being evaluated and t' is that of the baseline.

5.3.1 Fat-tree Topology

We first simulate for a fat-tree topology. We start by running simulations to find the R and L with which APX performs best. Since R_max is 3 and L_max is 16 here, we try 9 pairs, where R is chosen from {1, 2, 3} and L is chosen from {1, 4, 16}, both adjusted by the preprocessing of random selections discussed in Section 4.3. We simulate a scenario of 5 tasks and 1 data objects with full bandwidth and the many-to-one access relationship. The performance of the approximation algorithm (APX for short) in the 9 settings is shown in Fig. 6, where the results of and are represented by a red line and a black line respectively.
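The reduction-ratio metric used throughout this section can be written as a one-line helper (hypothetical names; `t_base` is the retrieval time of the baseline algorithm, `t` that of the algorithm being evaluated):

```python
def reduction_ratio(t_base, t):
    """Reduction ratio of data retrieval time over the baseline."""
    return (t_base - t) / t_base

# E.g., a baseline retrieval time of 10 s reduced to 6 s is a 40% reduction.
print(reduction_ratio(10.0, 6.0))  # 0.4
```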
It is observed that the APX algorithms with R being 2 or 3 perform close to . Thus we choose to evaluate APX with R = 2 and L = 1 ( for short), besides the one without the preprocessing ( for short). The results are shown in Fig. 7. It is observed that performs almost as well as , much better than , and a bit better than . In the topology with full bandwidth, the reduction ratio is around 3%–13% for the one-to-one setting, while for the many-to-one setting it increases significantly (20%–40%) as more tasks access each data object at the same time. This is because, in , when many tasks access the same set of data objects, some tasks may choose the same node to retrieve a common data object, resulting in heavy congestion near the selected node. The more the tasks, the worse the congestion, and the longer the retrieval time. To verify the locations of the congestion in , we calculate the percentages of each link-type among the bottleneck links (i.e., the links on which the data transmission time equals the data retrieval time). There are 6 link-types, belonging to 3 layers and 2 directions. The results are shown in Fig. 8. Fig. 8b demonstrates that for the many-to-one access relationship, as the number of tasks increases, the bottlenecks mainly consist of the uplinks in the bottom layer. In comparison, for the one-to-one access relationship, congestion may happen at any layer of links, as demonstrated in Fig. 8a. For the topology with partial bandwidth, the reduction ratios are significant, as shown in Fig. 7c and Fig. 7d, i.e., roughly around 40% for the one-to-one setting, and 50% for the many-to-one setting. Both ratios are higher than those in the case of full bandwidth, because ignores the different bandwidths, in addition to the path and replica diversities, which should have been considered in selecting nodes and paths, as our algorithm does. We next demonstrate that data retrieval time cannot be minimized when replica diversity is ignored. We simulate two methods.
One fully utilizes replica diversity but randomly chooses a path for each possible flow formed by each replica (-Rmax-L1 for short), and the other randomly chooses a replica for each request (ignoring replica diversity) but fully utilizes path diversity (-R1-Lmax for short). They are formulated into MIPs and solved optimally. We compare them to and, and show the results in Fig. 9. It is

observed that -Rmax-L1 performs as well as , but -R1-Lmax performs much worse than , even close to , in various simulation settings. Thus, it validates the motivation to exploit replica diversity. In addition, Fig. 9a illustrates that in the topology with full bandwidth, path diversity may not improve performance; Fig. 9b illustrates that in a topology with partial bandwidth, path diversity may improve performance, but not as much as replica diversity does.

Fig. 10. The reduction ratio of data retrieval time over , for a three-tier topology with partial bandwidth. (a) is for one-to-one access relationship; (b) is for many-to-one access relationship.

Fig. 11. The reduction ratio of data retrieval time over , for a VL2 topology with partial bandwidth. (a) is for one-to-one access relationship; (b) is for many-to-one access relationship.

5.3.2 Three-tier Topology

We also simulate for a three-tier topology. Here R_max is 3 and L_max is 4. We simulate and the APX with two replicas and one path randomly selected in the preprocessing (i.e., ), as above. We show the reduction ratios in Fig. 10. The results are for a topology with partial bandwidth; the results for the topology with full bandwidth are similar and are omitted due to limited space. It is demonstrated that both APX algorithms perform almost as well as , around 6%–17% better than in the one-to-one setting, and around 20%–32% better than in the many-to-one setting. The reduction ratios in both cases are lower than those for the fat-tree topology, because in the three-tier topology the difference made by replica selections or path selections is not as significant as in the fat-tree topology. Specifically, in the three-tier topology, congestion always happens on the links between the ToR layer and the aggregation layer due to oversubscribed bandwidths, since each of those links is shared by four nodes connected to its ToR switch. To mitigate congestion, we can select replicas such that the traffic on those congested links is balanced.
However, such mitigation is smaller than that for the fat-tree topology, because two replicas in the same rack share a common link between the ToR and aggregation layers, leading to the same traffic. In addition, we cannot mitigate congestion by selecting paths, because the data transfers below the aggregation layer have only one path to use.

5.3.3 VL2 Topology

For the VL2 topology as shown in Fig. 5, we simulate APX-R2-L1 and , where R_max is 3 and L_max is 16. We show the reduction ratios in Fig. 11. The results are for a topology with partial bandwidth; the results for the topology with full bandwidth are similar and are omitted. It is demonstrated that both APX algorithms perform almost as well as , around 25%–50% better than in the one-to-one setting, and around 30%–60% better than in the many-to-one setting. The results are similar to those in the fat-tree topology, since both topologies have richly connected links and full-bisection bandwidths.

Fig. 12. The performance of APX with different R and L, where the dashed lines represent the results of (red) and (black). The APX with R = 2 and L = 1 performs closest to .

5.4 Multiple Applications

In this section, we evaluate the performance of multiple applications. As in the case of a single application, we take and for comparison. The worst penalty among all applications is used for evaluation. It is directly obtained from the objective value for , while for APX and it can be computed as in (26).

5.4.1 Fat-tree Topology

We simulate two applications having separate data retrieval setups, each of which is generated as introduced in Section 5.2. We set different upper bounds on their data retrieval time. The optimal data retrieval time in the case of a single application (obtained from ) is used as a baseline. One upper bound is set to 1.2 times, and the other to 1.8 times, as much as the baseline. The settings are listed in Table 1, averaged over 2 runs.
We also start by running simulations to find the R and L with which APX performs best. 9 pairs of R and L are tried, where R is chosen from {1, 2, 3} and L is chosen from {1, 4, 16}. We simulate a scenario of two applications, each having 5 tasks and 1 data objects, with the many-to-one access relationship under full bandwidth. The performance of APX in the 9 settings is shown in Fig. 12, where the results of and are represented by a red line and a black line respectively. It is observed that the APX with R = 2 and L = 1 performs closest to . Thus we choose to evaluate APX with this setting (i.e., ), besides the one without the preprocessing (i.e., ).
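The upper-bound pairs listed in Table 1 follow the 1.2x/1.8x rule above; a minimal sketch (hypothetical function name; the paper's exact rounding is not specified):

```python
def bound_pair(t_opt):
    """Upper bounds on the data retrieval time of the two applications:
    1.2x and 1.8x the single-application optimal time t_opt (the baseline)."""
    return (1.2 * t_opt, 1.8 * t_opt)

# A baseline of 5 s yields the pair (6, 9), as in the first Table 1 entry.
print(bound_pair(5.0))
```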

TABLE 1
The settings of the upper bounds on data retrieval time (s) in the simulation of two applications, one pair per number of tasks.

topology    bandwidth  access       upper-bound pairs
fat-tree    partial    many-to-one  (6, 9) (7, 11) (9, 13) (13, 2) (16, 24) (17, 25) (19, 29) (22, 32) (23, 35) (26, 38)
fat-tree    partial    one-to-one   (5, 8) (8, 12) (12, 18) (15, 23) (16, 24) (23, 35) (23, 35) (26, 39) (29, 44) (31, 47)
fat-tree    full       many-to-one  (2, 4) (3, 5) (4, 6) (5, 8) (6, 9) (7, 1) (8, 11) (8, 12) (9, 13) (9, 14)
fat-tree    full       one-to-one   (2, 4) (4, 5) (4, 6) (5, 8) (6, 9) (7, 1) (8, 12) (9, 13) (9, 13) (11, 16)
three-tier  partial    many-to-one  (37, 56) (64, 96) (9, 135) (112, 169) (137, 26) (155, 233) (171, 256) (194, 291) (214, 321) (239, 359)
three-tier  partial    one-to-one   (38, 57) (62, 94) (92, 138) (112, 168) (136, 24) (157, 235) (181, 271) (27, 311) (228, 342) (241, 361)
VL2         partial    many-to-one  (4, 6) (6, 1) (8, 12) (8, 13) (1, 15) (11, 17) (13, 19) (15, 23) (16, 23) (15, 23)
VL2         partial    one-to-one   (4, 6) (7, 1) (8, 12) (1, 15) (1, 15) (14, 21) (15, 23) (19, 28) (2, 29) (22, 33)

Fig. 13. The worst penalty of two applications for a fat-tree topology. (a) is for full bandwidth and one-to-one access relationship; (b) is for full bandwidth and many-to-one access relationship; (c) is for partial bandwidth and one-to-one access relationship; (d) is for partial bandwidth and many-to-one access relationship.

Next we evaluate the performance for the two-application scenario. We show the results in Fig. 13. For full bandwidth, as shown in Fig. 13a and Fig. 13b, it is observed that APX-R2-L1 performs almost as good as , and is much worse than , with being in between. In the one-to-one setting, reduces the penalty by roughly 15 percentage points, and in the many-to-one setting, it reduces the penalty by 2–6 percentage points. APX-Rmax-Lmax is usually better than , and is comparable with in some settings with few tasks. For partial bandwidth, as shown in Fig. 13c and Fig.
13d, it is observed that the APX algorithms with both parameter settings perform close to , and significantly better than . In the one-to-one setting, both APXs reduce the penalty by 25–90 percentage points, and in the many-to-one setting, they reduce the penalty by 7–20 percentage points. We also demonstrate that for the case of multiple applications, the penalty cannot be minimized if we ignore replica diversity. We simulate -Rmax-L1 and -R1-Lmax (discussed in Section 5.3.1), and compare them to and . Recall that in -R1-Lmax, replica diversity is ignored, as the data replica is randomly selected. The results for the fat-tree topology with full bandwidth and the many-to-one access setting are shown in Fig. 14; the results for the other settings are similar and are omitted due to limited space. It is observed that -Rmax-L1 performs as well as , but -R1-Lmax performs much worse than , similar to . Thus, it validates our motivation to exploit replica diversity.

Fig. 14. The worst penalty of two applications for the fat-tree topology with full bandwidth and many-to-one access relationship, using the methods discussed in Section 5.3.1.

5.4.2 Three-tier Topology and VL2 Topology

We also simulate the case of two applications in a three-tier topology and a VL2 topology. The upper bounds on their data retrieval time are set as discussed previously, listed in Table 1. The simulation results for the three-tier topology are shown in Fig. 15, and those for the VL2 topology are shown in Fig. 16. We only show the results for partial bandwidth; the results for full bandwidth are similar and are omitted due to limited space. It is observed that both APX algorithms perform close to , much better than . For the three-tier topology, the APX algorithms reduce the penalty by 5–20 percentage points in the one-to-one setting, and by 2–5 percentage points in the many-to-one setting. For the VL2 topology, they reduce the penalty by percentage points in the one-to-one setting, and by percentage points in the many-to-one setting.
All of these results demonstrate that our algorithm is effective.

6 RELATED WORKS

We review the existing works related to the data retrieval problem, and categorize them into three groups: traffic scheduling (at three levels), replica selection, and joint replica selection and routing.

Packet-level Traffic Scheduling. Dixit et al. in [11] argued that packet-level traffic splitting, where packets of a

flow are sprayed through all available paths, would lead to a better load-balanced network and much higher throughput compared to ECMP. Tso et al. in [12] proposed to improve link utilization by implementing a penalizing exponential flow-splitting algorithm in the data center.

Fig. 15. The worst penalty of two applications for a three-tier topology with partial bandwidth. (a) is for one-to-one access relationship; (b) is for many-to-one access relationship.

Fig. 16. The worst penalty of two applications for a VL2 topology with partial bandwidth. (a) is for one-to-one access relationship; (b) is for many-to-one access relationship.

Flow-level Traffic Scheduling. Greenberg et al. in [4] proposed using per-flow Valiant Load Balancing to spread traffic uniformly across network paths. Benson et al. in [13] developed a technique, MicroTE, that leverages the existence of short-term predictable traffic to mitigate the impact of congestion due to the unpredictable traffic. Al-Fares et al. in [6] proposed a flow scheduling system, exploiting path diversity in the data center, to avoid path collisions. Cui et al. in [14] proposed Distributed Flow Scheduling (DiFS) to balance flows among different links and improve bandwidth utilization for data center networks. Alizadeh et al. in [15] proposed a very simple design that decouples flow scheduling from rate control, to provide near-optimal performance.

Job-level Traffic Scheduling. Chowdhury et al. in [7] proposed a global network management architecture and algorithms to improve data transfer time in cluster computing. They focused on the massive data transfer between successive processing stages, such as the shuffle between the map and reduce stages in MapReduce. Dogar et al. in [16] designed a decentralized task-aware scheduling system to reduce task completion time for data center applications, by grouping the flows of a task and scheduling them together.
Although these traffic scheduling methods can be used to schedule the flows in data retrievals, they do not optimize replica selection for the flows, which is our focus.

Replica Selection in Data Grids. AL-Mistarihi et al. in [17] researched the replica selection problem in a Grid environment, which decides which replica location is the best for Grid users. Their aim is to establish fairness among users in the selections. Rahman et al. in [18] proposed replica selection strategies to minimize access latency by selecting the best replica. These works do not consider the impact of route selection on data transfer time.

Joint Replica Selection and Routing. Valancius et al. in [19] designed a system that performs joint content and network routing for dynamic online services. The system controls both the selection of replicas and the routes between the clients and their associated replicas. They demonstrated the benefits of joint optimization. With similar goals, Narayana et al. in [20] proposed to coordinate the modular mapping and routing systems already owned by an OSP, to achieve the global performance and cost goals of a joint system. Xu et al. in [21] distributed the joint optimization for scale. These works for wide area networks are inapplicable to data center networks, because data centers have much different traffic and data access patterns.

7 CONCLUSION

In this paper, we investigate the data retrieval problem in DCNs, that is, to jointly select data replicas and paths for concurrent data transfers such that the data retrieval time is minimized (i.e., the throughput is maximized). We propose an approximation algorithm to solve the problem, with an approximation ratio of RL, where R is the replication factor of the data and L is the largest number of candidate paths. We also solve the data retrieval problem for the case of multiple applications, keeping fairness among them. The simulations demonstrate that our algorithm can obtain near-optimal performance with the best R and L.
ACKNOWLEDGMENTS

This work was supported by a grant from the Research Grants Council of Hong Kong [Project No. CityU ].

REFERENCES

[1] J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, Commun. ACM, vol. 51, no. 1, pp. 107-113, Jan. 2008.
[2] D. Borthakur, HDFS architecture guide, Hadoop Apache Project, p. 53, 2008.
[3] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu, BCube: A high performance, server-centric network architecture for modular data centers, in SIGCOMM, 2009.
[4] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta, VL2: A scalable and flexible data center network, in SIGCOMM, 2009.
[5] C. Hopps, Analysis of an equal-cost multi-path algorithm, RFC 2992, IETF, 2000.
[6] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat, Hedera: Dynamic flow scheduling for data center networks, in NSDI, 2010.
[7] M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, Managing data transfers in computer clusters with Orchestra, in SIGCOMM, 2011.
[8] J. L. Gross, J. Yellen, and P. Zhang, Handbook of Graph Theory, 2nd ed. Chapman & Hall/CRC, 2013.
[9] R. J. Vanderbei, Linear Programming: Foundations and Extensions, 1996.

[10] Gurobi Optimization, Inc., Gurobi optimizer reference manual, 2015. [Online]. Available:
[11] A. Dixit, P. Prakash, and R. R. Kompella, On the efficacy of fine-grained traffic splitting protocols in data center networks, in SIGCOMM, 2011.
[12] F. P. Tso and D. Pezaros, Improving data center network utilization using near-optimal traffic engineering, IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 6, June 2013.
[13] T. Benson, A. Anand, A. Akella, and M. Zhang, MicroTE: Fine grained traffic engineering for data centers, in Proceedings of the Seventh Conference on Emerging Networking Experiments and Technologies, 2011, pp. 8:1-8:12.
[14] W. Cui and C. Qian, DiFS: Distributed flow scheduling for adaptive routing in hierarchical data center networks, in Proceedings of the Tenth ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2014.
[15] M. Alizadeh, S. Yang, M. Sharif, S. Katti, N. McKeown, B. Prabhakar, and S. Shenker, pFabric: Minimal near-optimal datacenter transport, in SIGCOMM, 2013.
[16] F. R. Dogar, T. Karagiannis, H. Ballani, and A. Rowstron, Decentralized task-aware scheduling for data center networks, in SIGCOMM, 2014.
[17] H. H. E. AL-Mistarihi and C. H. Yong, On fairness, optimizing replica selection in data grids, IEEE Trans. Parallel Distrib. Syst., vol. 20, no. 8, 2009.
[18] R. M. Rahman, R. Alhajj, and K. Barker, Replica selection strategies in data grid, Journal of Parallel and Distributed Computing, vol. 68, no. 12, 2008.
[19] V. Valancius, B. Ravi, N. Feamster, and A. C. Snoeren, Quantifying the benefits of joint content and network routing, in SIGMETRICS, 2013.
[20] S. Narayana, W. Jiang, J. Rexford, and M. Chiang, Joint server selection and routing for geo-replicated services, in Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, 2013.
[21] H. Xu and B.
Li, Joint request mapping and response routing for geo-distributed cloud services, in INFOCOM, 2013.

Ruitao Xie received her PhD degree in Computer Science from City University of Hong Kong in 2014, and her BEng degree from Beijing University of Posts and Telecommunications in 2008. She is currently a senior research associate in the Department of Computer Science at City University of Hong Kong. Her research interests include cloud computing, distributed systems and wireless sensor networks.

Xiaohua Jia received his BSc (1984) and MEng (1987) from the University of Science and Technology of China, and his DSc (1991) in Information Science from the University of Tokyo. He is currently Chair Professor with the Department of Computer Science at City University of Hong Kong. His research interests include cloud computing and distributed systems, computer networks and mobile computing. Prof. Jia is an editor of IEEE Internet of Things, IEEE Trans. on Parallel and Distributed Systems (2006-2009), Wireless Networks, Journal of World Wide Web, Journal of Combinatorial Optimization, etc. He is the General Chair of ACM MobiHoc 2008, TPC Co-Chair of IEEE GlobeCom 2010 Ad Hoc and Sensor Networking Symposium, and Area Chair of IEEE INFOCOM 2010 and 2015. He is a Fellow of the IEEE.


More information

Xiaoqiao Meng, Vasileios Pappas, Li Zhang IBM T.J. Watson Research Center Presented by: Payman Khani

Xiaoqiao Meng, Vasileios Pappas, Li Zhang IBM T.J. Watson Research Center Presented by: Payman Khani Improving the Scalability of Data Center Networks with Traffic-aware Virtual Machine Placement Xiaoqiao Meng, Vasileios Pappas, Li Zhang IBM T.J. Watson Research Center Presented by: Payman Khani Overview:

More information

QUALITY OF SERVICE METRICS FOR DATA TRANSMISSION IN MESH TOPOLOGIES

QUALITY OF SERVICE METRICS FOR DATA TRANSMISSION IN MESH TOPOLOGIES QUALITY OF SERVICE METRICS FOR DATA TRANSMISSION IN MESH TOPOLOGIES SWATHI NANDURI * ZAHOOR-UL-HUQ * Master of Technology, Associate Professor, G. Pulla Reddy Engineering College, G. Pulla Reddy Engineering

More information

CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING

CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING CHAPTER 6 CROSS LAYER BASED MULTIPATH ROUTING FOR LOAD BALANCING 6.1 INTRODUCTION The technical challenges in WMNs are load balancing, optimal routing, fairness, network auto-configuration and mobility

More information

AIN: A Blueprint for an All-IP Data Center Network

AIN: A Blueprint for an All-IP Data Center Network AIN: A Blueprint for an All-IP Data Center Network Vasileios Pappas Hani Jamjoom Dan Williams IBM T. J. Watson Research Center, Yorktown Heights, NY Abstract With both Ethernet and IP powering Data Center

More information

Performance of networks containing both MaxNet and SumNet links

Performance of networks containing both MaxNet and SumNet links Performance of networks containing both MaxNet and SumNet links Lachlan L. H. Andrew and Bartek P. Wydrowski Abstract Both MaxNet and SumNet are distributed congestion control architectures suitable for

More information

Load Balanced Optical-Network-Unit (ONU) Placement Algorithm in Wireless-Optical Broadband Access Networks

Load Balanced Optical-Network-Unit (ONU) Placement Algorithm in Wireless-Optical Broadband Access Networks Load Balanced Optical-Network-Unit (ONU Placement Algorithm in Wireless-Optical Broadband Access Networks Bing Li, Yejun Liu, and Lei Guo Abstract With the broadband services increasing, such as video

More information

DESIGN AND DEVELOPMENT OF LOAD SHARING MULTIPATH ROUTING PROTCOL FOR MOBILE AD HOC NETWORKS

DESIGN AND DEVELOPMENT OF LOAD SHARING MULTIPATH ROUTING PROTCOL FOR MOBILE AD HOC NETWORKS DESIGN AND DEVELOPMENT OF LOAD SHARING MULTIPATH ROUTING PROTCOL FOR MOBILE AD HOC NETWORKS K.V. Narayanaswamy 1, C.H. Subbarao 2 1 Professor, Head Division of TLL, MSRUAS, Bangalore, INDIA, 2 Associate

More information

Hedera: Dynamic Flow Scheduling for Data Center Networks

Hedera: Dynamic Flow Scheduling for Data Center Networks Hedera: Dynamic Flow Scheduling for Data Center Networks Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan * Nelson Huang Amin Vahdat UC San Diego * Williams College - USENIX NSDI 2010 - Motivation!"#$%&'($)*

More information

On implementation of DCTCP on three tier and fat tree data center network topologies

On implementation of DCTCP on three tier and fat tree data center network topologies DOI 10.1186/s40064-016-2454-4 RESEARCH Open Access On implementation of DCTCP on three tier and fat tree data center network topologies Saima Zafar 1*, Abeer Bashir 1 and Shafique Ahmad Chaudhry 2 *Correspondence:

More information

Non-blocking Switching in the Cloud Computing Era

Non-blocking Switching in the Cloud Computing Era Non-blocking Switching in the Cloud Computing Era Contents 1 Foreword... 3 2 Networks Must Go With the Flow in the Cloud Computing Era... 3 3 Fat-tree Architecture Achieves a Non-blocking Data Center Network...

More information

Quality of Service Routing Network and Performance Evaluation*

Quality of Service Routing Network and Performance Evaluation* Quality of Service Routing Network and Performance Evaluation* Shen Lin, Cui Yong, Xu Ming-wei, and Xu Ke Department of Computer Science, Tsinghua University, Beijing, P.R.China, 100084 {shenlin, cy, xmw,

More information

Investigation and Comparison of MPLS QoS Solution and Differentiated Services QoS Solutions

Investigation and Comparison of MPLS QoS Solution and Differentiated Services QoS Solutions Investigation and Comparison of MPLS QoS Solution and Differentiated Services QoS Solutions Steve Gennaoui, Jianhua Yin, Samuel Swinton, and * Vasil Hnatyshin Department of Computer Science Rowan University

More information

Evaluating the Impact of Data Center Network Architectures on Application Performance in Virtualized Environments

Evaluating the Impact of Data Center Network Architectures on Application Performance in Virtualized Environments Evaluating the Impact of Data Center Network Architectures on Application Performance in Virtualized Environments Yueping Zhang NEC Labs America, Inc. Princeton, NJ 854, USA Email: yueping@nec-labs.com

More information

A Comparative Study of Data Center Network Architectures

A Comparative Study of Data Center Network Architectures A Comparative Study of Data Center Network Architectures Kashif Bilal Fargo, ND 58108, USA Kashif.Bilal@ndsu.edu Limin Zhang Fargo, ND 58108, USA limin.zhang@ndsu.edu Nasro Min-Allah COMSATS Institute

More information

Advanced Computer Networks. Scheduling

Advanced Computer Networks. Scheduling Oriana Riva, Department of Computer Science ETH Zürich Advanced Computer Networks 263-3501-00 Scheduling Patrick Stuedi, Qin Yin and Timothy Roscoe Spring Semester 2015 Outline Last time Load balancing

More information

Longer is Better? Exploiting Path Diversity in Data Centre Networks

Longer is Better? Exploiting Path Diversity in Data Centre Networks Longer is Better? Exploiting Path Diversity in Data Centre Networks Fung Po (Posco) Tso, Gregg Hamilton, Rene Weber, Colin S. Perkins and Dimitrios P. Pezaros University of Glasgow Cloud Data Centres Are

More information

Outline. VL2: A Scalable and Flexible Data Center Network. Problem. Introduction 11/26/2012

Outline. VL2: A Scalable and Flexible Data Center Network. Problem. Introduction 11/26/2012 VL2: A Scalable and Flexible Data Center Network 15744: Computer Networks, Fall 2012 Presented by Naveen Chekuri Outline Introduction Solution Approach Design Decisions Addressing and Routing Evaluation

More information

[Yawen comments]: The first author Ruoyan Liu is a visting student coming from our collaborator, A/Prof. Huaxi Gu s research group, Xidian

[Yawen comments]: The first author Ruoyan Liu is a visting student coming from our collaborator, A/Prof. Huaxi Gu s research group, Xidian [Yawen comments]: The first author Ruoyan Liu is a visting student coming from our collaborator, A/Prof. Huaxi Gu s research group, Xidian Universtity. He stays in the University of Otago for 2 mongths

More information

Demand-Aware Flow Allocation in Data Center Networks

Demand-Aware Flow Allocation in Data Center Networks Demand-Aware Flow Allocation in Data Center Networks Dmitriy Kuptsov Aalto University/HIIT Espoo, Finland dmitriy.kuptsov@hiit.fi Boris Nechaev Aalto University/HIIT Espoo, Finland boris.nechaev@hiit.fi

More information

A Cloud Data Center Optimization Approach Using Dynamic Data Interchanges

A Cloud Data Center Optimization Approach Using Dynamic Data Interchanges A Cloud Data Center Optimization Approach Using Dynamic Data Interchanges Efstratios Rappos Institute for Information and Communication Technologies, Haute Ecole d Ingénierie et de Geston du Canton de

More information

Shareability and Locality Aware Scheduling Algorithm in Hadoop for Mobile Cloud Computing

Shareability and Locality Aware Scheduling Algorithm in Hadoop for Mobile Cloud Computing Shareability and Locality Aware Scheduling Algorithm in Hadoop for Mobile Cloud Computing Hsin-Wen Wei 1,2, Che-Wei Hsu 2, Tin-Yu Wu 3, Wei-Tsong Lee 1 1 Department of Electrical Engineering, Tamkang University

More information

Multi-layer Structure of Data Center Based on Steiner Triple System

Multi-layer Structure of Data Center Based on Steiner Triple System Journal of Computational Information Systems 9: 11 (2013) 4371 4378 Available at http://www.jofcis.com Multi-layer Structure of Data Center Based on Steiner Triple System Jianfei ZHANG 1, Zhiyi FANG 1,

More information

A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems

A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Aysan Rasooli Department of Computing and Software McMaster University Hamilton, Canada Email: rasooa@mcmaster.ca Douglas G. Down

More information

Hadoop Technology for Flow Analysis of the Internet Traffic

Hadoop Technology for Flow Analysis of the Internet Traffic Hadoop Technology for Flow Analysis of the Internet Traffic Rakshitha Kiran P PG Scholar, Dept. of C.S, Shree Devi Institute of Technology, Mangalore, Karnataka, India ABSTRACT: Flow analysis of the internet

More information

Scafida: A Scale-Free Network Inspired Data Center Architecture

Scafida: A Scale-Free Network Inspired Data Center Architecture Scafida: A Scale-Free Network Inspired Data Center Architecture László Gyarmati, Tuan Anh Trinh Network Economics Group Department of Telecommunications and Media Informatics Budapest University of Technology

More information

Efficient Data Replication Scheme based on Hadoop Distributed File System

Efficient Data Replication Scheme based on Hadoop Distributed File System , pp. 177-186 http://dx.doi.org/10.14257/ijseia.2015.9.12.16 Efficient Data Replication Scheme based on Hadoop Distributed File System Jungha Lee 1, Jaehwa Chung 2 and Daewon Lee 3* 1 Division of Supercomputing,

More information

Experimental Framework for Mobile Cloud Computing System

Experimental Framework for Mobile Cloud Computing System Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 00 (2015) 000 000 www.elsevier.com/locate/procedia First International Workshop on Mobile Cloud Computing Systems, Management,

More information

Green Routing in Data Center Network: Modeling and Algorithm Design

Green Routing in Data Center Network: Modeling and Algorithm Design Green Routing in Data Center Network: Modeling and Algorithm Design Yunfei Shang, Dan Li, Mingwei Xu Tsinghua University Beijing China, {shangyunfei, lidan, xmw}@csnet1.cs.tsinghua.edu.cn ABSTRACT The

More information

Transport layer issues in ad hoc wireless networks Dmitrij Lagutin, dlagutin@cc.hut.fi

Transport layer issues in ad hoc wireless networks Dmitrij Lagutin, dlagutin@cc.hut.fi Transport layer issues in ad hoc wireless networks Dmitrij Lagutin, dlagutin@cc.hut.fi 1. Introduction Ad hoc wireless networks pose a big challenge for transport layer protocol and transport layer protocols

More information

Performance Analysis of AQM Schemes in Wired and Wireless Networks based on TCP flow

Performance Analysis of AQM Schemes in Wired and Wireless Networks based on TCP flow International Journal of Soft Computing and Engineering (IJSCE) Performance Analysis of AQM Schemes in Wired and Wireless Networks based on TCP flow Abdullah Al Masud, Hossain Md. Shamim, Amina Akhter

More information

VIRTUALIZATION [1] has been touted as a revolutionary

VIRTUALIZATION [1] has been touted as a revolutionary Supporting Seamless Virtual Machine Migration via Named Data Networking in Cloud Data Center Ruitao Xie, Yonggang Wen, Xiaohua Jia, Fellow, IEEE and Haiyong Xie 1 Abstract Virtual machine migration has

More information

Chapter 4. VoIP Metric based Traffic Engineering to Support the Service Quality over the Internet (Inter-domain IP network)

Chapter 4. VoIP Metric based Traffic Engineering to Support the Service Quality over the Internet (Inter-domain IP network) Chapter 4 VoIP Metric based Traffic Engineering to Support the Service Quality over the Internet (Inter-domain IP network) 4.1 Introduction Traffic Engineering can be defined as a task of mapping traffic

More information

Path Selection Methods for Localized Quality of Service Routing

Path Selection Methods for Localized Quality of Service Routing Path Selection Methods for Localized Quality of Service Routing Xin Yuan and Arif Saifee Department of Computer Science, Florida State University, Tallahassee, FL Abstract Localized Quality of Service

More information

Multi-service Load Balancing in a Heterogeneous Network with Vertical Handover

Multi-service Load Balancing in a Heterogeneous Network with Vertical Handover 1 Multi-service Load Balancing in a Heterogeneous Network with Vertical Handover Jie Xu, Member, IEEE, Yuming Jiang, Member, IEEE, and Andrew Perkis, Member, IEEE Abstract In this paper we investigate

More information

CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING

CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING Journal homepage: http://www.journalijar.com INTERNATIONAL JOURNAL OF ADVANCED RESEARCH RESEARCH ARTICLE CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING R.Kohila

More information

The QoS of the Edge Router based on DiffServ

The QoS of the Edge Router based on DiffServ The QoS of the Edge Router based on DiffServ Zhang Nan 1, Mao Pengxuan 1, Xiao Yang 1, Kiseon Kim 2 1 Institute of Information and Science, Beijing Jiaotong University, Beijing 100044, China 2 Dept. of

More information

Multipath TCP design, and application to data centers. Damon Wischik, Mark Handley, Costin Raiciu, Christopher Pluntke

Multipath TCP design, and application to data centers. Damon Wischik, Mark Handley, Costin Raiciu, Christopher Pluntke Multipath TCP design, and application to data centers Damon Wischik, Mark Handley, Costin Raiciu, Christopher Pluntke Packet switching pools circuits. Multipath pools links : it is Packet Switching 2.0.

More information

A Routing Metric for Load-Balancing in Wireless Mesh Networks

A Routing Metric for Load-Balancing in Wireless Mesh Networks A Routing Metric for Load-Balancing in Wireless Mesh Networks Liang Ma and Mieso K. Denko Department of Computing and Information Science University of Guelph, Guelph, Ontario, Canada, N1G 2W1 email: {lma02;mdenko}@uoguelph.ca

More information

Dynamic Congestion-Based Load Balanced Routing in Optical Burst-Switched Networks

Dynamic Congestion-Based Load Balanced Routing in Optical Burst-Switched Networks Dynamic Congestion-Based Load Balanced Routing in Optical Burst-Switched Networks Guru P.V. Thodime, Vinod M. Vokkarane, and Jason P. Jue The University of Texas at Dallas, Richardson, TX 75083-0688 vgt015000,

More information

Implementing Replication for Predictability within Apache Thrift

Implementing Replication for Predictability within Apache Thrift Implementing Replication for Predictability within Apache Thrift Jianwei Tu The Ohio State University tu.118@osu.edu ABSTRACT Interactive applications, such as search, social networking and retail, hosted

More information

Link-State Routing Can Achieve Optimal Traffic Engineering: From Entropy To IP

Link-State Routing Can Achieve Optimal Traffic Engineering: From Entropy To IP Link-State Routing Can Achieve Optimal Traffic Engineering: From Entropy To IP Dahai Xu, Ph.D. Florham Park AT&T Labs - Research Joint work with Mung Chiang and Jennifer Rexford (Princeton University)

More information

Falloc: Fair Network Bandwidth Allocation in IaaS Datacenters via a Bargaining Game Approach

Falloc: Fair Network Bandwidth Allocation in IaaS Datacenters via a Bargaining Game Approach Falloc: Fair Network Bandwidth Allocation in IaaS Datacenters via a Bargaining Game Approach Fangming Liu 1,2 In collaboration with Jian Guo 1,2, Haowen Tang 1,2, Yingnan Lian 1,2, Hai Jin 2 and John C.S.

More information

Adaptive Routing for Layer-2 Load Balancing in Data Center Networks

Adaptive Routing for Layer-2 Load Balancing in Data Center Networks Adaptive Routing for Layer-2 Load Balancing in Data Center Networks Renuga Kanagavelu, 2 Bu-Sung Lee, Francis, 3 Vasanth Ragavendran, Khin Mi Mi Aung,* Corresponding Author Data Storage Institute, Singapore.E-mail:

More information

Comparative Analysis of Congestion Control Algorithms Using ns-2

Comparative Analysis of Congestion Control Algorithms Using ns-2 www.ijcsi.org 89 Comparative Analysis of Congestion Control Algorithms Using ns-2 Sanjeev Patel 1, P. K. Gupta 2, Arjun Garg 3, Prateek Mehrotra 4 and Manish Chhabra 5 1 Deptt. of Computer Sc. & Engg,

More information

AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK

AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK Abstract AN OVERVIEW OF QUALITY OF SERVICE COMPUTER NETWORK Mrs. Amandeep Kaur, Assistant Professor, Department of Computer Application, Apeejay Institute of Management, Ramamandi, Jalandhar-144001, Punjab,

More information

Minimizing Disaster Backup Window for Geo-Distributed Multi-Datacenter Cloud Systems

Minimizing Disaster Backup Window for Geo-Distributed Multi-Datacenter Cloud Systems Minimizing Disaster Backup Window for Geo-Distributed Multi-Datacenter Cloud Systems Jingjing Yao, Ping Lu, Zuqing Zhu School of Information Science and Technology University of Science and Technology

More information

2013 IEEE 14th International Conference on High Performance Switching and Routing

2013 IEEE 14th International Conference on High Performance Switching and Routing 03 IEEE 4th International Conference on High Performance Switching and Routing Cost and Delay Tradeoff in Three-Stage Switch Architecture for Data Center Networks Shu Fu, Bin Wu, Xiaohong Jiang *, Achille

More information

Improving Flow Completion Time for Short Flows in Datacenter Networks

Improving Flow Completion Time for Short Flows in Datacenter Networks Improving Flow Completion Time for Short Flows in Datacenter Networks Sijo Joy, Amiya Nayak School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada {sjoy028, nayak}@uottawa.ca

More information

Network-Aware Scheduling of MapReduce Framework on Distributed Clusters over High Speed Networks

Network-Aware Scheduling of MapReduce Framework on Distributed Clusters over High Speed Networks Network-Aware Scheduling of MapReduce Framework on Distributed Clusters over High Speed Networks Praveenkumar Kondikoppa, Chui-Hui Chiu, Cheng Cui, Lin Xue and Seung-Jong Park Department of Computer Science,

More information

Influence of Load Balancing on Quality of Real Time Data Transmission*

Influence of Load Balancing on Quality of Real Time Data Transmission* SERBIAN JOURNAL OF ELECTRICAL ENGINEERING Vol. 6, No. 3, December 2009, 515-524 UDK: 004.738.2 Influence of Load Balancing on Quality of Real Time Data Transmission* Nataša Maksić 1,a, Petar Knežević 2,

More information

Generalized DCell Structure for Load-Balanced Data Center Networks

Generalized DCell Structure for Load-Balanced Data Center Networks Generalized DCell Structure for Load-Balanced Data Center Networks Markus Kliegl, Jason Lee,JunLi, Xinchao Zhang, Chuanxiong Guo,DavidRincón Swarthmore College, Duke University, Fudan University, Shanghai

More information

Topology Switching for Data Center Networks

Topology Switching for Data Center Networks Topology Switching for Data Center Networks Kevin C. Webb, Alex C. Snoeren, and Kenneth Yocum UC San Diego Abstract Emerging data-center network designs seek to provide physical topologies with high bandwidth,

More information

Fairness in Routing and Load Balancing

Fairness in Routing and Load Balancing Fairness in Routing and Load Balancing Jon Kleinberg Yuval Rabani Éva Tardos Abstract We consider the issue of network routing subject to explicit fairness conditions. The optimization of fairness criteria

More information

Residual Traffic Based Task Scheduling in Hadoop

Residual Traffic Based Task Scheduling in Hadoop Residual Traffic Based Task Scheduling in Hadoop Daichi Tanaka University of Tsukuba Graduate School of Library, Information and Media Studies Tsukuba, Japan e-mail: s1421593@u.tsukuba.ac.jp Masatoshi

More information

Quality of Service Routing for Supporting Multimedia Applications

Quality of Service Routing for Supporting Multimedia Applications Quality of Service Routing for Supporting Multimedia Applications Zheng Wang and Jon Crowcroft Department of Computer Science, University College London Gower Street, London WC1E 6BT, United Kingdom ABSTRACT

More information

3D On-chip Data Center Networks Using Circuit Switches and Packet Switches

3D On-chip Data Center Networks Using Circuit Switches and Packet Switches 3D On-chip Data Center Networks Using Circuit Switches and Packet Switches Takahide Ikeda Yuichi Ohsita, and Masayuki Murata Graduate School of Information Science and Technology, Osaka University Osaka,

More information

CIT 668: System Architecture

CIT 668: System Architecture CIT 668: System Architecture Data Centers II Topics 1. Containers 2. Data Center Network 3. Reliability 4. Economics Containers 1 Containers Data Center in a shipping container. 4-10X normal data center

More information

Paolo Costa costa@imperial.ac.uk

Paolo Costa costa@imperial.ac.uk joint work with Ant Rowstron, Austin Donnelly, and Greg O Shea (MSR Cambridge) Hussam Abu-Libdeh, Simon Schubert (Interns) Paolo Costa costa@imperial.ac.uk Paolo Costa CamCube - Rethinking the Data Center

More information

Router Scheduling Configuration Based on the Maximization of Benefit and Carried Best Effort Traffic

Router Scheduling Configuration Based on the Maximization of Benefit and Carried Best Effort Traffic Telecommunication Systems 24:2 4, 275 292, 2003 2003 Kluwer Academic Publishers. Manufactured in The Netherlands. Router Scheduling Configuration Based on the Maximization of Benefit and Carried Best Effort

More information

Operating Systems. Cloud Computing and Data Centers

Operating Systems. Cloud Computing and Data Centers Operating ystems Fall 2014 Cloud Computing and Data Centers Myungjin Lee myungjin.lee@ed.ac.uk 2 Google data center locations 3 A closer look 4 Inside data center 5 A datacenter has 50-250 containers A

More information

A Network Flow Approach in Cloud Computing

A Network Flow Approach in Cloud Computing 1 A Network Flow Approach in Cloud Computing Soheil Feizi, Amy Zhang, Muriel Médard RLE at MIT Abstract In this paper, by using network flow principles, we propose algorithms to address various challenges

More information

RDCM: Reliable Data Center Multicast

RDCM: Reliable Data Center Multicast This paper was presented as part of the Mini-Conference at IEEE INFOCOM 2011 RDCM: Reliable Data Center Multicast Dan Li, Mingwei Xu, Ming-chen Zhao, Chuanxiong Guo, Yongguang Zhang, Min-you Wu Tsinghua

More information

Towards Predictable Datacenter Networks

Towards Predictable Datacenter Networks Towards Predictable Datacenter Networks Hitesh Ballani, Paolo Costa, Thomas Karagiannis and Ant Rowstron Microsoft Research, Cambridge This talk is about Guaranteeing network performance for tenants in

More information