Data Transfer Scheduling for Maximizing Throughput of Big-Data Computing in Cloud Systems

Ruitao Xie and Xiaohua Jia, Fellow, IEEE

Abstract: Many big-data computing applications have been deployed in cloud platforms. These applications normally demand concurrent data transfers among computing nodes for parallel processing. It is important to find the best transfer scheduling leading to the least data retrieval time (the maximum throughput, in other words). However, the existing methods cannot achieve this, because they ignore link bandwidths and the diversity of data replicas and paths. In this paper, we aim to develop a max-throughput data transfer scheduling to minimize the data retrieval time of applications. Specifically, the problem is formulated into mixed integer programming, and an approximation algorithm is proposed, with its approximation ratio analyzed. Extensive simulations demonstrate that our algorithm can obtain near-optimal solutions.

Index Terms: data transfer scheduling, big-data computing, throughput maximization, data center.

1 INTRODUCTION

Many big-data computing applications have been deployed in cloud platforms, e.g., Amazon's Elastic Compute Cloud (EC2), Windows Azure, IBM Cloud, etc. In big-data computing under the MapReduce framework [1], tasks run on computing nodes in parallel. However, the data may not be stored on the same nodes that process them, for a variety of reasons: for instance, when those nodes have insufficient computing capacity, or when they are not preferred by other objectives (e.g., load balancing and energy saving). In Data Center Networks (DCN), data are usually replicated for redundancy and robustness; e.g., in HDFS, every data block has two replicas in addition to the original one [2]. Furthermore, from each data node, multiple paths are available for data transfer, sometimes all of which are shortest paths, due to path redundancy in DCN [3], [4].
It is important to select the best node and the best path to retrieve a non-local data object. This is the data retrieval problem. Different selections of nodes and paths may result in different data retrieval times. It is important to find the selection leading to the least data retrieval time, because a long data retrieval time of a computing task may result in a long completion time for the application to which this task belongs. However, the existing method of retrieving data, which is used in current HDFS systems and DCN, cannot obtain the least data retrieval time. In the existing method, when a non-local data object is required, a request is sent to any one of the closest replicas [2]. Then, the data is transferred from the selected node through any one of the shortest paths, determined by routing protocols like Equal-Cost Multi-Path routing (ECMP) [5] or per-flow Valiant Load Balancing (VLB) [4]. It is noted that many tasks retrieve data concurrently. This method may result in heavy congestion on some links, leading to long data retrieval time, because it ignores link bandwidths and the overlaps of selected nodes and paths. Some researchers proposed flow scheduling systems to avoid path collisions [6], [7]. However, they exploited only path diversity, not data replica diversity. To minimize the data retrieval time (i.e., to maximize the throughput) of an application consisting of concurrent tasks, we propose a max-throughput data transfer scheduling, utilizing both replica and path diversities. In our method, the problem is formulated into mixed integer programming, and an approximation algorithm is proposed, with its approximation ratio analyzed. We also solve the data retrieval problem for the case of multiple applications. Our simulations demonstrate that the approximation results are almost as good as the optimal solutions.

(Ruitao Xie and Xiaohua Jia are with the Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.)
We also show that the availability of a small number of additional data replicas can be greatly beneficial in many cases, regardless of path diversity. The rest of this paper is organized as follows. Section 2 presents the overview and the motivation of the data retrieval problem. Section 3 presents the problem formulations, for the scenarios of a single application and multiple applications separately. Section 4 presents an approximation algorithm and the analysis of its approximation ratio. The simulations and performance evaluations are presented in Section 5. Section 6 presents the related work on data retrieval in the cloud. Section 7 concludes the paper.

2 PROBLEM OVERVIEW AND MOTIVATION

In the cloud, although data is distributed among computing nodes, not all data can be obtained locally, so some nodes may need to retrieve data from distant nodes. A requested data object can be retrieved from one of the nodes where its replica is stored. When a node is chosen for data retrieval, a path from it to the requesting node needs to be specified for
data transfer. A reasonable choice would be the shortest path (in terms of the number of hops). However, there may exist multiple shortest paths, so one of them must be selected. It is noted that we select only one node and one path for each requested data object, because otherwise it would result in multipath TCP, which suffers from high jitter and is not widely deployed in DCN yet. A naive method is to select nodes and paths randomly, but it may result in heavy congestion on some links, leading to long data retrieval time, because it ignores link bandwidths and the overlaps of selected paths and nodes. For example, consider the case of a single application in the topology shown in Fig. 1: two data objects (a and b) are stored with a replication factor of 3, and each link has a bandwidth of 1 data object per second. Note that it takes at least 1 second to transfer a data object between any two nodes. Suppose that, at the same time, node v_a is about to retrieve data a and v_b is about to retrieve data b, both belonging to the same application. The naive method may result in a data retrieval time of 2 seconds, while the optimal solution takes only 1 second. The naive method performs worse because both data transfers pass through some common links, which become bottlenecks (as shown in Fig. 1(a)). The optimal solution takes the least time because, by selecting nodes and paths carefully, both data transfers pass through disjoint sets of links (as shown in Fig. 1(b)). This motivates us to investigate the optimal data retrieval method, where nodes and paths are selected carefully. Our objective is to select nodes and paths such that the data retrieval time of an application is minimized.

[Fig. 1. An example to motivate the optimal data retrieval method. Node v_a is retrieving data a (red dashed lines) and v_b is retrieving data b (green solid lines). In (a), both data transfers share common links, which carry more traffic and may lead to longer transmission time; in (b), they pass through disjoint sets of links, resulting in shorter data retrieval time.]

[Fig. 2. A typical topology of data center network: a fat-tree topology composed of 4-port switches (core switches, aggregation switches, ToR switches, computing nodes).]

The naive method also falls short when multiple applications are running in the cloud, where different applications may have different requirements, i.e., upper bounds on data retrieval time. As it may be impossible to satisfy all requirements, we minimize the penalties of applications. Thus our problem is to select nodes and paths for each required data object such that the penalties of applications are minimized.

3 PROBLEM FORMULATION

In this section, we formulate the data retrieval problem. We start with the case of a single application, and deal with multiple applications later.

3.1 Single Application

A Data Center Network (DCN) consists of switches and computing nodes, as illustrated in Fig. 2. The DCN is represented as a graph with nodes V and edges E, i.e., G = (V, E), where each edge represents a bidirectional link. This is a typical network configuration in DCN. V consists of both computing nodes V_C and switch nodes V_X, that is, V = V_C \cup V_X. Let B_e denote the bandwidth of link e. Suppose an application processes a set of data objects D = {d_1, d_2, ..., d_k, ..., d_m}, which have been stored in computing nodes. For simplicity, we assume that all data objects have the same size S. A data object is replicated on multiple nodes for redundancy. Let V_{Ck} \subseteq V_C denote the set of nodes that store data object d_k. Let A_{jk} denote whether v_j needs data object d_k to run the tasks assigned to it. If v_j requires d_k (i.e., A_{jk} = 1), we have to select a node v_i \in V_{Ck} from which d_k can be retrieved. Each selection forms a flow f^k_{ij}, which represents the data transfer of d_k from v_i to v_j. The set of possible flows of transferring data d_k to node v_j is denoted by F_{jk}.
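As a concrete illustration, the construction of the candidate flow sets F_jk can be sketched as follows. This is a minimal sketch, not code from the paper; the function name and data structures (an access matrix for A_jk and a replica map for V_Ck) are illustrative:

```python
def build_flows(A, replicas):
    """Enumerate candidate flow sets F_jk.

    A[j][k] is 1 if node v_j needs data object d_k (the matrix A_jk);
    replicas[k] lists the nodes V_Ck holding a replica of d_k.
    Returns a dict mapping each request (j, k) to its candidate
    flows f^k_ij, one per replica-holding node v_i.
    """
    flow_sets = {}
    for j, row in enumerate(A):
        for k, needed in enumerate(row):
            if needed:
                flow_sets[(j, k)] = [(i, j, k) for i in replicas[k]]
    return flow_sets

# Two requests: v_0 needs d_0 (replicas on nodes 2 and 3),
# v_1 needs d_1 (replicas on nodes 3 and 4).
F = build_flows([[1, 0], [0, 1]], {0: [2, 3], 1: [3, 4]})
```

Each request then contributes exactly one flow to the schedule, chosen among its candidates, which is the one-flow constraint formalized next.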
Hence, the set of all flows is

F = \bigcup_{(j,k): A_{jk} = 1} F_{jk}.  (1)

Because one and only one node is selected for each retrieval, exactly one flow in F_{jk} can be used. Let binary variable x_f denote whether or not flow f is used; then

\sum_{f \in F_{jk}} x_f = 1, \quad \forall (j,k) \text{ where } A_{jk} = 1.  (2)

After selecting a flow, we have to find a path for it. To shorten the data retrieval time, we use shortest paths. Let P(f^k_{ij}) denote the set of paths that can be used to actuate f^k_{ij}, which are all shortest paths from v_i to v_j. Let binary variable y_{fp}
denote whether path p is selected to actuate flow f, where p \in P(f); then

\sum_{p \in P(f)} y_{fp} = x_f, \quad \forall f \in F.  (3)

Only one path will be selected if flow f is used (i.e., x_f = 1), and none otherwise (i.e., x_f = 0). Since all flows are transferred concurrently, the data retrieval time of an application (which is the total time to complete all data transfers) is dominated by the longest data transfer time among all flows. Let t_f denote the data transfer time of flow f; then the data retrieval time t is computed as

t = \max\{t_f, f \in F\}.  (4)

Let r_f denote the sending rate of flow f; then the data transfer time of flow f equals its transmission delay:

t_f = S / r_f.  (5)

The other delays, such as processing delay, propagation delay and queueing delay, are all negligible, because the flows transferring data objects are large in size, e.g., 64 MB in HDFS. Ideally, the least data retrieval time could be obtained simply by setting the sending rates of all flows to the maximum possible values. However, as multiple flows may pass through the same link, which has limited bandwidth, the sending rates of these flows may not be able to reach the maximum values simultaneously. Suppose F_e is the set of all flows passing through link e; then the aggregate data rate on link e (i.e., \sum_{f \in F_e} r_f) is bounded by its bandwidth:

\sum_{f \in F_e} r_f \le B_e.  (6)

Let P_e denote the set of all paths passing through link e. A flow f passes through link e if its selected path passes through link e, i.e., \exists p \in P(f) \cap P_e such that y_{fp} = 1. Thus,

\sum_{f \in F_e} r_f = \sum_{f \in F} \sum_{p \in P(f) \cap P_e} r_f \, y_{fp} \le B_e.  (7)

Replacing r_f in (7) with S / t_f, we get

\sum_{f \in F} \sum_{p \in P(f) \cap P_e} \frac{S}{t_f} \, y_{fp} \le B_e.  (8)

Note that this constraint is not linear; thus, together with t_f \le t, we transform it into a linear one:

\sum_{f \in F} \sum_{p \in P(f) \cap P_e} \frac{S}{t} \, y_{fp} \le B_e.  (9)

This new constraint means that the amount of traffic passing through each link (i.e., \sum_{f \in F} \sum_{p \in P(f) \cap P_e} S \, y_{fp}) is bounded by the maximum amount of data that can be transmitted within the data retrieval time (i.e., B_e t).
In other words, the data retrieval time is the maximum data transmission time over all links.

Theorem 1. Constraints (8) and (9) are equivalent.

Proof. To prove the equivalence, we have to show that any t_f feasible to (8) and t feasible to (4) are also feasible to (9), and vice versa. First, since t \ge t_f for all f \in F, if (8) is satisfied, then (9) is also satisfied. Second, for any t feasible to (9), we can simply set all t_f to t; then both (4) and (8) are satisfied.

To sum up, the data retrieval problem is to select nodes and paths for all requests such that the data retrieval time is minimized. The data retrieval time is affected by those selections through the resulting amount of traffic on each link. The problem is formulated into an MIP as follows:

min t  (10a)
s.t. \sum_{f \in F_{jk}} x_f = 1, \forall (j,k) where A_{jk} = 1,  (10b)
\sum_{p \in P(f)} y_{fp} = x_f, \forall f \in F,  (10c)
\sum_{f \in F} \sum_{p \in P(f) \cap P_e} S \, y_{fp} \le B_e \, t, \forall e \in E,  (10d)
x_f \in \{0, 1\}, \forall f \in F,  (10e)
y_{fp} \in \{0, 1\}, \forall f \in F, p \in P(f),  (10f)
t \ge 0.  (10g)

3.2 Multiple Applications

In the cloud, multiple applications run on shared resources simultaneously to utilize resources more efficiently. To select nodes and paths for the requests of each application, a simple approach is to treat all applications as a single one and to solve the problem using the above model. This is equivalent to minimizing the maximum data retrieval time among all applications. However, different applications may have different requirements on their data retrieval time, and this naive approach ignores the difference in requirements. Thus, instead of minimizing the data retrieval time, we minimize a penalty. Given a set of applications U, suppose application u \in U has an upper bound \hat{t}_u on its data retrieval time t_u. The penalty c_u is defined as

c_u = \max\left\{ \frac{t_u - \hat{t}_u}{\hat{t}_u}, 0 \right\}.  (11)

A penalty is incurred if t_u exceeds its threshold, and no penalty otherwise.
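The penalty of Eq. (11) is straightforward to compute; the following is a minimal sketch (the function names and argument order are illustrative, not from the paper):

```python
def penalty(t_u, bound_u):
    """Penalty c_u of Eq. (11): the relative excess of the data
    retrieval time t_u over the application's upper bound,
    and zero when the bound is met."""
    return max((t_u - bound_u) / bound_u, 0.0)

def max_penalty(times, bounds):
    """The maximum penalty across applications, used below to
    maintain fairness among them."""
    return max(penalty(t, b) for t, b in zip(times, bounds))
```

For example, finishing in 3 s against a 2 s bound gives a penalty of 0.5, while finishing within the bound gives 0.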
As in the single-application model, t_u of application u is dominated by the longest data transfer time among all its flows, that is,

t_u = \max\{t_f, f \in F_u\},  (12)

where F_u is the set of possible flows of application u. To maintain fairness among applications, we minimize the maximum penalty, denoted by c:

c = \max\{c_u, u \in U\}.  (13)

Thus, our problem is to select nodes and paths for the requests of each application such that c is minimized. As in the single-application model, the selections affect which flows pass through each link and the resulting aggregate data rate, restricted by the link bandwidth. But here the data rate is aggregated over the flows in
all applications rather than a single one. Let r^u_e denote the aggregate data rate of application u on link e; then

\sum_{u \in U} r^u_e \le B_e.  (14)

Following the same rule as (7), r^u_e is computed as

r^u_e = \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} r_f \, y_{fp}.  (15)

Recall that r_f = S / t_f; combining the above two constraints, we obtain

\sum_{u \in U} \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} \frac{S}{t_f} \, y_{fp} \le B_e.  (16)

Note that this constraint is not linear; thus, together with (11), (12) and (13), we transform it into a linear one:

\sum_{u \in U} \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} \frac{S}{(c+1)\,\hat{t}_u} \, y_{fp} \le B_e.  (17)

Theorem 2. Constraints (16) and (17) are equivalent.

Proof. To prove the equivalence, we have to show that any t_f and c feasible to the set of constraints (11), (12), (13) and (16) are also feasible to (17), and vice versa. First, for a flow f of application u, i.e., f \in F_u, its data transfer time t_f must satisfy

t_f \le t_u \le (c_u + 1)\,\hat{t}_u \le (c + 1)\,\hat{t}_u.  (18)

In the above deduction, the first inequality is obtained from (12), the second from (11), and the last from (13). Thus, if all t_f satisfy constraint (16), then c satisfies constraint (17). Second, for any maximum penalty c satisfying (17), we can build a set of t_f satisfying (16) by setting each t_f to the maximum possible value (c + 1)\,\hat{t}_u, where f \in F_u, as follows:

t_f = t_u = (c_u + 1)\,\hat{t}_u = (c + 1)\,\hat{t}_u.  (19)

That is, all flows of an application have the same data transfer time, proportional to the upper bound of the application, and all applications have the same penalty. All these values satisfy (11), (12), (13) and (16).

Due to the equivalence proved above, (17) carries the same meaning as (14), namely that the aggregate data rate of all applications on each link is limited by its bandwidth. The difference is that in (17) we set the data transfer time of each flow to the maximum possible value, i.e., (c + 1)\,\hat{t}_u.
Besides the bandwidth constraints, as in the case of a single application, exactly one flow must be used for each request, and exactly one path must be selected for each used flow. To sum up, for the case of multiple applications, the data retrieval problem is to select nodes and paths for the requests of each application such that the maximum penalty among all applications is minimized. It is formulated into an MIP as follows:

min c  (20a)
s.t. \sum_{f \in F_{jk}} x_f = 1, \forall (j,k) where A_{jk} = 1,  (20b)
\sum_{p \in P(f)} y_{fp} = x_f, \forall f \in F,  (20c)
\sum_{u \in U} \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} \frac{S}{\hat{t}_u} \, y_{fp} \le B_e \, (c + 1), \forall e \in E,  (20d)
x_f \in \{0, 1\}, \forall f \in F,  (20e)
y_{fp} \in \{0, 1\}, \forall f \in F, p \in P(f),  (20f)
c \ge 0.  (20g)

3.3 Discussion on Complexity

The data retrieval problems for both cases are NP-hard, because even when the source nodes are determined, the remaining problem of selecting paths is NP-hard. When a source node is determined for each request, a set of commodities is formed. Here we call a triple consisting of a source, a destination, and a demand (i.e., the amount of data to be routed) a commodity. For the case of a single application, given the set of commodities and a network, our problem is to compute the maximum value 1/t for which there is a feasible multicommodity flow in the network with all demands multiplied by 1/t, which is a concurrent flow problem [8]. Since we have the additional restriction that each commodity must be routed on a single path, the problem is an unsplittable concurrent flow problem, which is NP-hard [8]. The same holds for the case of multiple applications.

4 APPROXIMATION ALGORITHM

We propose an approximation algorithm to solve the data retrieval problem.

4.1 Max-Throughput Algorithm

Given a data retrieval problem and its MIP formulation, our algorithm has three major steps: 1) solve its LP relaxation; 2) construct an integral solution from the relaxation solution using a rounding procedure; 3) analytically compute the data sending rate of each flow for scheduling.
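Steps 2 and 3 can be sketched as follows for the single-application case, assuming the fractional LP solution has already been computed by an external solver. This is a minimal illustrative sketch, not the paper's implementation; all data-structure names are assumptions. The rounding keeps, for each request, the flow with the largest fractional value, then, for each chosen flow, the path with the largest fractional value; the common sending rate then follows from the bottleneck link:

```python
S = 64.0  # uniform data object size, e.g. 64 MB as in HDFS

def round_and_rate(x_star, y_star, path_links, bandwidth):
    """Round a fractional LP solution and compute the sending rate.

    x_star: {request: {flow: fractional x*_f}} per request,
    y_star: {flow: {path: fractional y*_fp}} per flow,
    path_links: {path: iterable of links}, bandwidth: {link: capacity}.
    Returns ({request: (flow, path)}, common rate S / t(Y)).
    """
    chosen = {}
    for request, candidates in x_star.items():
        f = max(candidates, key=candidates.get)    # largest x*_f
        p = max(y_star[f], key=y_star[f].get)      # largest y*_fp
        chosen[request] = (f, p)
    # t(Y): the largest data transmission time among all links
    load = {}
    for f, p in chosen.values():
        for e in path_links[p]:
            load[e] = load.get(e, 0.0) + S
    t = max(load[e] / bandwidth[e] for e in load)
    return chosen, S / t
```

Setting every flow to the common rate S / t(Y), i.e., the rate of the slowest flow, keeps all link bandwidth constraints satisfied.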
The LP relaxation can be obtained by relaxing the binary variables x_f and y_{fp}; its optimal solution is denoted by x^*_f and y^*_{fp}. The rounding procedure constructs an integral solution x^A_f and y^A_{fp} from the fractional solution x^*_f and y^*_{fp}, changing the objective value as little as possible while keeping two sets of constraints satisfied: 1) the one-flow constraints, i.e., exactly one flow is used for each request ((10b) or (20b)); 2) the one-path constraints, i.e., exactly one path is used for each flow ((10c) or (20c)). We first select the flow with the largest x^*_f for each request to construct x^A_f satisfying the one-flow constraints. Then we select the path with the largest y^*_{fp} for each flow in use to construct y^A_{fp} satisfying the one-path constraints. With nodes and paths determined, the objective value of a data retrieval problem can be derived analytically, and so
can the data sending rate of each flow. For the case of a single application, with the variables x_f and y_{fp} determined, the MIP becomes

min t  (21a)
s.t. \sum_{f \in F} \sum_{p \in P(f) \cap P_e} S \, y_{fp} \le B_e \, t, \forall e \in E,  (21b)
t \ge 0.  (21c)

Let Y denote the set of y_{fp}; then we have

t(Y) = \max\left\{ \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_e} S \, y_{fp}}{B_e}, \; e \in E \right\},  (22)

that is, the largest data transmission time among all links. As t_f \le t(Y), we have r_f = S / t_f \ge S / t(Y). To ensure the bandwidth constraints, the data sending rate of each flow can be set to the same value as that of the slowest flow, that is,

r_f = \frac{S}{t(Y)}.  (23)

For the case of multiple applications, with the variables x_f and y_{fp} determined, the amount of traffic of application u on link e (denoted by \beta^u_e(Y)) is also determined:

\beta^u_e(Y) = \sum_{f \in F_u} \sum_{p \in P(f) \cap P_e} S \, y_{fp}.  (24)

The MIP becomes

min c  (25a)
s.t. \sum_{u \in U} \beta^u_e(Y) / \hat{t}_u \le B_e \, (c + 1), \forall e \in E,  (25b)
c \ge 0.  (25c)

Then we have

c(Y) = \max\left\{ 0, \; \max\left\{ \frac{\sum_{u \in U} \beta^u_e(Y) / \hat{t}_u}{B_e}, \; e \in E \right\} - 1 \right\}.  (26)

To ensure the bandwidth constraints, the data sending rate of each flow can be set via t_f = (c(Y) + 1)\,\hat{t}_u, where f \in F_u, following the analysis in Theorem 2.

Our algorithm has polynomial complexity. First, it takes polynomial time to solve the LP relaxation in the first step, since a data retrieval problem has a polynomial number of variables and constraints in its LP formulation, and an LP can be solved in polynomial time [9]. Second, the rounding procedure in the second step takes polynomial time, since each x^* is compared once, as is each y^*. Finally, the last analytical step takes O(|E|) time to compute (22) or (26).

4.2 Analysis of the Approximation Ratio

Next, we analyze the approximation ratio of the above algorithm.

4.2.1 Single Application

The above approximation algorithm has an approximation ratio of RL, where R is the replication factor (the number of replicas) of the data, and L is the largest number of candidate paths that can be used by a flow.
Let t^A denote the objective value of an approximation solution, and OPT(MIP) denote the optimal value of an MIP instance; then we have

t^A \le RL \cdot OPT(MIP).  (27)

In other words,

OPT(MIP) \ge \frac{1}{RL} \, t^A.  (28)

The derivation of the approximation ratio is based on two facts: 1) the optimal value of the LP relaxation (denoted by t^*) provides a lower bound on the optimal value of the MIP, that is,

OPT(MIP) \ge t^*,  (29)

since the solution space of the LP relaxation is enlarged by the relaxation; 2) the optimal fractional solution y^*_{fp} and an approximation solution y^A_{fp} satisfy

y^*_{fp} \ge \frac{1}{RL} \, y^A_{fp}.  (30)

This can be derived from the process of building the approximation solution. In constructing y^A_{fp} for flow f, we have

y^*_{fp} \ge \frac{1}{|P(f)|} \, x^*_f \, y^A_{fp}, \quad \forall p \in P(f).  (31)

For y^A_{fp} = 0, (31) is trivially valid; for y^A_{fp} = 1, y^*_{fp} \ge \frac{1}{|P(f)|} x^*_f must hold, since otherwise the one-path constraints would be contradicted. This is because for flow f, the path p with the largest y^*_{fp} is selected, so if the y^*_{fp} of the selected path (i.e., with y^A_{fp} = 1) were less than \frac{1}{|P(f)|} x^*_f, then the y^*_{fp} of every unselected path would also be less than \frac{1}{|P(f)|} x^*_f, and finally \sum_{p \in P(f)} y^*_{fp} < \sum_{p \in P(f)} \frac{1}{|P(f)|} x^*_f = x^*_f, which contradicts the one-path constraint. Furthermore, in constructing x^A_f, we have that for a selected flow f (i.e., x^A_f = 1),

x^*_f \ge \frac{1}{R}.  (32)

Otherwise, the one-flow constraints would be contradicted. This is because for a request, the flow f with the largest x^*_f is selected, so if the x^*_f of the selected flow were less than 1/R, then the x^*_f of every unselected flow would also be less than 1/R, and finally \sum_{f \in F_{jk}} x^*_f < \sum_{f \in F_{jk}} \frac{1}{R} = 1, which contradicts the one-flow constraint. Recall that if y^A_{fp} is 1, then x^A_f must be 1, so when y^A_{fp} is 1, (32) must be satisfied. Thus, we have

y^*_{fp} \ge \frac{1}{|P(f)|} \, x^*_f \, y^A_{fp} \ge \frac{1}{|P(f)|} \frac{1}{R} \, y^A_{fp} \ge \frac{1}{RL} \, y^A_{fp}.  (33)

Based on the above two facts, we can derive the approximation ratio. Let e^* denote the bottleneck link having the maximum data transmission time in the optimal fractional
solution, and e^A denote its counterpart in an approximation solution; then we have

OPT(MIP) \ge t^*  (34a)
= \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_{e^*}} S \, y^*_{fp}}{B_{e^*}}  (34b)
\ge \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_{e^A}} S \, y^*_{fp}}{B_{e^A}}  (34c)
\ge \frac{\sum_{f \in F} \sum_{p \in P(f) \cap P_{e^A}} S \, \frac{1}{RL} \, y^A_{fp}}{B_{e^A}}  (34d)
= \frac{1}{RL} \, t^A.  (34e)

In the above derivation, (34b) and (34e) are obtained from (22), (34c) holds because e^* rather than e^A is the bottleneck link in the fractional solution, and (34d) follows from the second fact (30). Therefore, the data retrieval time obtained by the approximation algorithm is at most RL times the optimal value.

4.2.2 Multiple Applications

In the same manner, we can show that the approximation algorithm has the same approximation ratio of RL for the case of multiple applications, except that the ratio now applies to the penalty plus 1, i.e., c + 1, rather than to c itself:

c^A + 1 \le RL \, (OPT(MIP) + 1),  (35)

where c^A is the objective value of an approximation solution, and OPT(MIP) is the optimal value of the MIP instance. Inequality (35) means that the worst time ratio obtained by the approximation algorithm is at most RL times the optimal value. As OPT(MIP) \ge 0, the RHS of (35) is at least RL \ge 1; thus (35) is valid if c^A = 0. Next we prove that it is valid if c^A > 0. Let e^* denote the bottleneck link having the largest \sum_{u \in U} \beta^u_e(Y^*) / \hat{t}_u in the optimal fractional solution, let e^A denote its counterpart in an approximation solution, and let c^* denote the optimal value of the LP relaxation; then we have

OPT(MIP) \ge c^*  (36a)
\ge \frac{\sum_{u \in U} \beta^u_{e^*}(Y^*) / \hat{t}_u}{B_{e^*}} - 1  (36b)
\ge \frac{\sum_{u \in U} \beta^u_{e^A}(Y^*) / \hat{t}_u}{B_{e^A}} - 1  (36c)
\ge \frac{\sum_{u \in U} \frac{1}{RL} \, \beta^u_{e^A}(Y^A) / \hat{t}_u}{B_{e^A}} - 1  (36d)
= \frac{1}{RL} \, (c^A + 1) - 1.  (36e)

Here, (36a) follows from the fact that the optimal value of the LP relaxation provides a lower bound on the optimal value of the MIP; (36b) is obtained from (26); (36c) holds because e^* rather than e^A is the bottleneck link in the optimal fractional solution; (36d) holds because \beta^u_e(Y^*) \ge \frac{1}{RL} \, \beta^u_e(Y^A), obtained from (24) and the fact that y^*_{fp} \ge \frac{1}{RL} \, y^A_{fp}; and (36e) follows from the definition (26), given that c^A > 0.
4.3 Best Approximation Result

As analyzed previously, the approximation results are upper bounded by the approximation ratio RL times the optimal value of an MIP instance. The lower this upper bound, the better the approximation results may be. Thus, we may obtain better approximation results by reducing the upper bound. We propose a preprocessing procedure to reduce the approximation ratio RL: for each request we randomly select a subset of nodes as source candidates (R in total), and for each flow we randomly select a subset of paths as routing candidates (L in total), as the inputs of an MIP formulation. As such, RL is smaller than its value in the case without preprocessing. However, since data replicas and routing paths are then selected from a subset of nodes and paths rather than from the whole set, the optimal value of the resulting MIP instance may be worse than that in the case without preprocessing. If only one node is selected as candidate for each request and only one path as candidate for each flow, then the solution to the resulting MIP instance is trivial, comparable to the result of the naive random selection method. Thus, with the preprocessing, as R or L decreases, the approximation ratio decreases, but the optimal value of the resulting MIP instance may increase, as illustrated in Fig. 3, where R varies and L is fixed (the changes are similar when L varies and R is fixed). So there exists a pair of R and L such that the upper bound is lowest. With this motivation, we run the preprocessing procedure with various pairs of R and L, and run the approximation algorithm on the resulting MIP instances. Then the pair of R and L leading to the best approximation result is selected. This is practical, since the algorithm runs in polynomial time.

[Fig. 3. OPT(MIP_RL) and the approximation ratio RL may change as R varies, with L fixed, where MIP_RL is an MIP instance formulated after preprocessing with parameters R and L.]

It is noted that some flows may not have more than one candidate path, since the candidates are all shortest paths; e.g., any two nodes in the same rack have only one shortest path between them. In this case, the preprocessing is unnecessary for those flows, and the value of L is not affected, since it is determined by the largest set of candidate paths.

4.4 Discussion on Deployment

When deploying our scheduling algorithm in real systems, the data retrieval problem can be solved once an application has been distributed to computing nodes, under the bandwidth conditions at that time. Once the application starts running, its data retrieval is scheduled according to the precomputed results. During runtime, bandwidth conditions may change due to the finishing or starting of other applications. To adapt to dynamic bandwidth conditions, the data retrieval problem can be re-solved periodically. The length of the period depends on the tradeoff between the computation overhead and how quickly the scheduling responds to changing bandwidth.

[Fig. 4. A three-tier topology of data center network: 4 core switches, 8 aggregation switches, 32 ToR switches, 128 computing nodes.]

[Fig. 5. A VL2 topology of data center network: 4 core switches, 4 aggregation switches, 8 ToR switches, 16 computing nodes.]

5 PERFORMANCE EVALUATION

In this section, we evaluate the performance extensively and demonstrate that our algorithm can obtain near-optimal solutions, given the availability of additional data replicas and abundant paths. We also show that in many cases even one additional replica can improve performance greatly.

5.1 Methods for Comparison

We compare two methods with ours: 1) the optimal algorithm, obtained by solving the MIP; 2) the existing method, which randomly selects nodes and paths, where a node is chosen from all available ones. In our simulations, both the MIP and the LP relaxation are solved by Gurobi [10] with the default settings.

5.2 Simulation Setups

We first introduce the network topologies and parameters, and then discuss the data retrieval setups. Our simulation testbed has three types of data center topologies: 1) a fat-tree topology built from 8-port switches; 2) a three-tier topology (as shown in Fig. 4), in which every 4 neighboring hosts are connected to a ToR switch, every 4 neighboring ToR switches are connected to an aggregation switch, and all 8 aggregation switches are connected to each core switch (4 in total); 3) a VL2 topology [4] with 8-port aggregation and 4-port core switches, as shown in Fig. 5. Both the fat-tree topology and the three-tier topology have 128 computing nodes, while the VL2 topology has 16 instead.

[Fig. 6. The performance of APX with different R and L, where the dashed lines represent the results of the optimal algorithm (red) and the random method (black). APX with R being 2 or 3 performs close to the optimal algorithm.]

We set the link bandwidths as follows: in the fat-tree and three-tier topologies each link is 1 Gbps; in the VL2 topology each server link is 1 Gbps, while each switch link is 10 Gbps, the same as in [4]. Since many different applications may share the network fabric in data centers, we simulate under two settings: full bandwidth, and partial bandwidth in the presence of background traffic. We generate background traffic by injecting flows between random pairs of computing nodes, each of which passes along a random path in between and consumes 1 Mbps. In order to ensure that each link has at least 1 Mbps remaining, we accept a background flow only if the currently available bandwidth of every link along its path is at least 2 Mbps; otherwise we reject it. The amount of remaining bandwidth depends on how many background flows are accepted. In our simulations, we inject 4 flows.

The inputs of a data retrieval problem include the locations of tasks and data objects, as well as the access relationships between them. We generate these inputs synthetically. We place data first; each data object is stored with a replication factor of 3, as follows: 1) it is first placed on a randomly chosen node; 2) its first replica is stored on a distinct node in the same rack; 3) its second replica is stored in a remote rack. This rule is the same as that in HDFS [2]. In real applications, most of the data is accessed locally, which does not affect the data retrieval time; thus in the simulations we only consider the non-local data requiring transfers. We assign tasks randomly to the nodes not holding their required data. There are two types of access relationships between tasks and data objects: 1) one-to-one, where each task accesses a unique data object; 2) many-to-one, where multiple tasks access a common data object.
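The replica placement rule described above can be sketched as follows. This is a hypothetical helper, not the simulator's code; the `racks` structure and function name are illustrative:

```python
import random

def place_replicas(racks, rng=random):
    """Place a data object and its two replicas, HDFS-style:
    the original on a random node, the first replica on a distinct
    node in the same rack, and the second replica in a randomly
    chosen remote rack. racks: list of lists of node ids.
    Returns the three storage nodes."""
    r = rng.randrange(len(racks))
    original = rng.choice(racks[r])
    first = rng.choice([n for n in racks[r] if n != original])
    remote = rng.randrange(len(racks) - 1)
    remote += remote >= r                 # skip the local rack
    second = rng.choice(racks[remote])
    return [original, first, second]
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible, which matters when results are averaged over repeated runs.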
Note that in both cases each task requires only one data object; this is because the other relationships (one-to-many and many-to-many) are special cases of our consideration. In the many-to-one setting, when a node has several tasks accessing the same data object, we assume that the node initiates only one data request and the retrieved data can be used by all of those tasks. Both the number of data objects and the number of tasks are varied in the simulations. For the one-to-one setting, the number of tasks equals the number of data objects, both varying from 1 to 1. For the many-to-one setting, the number of data objects is set to 1, and the number of tasks is varied from 1 to 1. All simulation results are averaged over 2 runs.
[Fig. 7. The reduction ratio of data retrieval time over the random method for a fat-tree topology: (a) full bandwidth and one-to-one access relationship; (b) full bandwidth and many-to-one access relationship; (c) partial bandwidth and one-to-one access relationship; (d) partial bandwidth and many-to-one access relationship.]

[Fig. 8. The percentages of each link type among the bottleneck links of the random method: (a) one-to-one access relationship; (b) many-to-one access relationship. The link types span the bottom, middle and top layers, in the uplink and downlink directions.]

[Fig. 9. The data retrieval time of an application for a fat-tree topology, using the methods discussed in this section (RmaxL1 and R1Lmax): (a) full bandwidth and many-to-one access relationship; (b) partial bandwidth and one-to-one access relationship.]

5.3 Single Application

We first evaluate the data retrieval time of a single application. For the optimal algorithm it is the objective value, while for the other algorithms it can be computed as in (22). We use the reduction ratio of data retrieval time over the random method to evaluate the performance improvement of our algorithm (i.e., (t' - t)/t', where t' is the data retrieval time of the random method and t is that of the algorithm being evaluated).

5.3.1 Fat-Tree Topology

We first simulate a fat-tree topology. We begin by running simulations to find the optimal R and L, where APX performs best. As R_max is 3 and L_max is 16 here, we try 9 pairs, where R is chosen from {1, 2, 3} and L is chosen from {1, 4, 16}, both adjusted by the preprocessing of random selections discussed in Section 4.3. We simulate a scenario having 5 tasks and 1 data objects with full bandwidth and the many-to-one access relationship. The performance of the approximation algorithm (APX for short) in the 9 settings is shown in Fig. 6, where the results of the optimal algorithm and the random method are represented by the red line and the black line, respectively.
It is observed that the APX algorithms with R being 2 or 3 perform close to the optimal solution. Thus we choose to evaluate APX with R = 2 and L = 1 (APX-R2L1 for short), besides the one without the preprocessing (APX-RmaxLmax for short). The results are shown in Fig. 7. It is observed that APX-R2L1 performs almost as well as the optimal solution, much better than the baseline, and a bit better than APX-RmaxLmax. In the topology with full bandwidth, the reduction ratio is around 3% to 13% for the one-to-one setting, while for the many-to-one setting it increases significantly (20% to 40%) as more tasks access each data object at the same time. This is because, in the baseline, when many tasks access the same set of data objects, some tasks may choose the same node to retrieve a common data object, resulting in heavy congestion near the selected node. The more tasks there are, the worse the congestion, and the longer the retrieval time. To verify the locations of the congestion in the baseline, we calculate the percentages of each link-type among the bottleneck links (i.e., the links on which the data transmission time equals the data retrieval time). There are 6 link-types, belonging to 3 layers and 2 directions. The results are shown in Fig. 8. Fig. 8b demonstrates that for the many-to-one access relationship, as the number of tasks increases, the bottlenecks mainly consist of the uplinks in the bottom layer. In comparison, for the one-to-one access relationship, congestion can happen at any layer of links, as demonstrated in Fig. 8a. For the topology with partial bandwidth, the reduction ratios are significant, as shown in Fig. 7c and Fig. 7d, i.e., roughly 40% for the one-to-one setting and 50% for the many-to-one setting. Both ratios are higher than in the case of full bandwidth, because the baseline ignores the differences of link bandwidths, in addition to the path and replica diversities, all of which should be considered in selecting nodes and paths, as our algorithm does. Next, we demonstrate that the data retrieval time cannot be minimized when replica diversity is ignored. We simulate two methods.
One fully utilizes replica diversity but randomly chooses a path for each possible flow formed by each replica (RmaxL1 for short); the other randomly chooses a replica for each request (ignoring replica diversity) but fully utilizes path diversity (R1Lmax for short). Both are formulated as MIPs and solved optimally. We compare them to the optimal solution and the baseline, and show the results in Fig. 9.

Fig. 10. The reduction ratio of data retrieval time over the baseline, for a three-tier topology with partial bandwidth. (a) is for one-to-one access relationship; (b) is for many-to-one access relationship.

Fig. 11. The reduction ratio of data retrieval time over the baseline, for a VL2 topology with partial bandwidth. (a) is for one-to-one access relationship; (b) is for many-to-one access relationship.

It is observed that RmaxL1 performs as well as the optimal solution, but R1Lmax performs much worse, even close to the baseline, in various simulation settings. This validates the motivation to exploit replica diversity. In addition, Fig. 9a illustrates that in the topology with full bandwidth, path diversity may not improve performance; Fig. 9b illustrates that in a topology with partial bandwidth, path diversity may improve performance, but not as much as replica diversity does.

5.3.2 Three-tier Topology

We also simulate for a three-tier topology, where R_max is 3 and L_max is 4. We simulate APX-RmaxLmax and the APX with two replicas and one path randomly selected in the preprocessing (i.e., APX-R2L1), as above. We show the reduction ratios in Fig. 10. The results are for a topology with partial bandwidth; the results for the topology with full bandwidth are similar and are omitted due to limited space. It is demonstrated that both APX algorithms perform almost as well as the optimal solution, around 6% to 17% better than the baseline in the one-to-one setting, and around 20% to 32% better in the many-to-one setting. The reduction ratios in both cases are lower than those for the fat-tree topology, because in the three-tier topology the difference made by replica or path selections is not as significant as in the fat-tree topology. Specifically, in the three-tier topology, congestion always happens on the links between the ToR layer and the aggregation layer due to oversubscribed bandwidths, since each of those links is shared by the four nodes connected to its ToR switch. To mitigate congestion, we can select replicas such that the traffic on those congested links is balanced.
However, such mitigation is smaller than for the fat-tree topology, because two replicas in the same rack share a common link between the ToR and aggregation layers, leading to the same traffic on it. In addition, we cannot mitigate congestion by selecting paths, because the data transfers below the aggregation layer have only one path to use.

5.3.3 VL2 Topology

For the VL2 topology as shown in Fig. 5, we simulate APX-R2L1 and APX-RmaxLmax, where R_max is 3 and L_max is 16. We show the reduction ratios in Fig. 11. The results are for a topology with partial bandwidth; the results for the topology with full bandwidth are similar and are omitted. It is demonstrated that both APX algorithms perform almost as well as the optimal solution, around 25% to 50% better than the baseline in the one-to-one setting, and around 30% to 60% better in the many-to-one setting. The results are similar to those in the fat-tree topology, since both topologies have richly connected links and full-bisection bandwidth.

Fig. 12. The performance of APX with different R and L, where the dashed lines represent the results of the optimal solution (red) and the baseline (black). The APX with R = 2 and L = 1 performs closest to the optimal solution.

5.4 Multiple Applications

In this section, we evaluate the performance of multiple applications. As in the case of a single application, we take the optimal solution and the baseline for comparison. The worst penalty among all applications is used for evaluation. For the optimal solution, it is obtained directly from the objective value, while for APX and the baseline it can be computed as in (26).

5.4.1 Fat-tree Topology

We simulate two applications with separate data retrieval setups, each generated as introduced in Section 5.2. We set different upper bounds on their data retrieval times. The optimal data retrieval time in the case of a single application (obtained from the optimal solution) is used as a baseline value: one upper bound is set to 1.2 times and the other to 1.8 times the baseline value. The settings are listed in Table 1, averaged over 2 runs. We also start by running simulations to find the R and L with which APX performs best.
We try 9 pairs of R and L, where R is chosen from {1, 2, 3} and L is chosen from {1, 4, 16}. We simulate a scenario of two applications, each having 5 tasks and 1 data objects, with the many-to-one access relationship under full bandwidth. The performance of APX in the 9 settings is shown in Fig. 12, where the results of the optimal solution and the baseline are represented by the red line and the black line, respectively. It is observed that the APX with R = 2 and L = 1 performs closest to the optimal solution. Thus we choose to evaluate APX with this setting (i.e., APX-R2L1), besides the one without the preprocessing (i.e., APX-RmaxLmax).
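The worst-penalty metric used in this section can be sketched as follows. The paper's exact definition is the one in (26); as a labeled simplifying assumption, we take each application's penalty as the relative excess of its retrieval time over its upper bound, floored at zero.

```python
def worst_penalty(times, bounds):
    """Worst penalty among applications. Assumed definition (not the
    paper's exact formula): each application's penalty is the relative
    excess of its retrieval time over its upper bound, floored at 0."""
    assert len(times) == len(bounds)
    return max(max(0.0, (t - u) / u) for t, u in zip(times, bounds))

# Two applications with upper bounds 1.2x and 1.8x of a 10 s baseline.
print(worst_penalty([15.0, 16.0], [12.0, 18.0]))  # 0.25: app 1 exceeds its bound by 25%
```

The second application stays within its bound, so the worst penalty is driven entirely by the first.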
TABLE 1
The settings of the upper bounds on data retrieval time (s) in the simulation of two applications, for increasing numbers of tasks.

fat-tree, partial, many-to-one: (6, 9) (7, 11) (9, 13) (13, 20) (16, 24) (17, 25) (19, 29) (22, 32) (23, 35) (26, 38)
fat-tree, partial, one-to-one: (5, 8) (8, 12) (12, 18) (15, 23) (16, 24) (23, 35) (23, 35) (26, 39) (29, 44) (31, 47)
fat-tree, full, many-to-one: (2, 4) (3, 5) (4, 6) (5, 8) (6, 9) (7, 10) (8, 11) (8, 12) (9, 13) (9, 14)
fat-tree, full, one-to-one: (2, 4) (4, 5) (4, 6) (5, 8) (6, 9) (7, 10) (8, 12) (9, 13) (9, 13) (11, 16)
three-tier, partial, many-to-one: (37, 56) (64, 96) (90, 135) (112, 169) (137, 206) (155, 233) (171, 256) (194, 291) (214, 321) (239, 359)
three-tier, partial, one-to-one: (38, 57) (62, 94) (92, 138) (112, 168) (136, 204) (157, 235) (181, 271) (207, 311) (228, 342) (241, 361)
VL2, partial, many-to-one: (4, 6) (6, 10) (8, 12) (8, 13) (10, 15) (11, 17) (13, 19) (15, 23) (16, 23) (15, 23)
VL2, partial, one-to-one: (4, 6) (7, 10) (8, 12) (10, 15) (10, 15) (14, 21) (15, 23) (19, 28) (20, 29) (22, 33)

Fig. 13. The worst penalty of two applications for a fat-tree topology. (a) is for full bandwidth and one-to-one access relationship; (b) is for full bandwidth and many-to-one access relationship; (c) is for partial bandwidth and one-to-one access relationship; (d) is for partial bandwidth and many-to-one access relationship.

Next we evaluate the performance in the two-application scenario; the results are shown in Fig. 13. For full bandwidth, as shown in Fig. 13a and Fig. 13b, it is observed that APX-R2L1 performs almost as well as the optimal solution, while the baseline is much worse, with APX-RmaxLmax in between. In the one-to-one setting, APX-R2L1 reduces penalty by roughly 15 percentage points, and in the many-to-one setting it reduces penalty by 2 to 6 percentage points. APX-RmaxLmax is usually better than the baseline, and is comparable with it in some settings with few tasks. For partial bandwidth, as shown in Fig. 13c and Fig.
13d, it is observed that the APX algorithms with both parameter settings perform close to the optimal solution, and significantly better than the baseline. In the one-to-one setting, both APXs reduce penalty by 25 to 90 percentage points, and in the many-to-one setting they reduce penalty by 7 to 20 percentage points. We also demonstrate that, for the case of multiple applications, penalty cannot be minimized if we ignore replica diversity. We simulate RmaxL1 and R1Lmax (discussed in Section 5.3.1), and compare them to the optimal solution and the baseline. Recall that in R1Lmax, replica diversity is ignored, as a data replica is randomly selected. The results for the fat-tree topology with full bandwidth and the many-to-one access setting are shown in Fig. 14; the results for other settings are similar and are omitted due to limited space. It is observed that RmaxL1 performs as well as the optimal solution, but R1Lmax performs much worse, similar to the baseline. This validates our motivation to exploit replica diversity.

Fig. 14. The worst penalty of two applications for the fat-tree topology with full bandwidth and many-to-one access relationship, using the methods discussed in Section 5.3.1.

5.4.2 Three-tier Topology and VL2 Topology

We also simulate the case of two applications in a three-tier topology and a VL2 topology. The upper bounds on their data retrieval times are set as discussed previously and listed in Table 1. The simulation results for the three-tier topology are shown in Fig. 15, and those for the VL2 topology are shown in Fig. 16. We only show the results for partial bandwidth; the results for full bandwidth are similar and are omitted due to limited space. It is observed that both APX algorithms perform close to the optimal solution, and much better than the baseline. For the three-tier topology, the APX algorithms reduce penalty by 5 to 20 percentage points in the one-to-one setting, and by 2 to 5 percentage points in the many-to-one setting. For the VL2 topology, they also reduce penalty considerably in both settings.
All of these results demonstrate that our algorithm is effective.

6 RELATED WORK

We review the existing works related to the data retrieval problem, and categorize them into three groups: traffic scheduling (at three levels), replica selection, and joint replica selection and routing.

Packet-level Traffic Scheduling. Dixit et al. in [11] argued that packet-level traffic splitting, where packets of a
11 Fig. 15. The worst penalty of two applications for a threetier topology with partial bandwidth. is for onetoone access relationship; is for manytoone access relationship. Fig. 16. The worst penalty of two applications for a VL2 topology with partial bandwidth. is for onetoone access relationship; is for manytoone access relationship. flow are sprayed through all available paths, would lead to a better loadbalanced network and much higher throughput compared to ECMP. Tso et al. in [12] proposed to improve link utilization by implementing penalizing exponential flowspliting algorithm in data center. Flowlevel Traffic Scheduling. Greenberg et al. in [4] proposed using perflow Valiant Load Balancing to spread traffic uniformly across network paths. Benson et al. in [13] developed a technique, MicroTE, that leverages the existence of shortterm predictable traffic to mitigate the impact of congestion due to the unpredictable traffic. Al Fares et al. in [6] proposed a flow scheduling system, exploiting path diversity in data center, to avoid path collisions. Cui et al. in [14] proposed Distributed Flow Scheduling (DiFS) to balance flows among different links and improves bandwidth utilization for data center networks. Alizadeh et al. in [15] proposed a very simple design that decouples flow scheduling from rate control, to provide nearoptimal performance. Joblevel Traffic Scheduling. Chowdhury et al. in [7] proposed a global network management architecture and algorithms to improve data transfer time in cluster computing. They focused on the massive data transfer between successive processing stages, such as shuffle between the map and reduce stages in MapReduce. Dogar et al. in [16] designed a decentralized taskaware scheduling system to reduce task completion time for data center applications, by grouping flows of a task and scheduling them together. 
Although these traffic scheduling methods can be used to schedule the flows in data retrievals, they do not optimize the replica selections for flows, which is our focus.

Replica Selection in Data Grids. AL-Mistarihi et al. in [17] researched the replica selection problem in a Grid environment, deciding which replica location is the best for Grid users, with the aim of establishing fairness among users in selections. Rahman et al. in [18] proposed replica selection strategies to minimize access latency by selecting the best replica. These works do not consider the impact of route selections on data transfer time.

Joint Replica Selection and Routing. Valancius et al. in [19] designed a system that performs joint content and network routing for dynamic online services. The system controls both the selection of replicas and the routes between the clients and their associated replicas, and they demonstrated the benefits of joint optimization. With similar goals, Narayana et al. in [20] proposed to coordinate the modular mapping and routing systems already owned by an OSP, to achieve the global performance and cost goals of a joint system. Xu et al. in [21] distributed the joint optimization for scale. These works for wide area networks are inapplicable to data center networks, because data centers have much different traffic and data access patterns.

7 CONCLUSION

In this paper, we investigate the data retrieval problem in DCNs, that is, to jointly select data replicas and paths for concurrent data transfers such that the data retrieval time is minimized (i.e., the throughput is maximized). We propose an approximation algorithm to solve the problem, with an approximation ratio of RL, where R is the replication factor of data and L is the largest number of candidate paths. We also solve the data retrieval problem for the case of multiple applications, keeping fairness among them. The simulations demonstrate that our algorithm can obtain near-optimal performance with the best R and L.
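The joint selection problem summarized above can be illustrated on a toy instance by exhaustive search: for each request, pick one replica and one of its candidate paths so that the most congested link is as lightly loaded as possible. This is only a sketch under simplifying assumptions (unit-size data, unit link bandwidths, retrieval time modeled as the load on the most congested link); the instance, the task and replica names, and the link ids are all made up, and the paper's actual method is the MIP-based approximation algorithm, not brute force.

```python
from itertools import product

# Toy instance (made up): each request can be served from any replica,
# and each replica offers a set of candidate paths (lists of link ids).
requests = {
    "task1": {"replicaA": [["l1", "l3"], ["l2", "l3"]],
              "replicaB": [["l4", "l3"]]},
    "task2": {"replicaA": [["l1", "l3"]],
              "replicaC": [["l5", "l6"]]},
}
data_size = 1.0   # every data object has unit size
bandwidth = 1.0   # every link has unit bandwidth

def retrieval_time(assignment):
    """Bottleneck transfer time: max over links of load / bandwidth."""
    load = {}
    for path in assignment:
        for link in path:
            load[link] = load.get(link, 0.0) + data_size
    return max(load.values()) / bandwidth

# Enumerate every joint choice of (replica, path) per request and keep
# the one minimizing the bottleneck time.
choices = [
    [path for paths in reqs.values() for path in paths]
    for reqs in requests.values()
]
best = min(product(*choices), key=retrieval_time)
print(retrieval_time(best))  # 1.0: the chosen flows share no link
```

Here ignoring replica diversity can hurt: if task2 were forced onto replicaA, it would share link l3 with task1 and double the bottleneck time, which mirrors the RmaxL1 vs. R1Lmax comparison in the evaluation.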
ACKNOWLEDGMENTS
This work was supported by a grant from the Research Grants Council of Hong Kong [Project No. CityU ].

REFERENCES
[1] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," Commun. ACM, vol. 51, no. 1, Jan. 2008.
[2] D. Borthakur, "HDFS architecture guide," Hadoop Apache Project, 2008.
[3] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu, "BCube: A high performance, server-centric network architecture for modular data centers," in SIGCOMM, 2009.
[4] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta, "VL2: A scalable and flexible data center network," in SIGCOMM, 2009.
[5] C. Hopps, "Analysis of an equal-cost multi-path algorithm," RFC 2992, IETF, 2000.
[6] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat, "Hedera: Dynamic flow scheduling for data center networks," in NSDI, 2010.
[7] M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, "Managing data transfers in computer clusters with Orchestra," in SIGCOMM, 2011.
[8] J. L. Gross, J. Yellen, and P. Zhang, Handbook of Graph Theory, 2nd ed. Chapman & Hall/CRC, 2013.
[9] R. J. Vanderbei, Linear Programming: Foundations and Extensions, 1996.
[10] Gurobi Optimization, Inc., "Gurobi optimizer reference manual," 2015. [Online].
[11] A. Dixit, P. Prakash, and R. R. Kompella, "On the efficacy of fine-grained traffic splitting protocols in data center networks," in SIGCOMM, 2011.
[12] F. P. Tso and D. Pezaros, "Improving data center network utilization using near-optimal traffic engineering," IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 6, June 2013.
[13] T. Benson, A. Anand, A. Akella, and M. Zhang, "MicroTE: Fine grained traffic engineering for data centers," in Proceedings of the Seventh Conference on emerging Networking EXperiments and Technologies (CoNEXT), 2011.
[14] W. Cui and C. Qian, "DiFS: Distributed flow scheduling for adaptive routing in hierarchical data center networks," in Proceedings of the Tenth ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2014.
[15] M. Alizadeh, S. Yang, M. Sharif, S. Katti, N. McKeown, B. Prabhakar, and S. Shenker, "pFabric: Minimal near-optimal datacenter transport," in SIGCOMM, 2013.
[16] F. R. Dogar, T. Karagiannis, H. Ballani, and A. Rowstron, "Decentralized task-aware scheduling for data center networks," in SIGCOMM, 2014.
[17] H. H. E. AL-Mistarihi and C. H. Yong, "On fairness, optimizing replica selection in data grids," IEEE Trans. Parallel Distrib. Syst., vol. 20, no. 8, 2009.
[18] R. M. Rahman, R. Alhajj, and K. Barker, "Replica selection strategies in data grid," Journal of Parallel and Distributed Computing, vol. 68, no. 12, 2008.
[19] V. Valancius, B. Ravi, N. Feamster, and A. C. Snoeren, "Quantifying the benefits of joint content and network routing," in SIGMETRICS, 2013.
[20] S. Narayana, W. Jiang, J. Rexford, and M. Chiang, "Joint server selection and routing for geo-replicated services," in Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, 2013.
[21] H. Xu and B.
Li, "Joint request mapping and response routing for geo-distributed cloud services," in INFOCOM, 2013.

Ruitao Xie received her PhD degree in Computer Science from City University of Hong Kong in 2014, and her BEng degree from Beijing University of Posts and Telecommunications in 2008. She is currently a senior research associate in the Department of Computer Science at City University of Hong Kong. Her research interests include cloud computing, distributed systems and wireless sensor networks.

Xiaohua Jia received his BSc (1984) and MEng (1987) from the University of Science and Technology of China, and DSc (1991) in Information Science from the University of Tokyo. He is currently Chair Professor with the Department of Computer Science at City University of Hong Kong. His research interests include cloud computing and distributed systems, computer networks and mobile computing. Prof. Jia is an editor of IEEE Internet of Things, IEEE Trans. on Parallel and Distributed Systems (2006-2009), Wireless Networks, Journal of World Wide Web, Journal of Combinatorial Optimization, etc. He is the General Chair of ACM MobiHoc 2008, TPC Co-Chair of the IEEE GlobeCom 2010 Ad Hoc and Sensor Networking Symposium, and Area Chair of IEEE INFOCOM 2010 and 2015. He is a Fellow of IEEE.