Improved Routing in the Data Centre Networks HCN and BCN




Iain A. Stewart
School of Engineering and Computing Sciences, Durham University, South Road, Durham DH1 3LE, U.K.
Email: i.a.stewart@durham.ac.uk

Abstract

We present improved one-to-one routing algorithms in the data centre networks HCN and BCN, in that our routing algorithms result in much shorter paths when compared with existing algorithms. We also present a much tighter analysis of HCN and BCN by observing that there is a very close relationship between the data centre networks HCN and the interconnection networks known as WK-recursive networks. We use existing results for WK-recursive networks to prove the optimality of our new routing algorithm for HCN and also to significantly aid the implementation of our routing algorithms in both HCN and BCN.

Keywords: data centre networks; HCN; BCN; one-to-one routing; WK-recursive networks.

I. INTRODUCTION

The traditional architecture of a data centre network (DCN) is switch-centric, whereby the primary structure is a topology (almost always tree-based) of switches, with the switches possessing interconnection intelligence. The DCNs Fat-Tree [], VL2 [5] and Portland [] are typical of such DCNs. A more recent and alternative architecture is server-centric, whereby the interconnection intelligence resides within the servers and the switches are dumb crossbars (so, there are no switch-switch links). The DCNs DCell [7], FiConn [9], BCube [6], MDCube [] and HCN and BCN [8] are typical of server-centric DCNs.
The server-centric architecture possesses a number of advantages when compared with the switch-centric architecture: the underlying topologies are better suited than the switch-centric tree-based topologies to support traffic patterns prevalent in data centres (such as one-to-all and all-to-all); the switches can be chosen to be commodity switches, as they require no intelligence; and multiple network interface controller (NIC) ports on servers can be utilized so that more varied topologies can be constructed (see, for example, [3], [8], [0]).

Whilst multiple NIC ports can be used when building DCNs, commodity servers usually have only a small number of NIC ports, often only two. Motivated by the desire to use commodity servers only, Guo, Chen, Li, Li, Liu and Chen introduced and evaluated the DCNs HCN and BCN [8]. The general construction is that the DCNs HCN are recursively-defined networks, with the DCNs BCN built using DCNs from HCN by including an additional layer of interconnecting links. A number of routing algorithms (including one-to-one, multipath and fault-tolerant algorithms) were developed and evaluated, primarily in comparison with FiConn and according to a number of basic metrics.

We pursue the analysis of the DCNs HCN and BCN in this paper. In particular, we present significantly improved one-to-one routing algorithms in both HCN and BCN, in that our routing algorithms result in much shorter paths than those in [8]. We also present a much tighter analysis of HCN and BCN by observing that there is a close relationship between the DCNs HCN and the interconnection networks known as WK-recursive networks, which originated in [4] and which have been well studied as general interconnection networks. We use existing results concerning WK-recursive networks to prove the optimality of our new routing algorithm for HCN and to significantly aid the implementation of our routing algorithms in both HCN and BCN.

II. THE DCNS HCN

In this section we define the DCN HCN($n,h$) from [8], where $n$ and $h \geq 0$: the parameter $n$ is the degree of the base $n$-star in the recursive construction (the base $n$-star takes the form of a switch-node with $n$ adjacent server-nodes); and the parameter $h$ is the depth of the recursion (we reiterate that in the server-centric DCN architecture, all DCNs consist of a mix of switch-nodes and server-nodes so that every switch-node is adjacent only to server-nodes). For clarity, we give full definitions of the complex DCNs HCN and BCN. We then place these definitions within the context of WK-recursive networks, first defined in [4].

A. The recursive construction

We begin with a base DCN $G^0$ consisting of an $n$-star with the hub-node $0$ being the solitary switch-node and the other nodes $1,2,\ldots,n$ being server-nodes. We fix $\alpha$ and $\beta \geq 0$ so that $\alpha+\beta = n$: the nodes $1,2,\ldots,\alpha$ are called the master-nodes; and the nodes $\alpha+1,\alpha+2,\ldots,n$ the slave-nodes. We suppress reference to slave-nodes below.

We next take $\alpha$ disjoint copies of $G^0$, namely $G^0_1,G^0_2,\ldots,G^0_\alpha$, and refer to (master-) node $j$ of $G^0_i$ as node $(i,j)$, for $i,j \in \{1,2,\ldots,\alpha\}$ (in what follows, all indices come from $\{1,2,\ldots,\alpha\}$, and switch-nodes and slave-nodes play no role in the construction). For $i,j \in \{1,2,\ldots,\alpha\}$, where $i \neq j$, we join nodes $(i,j) \in G^0_i$ and $(j,i) \in G^0_j$ via an additional link. Note that no additional link involves any node of $\{(i,i) : i = 1,2,\ldots,\alpha\}$. Denote the resulting network by $G^1$, with master-nodes and switch-nodes (as well as slave-nodes) inherited from $G^0_1,G^0_2,\ldots,G^0_\alpha$ but so that any node $(i,j) \in G^0_i$, where $i \neq j$, becomes a used-node in $G^1$. We call the $\alpha(\alpha-1)$ used-nodes the used-nodes at level 1 and the additional links we introduced the level 1 links. New links introduced in the subsequent construction are not incident with used-nodes.

We can iterate the process above as follows. Take $\alpha$ disjoint copies of $G^1$, namely $G^1_1,G^1_2,\ldots,G^1_\alpha$, and refer to node $(j,k)$ of $G^1_i$ as node $(i,j,k)$. Note that each copy $G^1_i$ has $\alpha$ master-nodes. For $i,j \in \{1,2,\ldots,\alpha\}$, where $i \neq j$, we join nodes $(i,j,j)$ and $(j,i,i)$ via an additional link. Denote the resulting network by $G^2$. Note that any node $(i,i,i)$ has degree 1 and any node $(i,j,k)$ where it is not the case that $i = j = k$ has degree 2. In $G^2$, the switch-nodes and used-nodes (at level 1) are inherited from $G^1_1,G^1_2,\ldots,G^1_\alpha$ (as are the slave-nodes) but any master-node $(i,j,j) \in G^1_i$, where $i \neq j$, becomes a used-node in $G^2$. We call the $\alpha(\alpha-1)$ newly-designated used-nodes the used-nodes at level 2 and the additional links we introduced the level 2 links. As before, new links introduced in the subsequent construction are not incident with used-nodes.

We proceed similarly to construct $G^3,G^4,\ldots,G^h$ and obtain used-nodes and links at levels $3,4,\ldots,h$. We refer to the identification of a (used- or master-) node of some $G^i$ as a tuple of $i+1$ digits as the index of the node; indeed, henceforth we equate a node with its index. The DCN HCN($n,h$) is defined to be $G^h$. Note that given some index $(i_h,i_{h-1},\ldots,i_1,i_0)$ of a master-node or a used-node, the index of the switch-node to which it is adjacent can be obtained as $(i_h,i_{h-1},\ldots,i_1,0)$.

The DCN HCN(7,2) can be visualized as in Fig. 1, where $\alpha = 4$ and $\beta = 3$. The slave-nodes are in white, the master- and used-nodes are in black, and the index of any master-node or used-node is obtained by replacing the right-most component of the index of the adjacent switch-node by the node's number from $\{1,2,3,4\}$.
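The recursive construction above can be summarised by a rule on indices. The following is a minimal Python sketch, not code from [8] or from this paper: the function names `switch_index`, `level_link` and `census` are hypothetical, and the rule it encodes (a node whose index ends in a maximal run of $j$ copies of $b$ preceded by $a \neq b$ is joined by a level $j$ link to the node obtained by swapping the roles of $a$ and $b$) is our reading of the construction.

```python
from collections import Counter
from itertools import product


def switch_index(index):
    """Index of the switch-node adjacent to a master- or used-node."""
    return index[:-1] + (0,)


def level_link(index):
    """Return (level, partner) for a used-node, or None for a master-node.

    index = (i_h, ..., i_1, i_0); the node (..., a, b, ..., b) with a != b and
    a maximal trailing run of j copies of b is joined to (..., b, a, ..., a).
    """
    b = index[-1]
    run = 1
    while run < len(index) and index[-1 - run] == b:
        run += 1
    if run == len(index):
        return None  # all components equal: a master-node, no extra link
    a = index[-1 - run]
    prefix = index[: len(index) - run - 1]
    return run, prefix + (b,) + (a,) * run


def census(alpha, h):
    """Count nodes per link level (level 0 stands for master-nodes)."""
    counts = Counter()
    for index in product(range(1, alpha + 1), repeat=h + 1):
        link = level_link(index)
        counts[link[0] if link else 0] += 1
    return counts


if __name__ == "__main__":
    print(level_link((1, 2, 2)))   # level 2 link between (1,2,2) and (2,1,1)
    print(census(4, 2))            # master-nodes and used-nodes per level
```

With $\alpha = 4$ and $h = 2$ the census agrees with the counts in the text: 4 master-nodes, $\alpha(\alpha-1) = 12$ used-nodes at level 2, and 12 used-nodes at level 1 in each of the 4 copies of $G^1$.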
In general, the index of a used- or master-node in HCN($n,h$) is obtained by replacing the right-most component of the index of the adjacent switch-node by the node's number from $\{1,2,\ldots,\alpha\}$.

Figure 1. The network HCN(7,2).

B. Recursive structure and node enumerations

Within HCN($n,h$), there is a natural indexing of the slave-nodes as
$$\{(i_h,i_{h-1},\ldots,i_1,y) : i_h,i_{h-1},\ldots,i_1 \in \{1,2,\ldots,\alpha\},\ y \in \{\alpha+1,\alpha+2,\ldots,n\}\}.$$
So, all nodes of HCN($n,h$) of index $(i_h,i_{h-1},\ldots,i_1,z)$, for some fixed $i_h,i_{h-1},\ldots,i_1 \in \{1,2,\ldots,\alpha\}$ (with $z \in \{0,1,2,\ldots,n\}$), induce a copy of $G^0$. Similarly, if $0 \leq \gamma < h$ and we fix $i_h,i_{h-1},\ldots,i_{\gamma+1} \in \{1,2,\ldots,\alpha\}$, then all nodes of HCN($n,h$) of index $(i_h,i_{h-1},\ldots,i_{\gamma+1},j_\gamma,j_{\gamma-1},\ldots,j_1,z)$, where $j_\gamma,j_{\gamma-1},\ldots,j_1 \in \{1,2,\ldots,\alpha\}$ and where $z \in \{0,1,\ldots,n\}$, induce a copy of HCN($n,\gamma$). Note that there are $\alpha^{h-\gamma}$ such copies of HCN($n,\gamma$) within HCN($n,h$), with the copy above identified by the tuple $(i_h,i_{h-1},\ldots,i_{\gamma+1})$. These are the canonical copies of HCN($n,\gamma$) in HCN($n,h$).

So far, we have identified nodes with indices. However, we also refer to nodes by their names. Suppose that $(i_h,i_{h-1},\ldots,i_1,i_0) \in \{1,2,\ldots,\alpha\}^{h+1}$ is the index of some master- or used-node of HCN($n,h$). We say that this node has name
$$\mathrm{id}((i_h,i_{h-1},\ldots,i_1,i_0)) = \sum_{l=0}^{h} (i_l-1)\alpha^l + 1$$
(we suppress the parameters $\alpha$ and $h$ in the denotation of the function $\mathrm{id}$). The function $\mathrm{id}$ is clearly a bijection from the set of master- and used-nodes of HCN($n,h$) to the set $\{1,2,\ldots,\alpha^{h+1}\}$. The function $\mathrm{id}$ can also be used to name the copies of HCN($n,\gamma$) within HCN($n,h$), where $\gamma < h$, as $\{1,2,\ldots,\alpha^{h-\gamma}\}$, and also the switch-nodes of HCN($n,h$) as $\{1,2,\ldots,\alpha^h\}$, by stripping away the rightmost component of the index of any switch-node.

Consider a slave-node $(i_h,i_{h-1},\ldots,i_1,y)$ in HCN($n,h$), where $i_h,i_{h-1},\ldots,i_1 \in \{1,2,\ldots,\alpha\}$ and $y \in \{\alpha+1,\alpha+2,\ldots,n\}$. We define the function $\mathrm{id}'$ as
$$\mathrm{id}'((i_h,i_{h-1},\ldots,i_1,y)) = (\mathrm{id}((i_h,i_{h-1},\ldots,i_1))-1)\beta + (y-\alpha)$$
(again, $\alpha$, $\beta$ and $h$ are suppressed). This function $\mathrm{id}'$ is a bijection from the set of slave-nodes of HCN($n,h$) to the set $\{1,2,\ldots,\alpha^h\beta\}$.

C. WK-recursive networks

As is stated in [8], if two (master- or used-) nodes of HCN($n,h$) are adjacent to the same switch-node then it is assumed that the length of a shortest path joining these two nodes is 1 (the same assumption is also adopted as regards the analysis of DCell [7], FiConn [9], BCube [6] and MDCube []). This is equivalent to removing every switch-node from HCN($n,h$) and assuming that the master-nodes and used-nodes adjacent to some switch-node are joined as a clique (of $\alpha$ nodes). What remains is a WK-recursive network, as first defined in [4] (recall that we are ignoring slave-nodes). WK-recursive networks have been extensively studied and, as we shall see later, we can use the analysis of these networks in order to better understand the topological properties of the DCNs HCN and BCN.

More formally, the WK-recursive network WK($\alpha,h$) is defined so that:
- it has node-set $\{1,2,\ldots,\alpha\}^{h+1}$;
- there are links $((i_h,i_{h-1},\ldots,i_2,i_1,x),(i_h,i_{h-1},\ldots,i_2,i_1,x'))$, for $i_1,i_2,\ldots,i_h,x,x' \in \{1,2,\ldots,\alpha\}$, where $x \neq x'$; and
- there are links $((i_h,i_{h-1},\ldots,i_{j+1},i_j,\underbrace{i'_j,\ldots,i'_j}_{j \text{ times}}),(i_h,i_{h-1},\ldots,i_{j+1},i'_j,\underbrace{i_j,\ldots,i_j}_{j \text{ times}}))$, for $j \in \{1,2,\ldots,h\}$ and for $i_h,i_{h-1},\ldots,i_{j+1},i_j,i'_j \in \{1,2,\ldots,\alpha\}$, where $i_j \neq i'_j$.

III. THE DCNS BCN

We now explain how the DCNs BCN from Sections 3. and 3.3 of [8] can be constructed from the DCNs HCN of the previous section. Whereas it was the master-nodes (of the $n$-stars) that were used to build the DCNs HCN, now we construct the DCNs BCN using the slave-nodes. Let $h \geq 0$, $\gamma \geq 0$, $\alpha$ and $n = \alpha+\beta$ be given.

Case (a): BCN($\alpha,\beta,h,\gamma$), where $h < \gamma$. The network BCN($\alpha,\beta,h,\gamma$) is defined to be HCN($n,h$); so, there are $\beta$ slave-nodes hanging off each switch-node (as in Fig. 1; the parameter $\gamma$ plays no role when $h < \gamma$).

Case (b): BCN($\alpha,\beta,h,h$). Set $s = \alpha^h\beta$ (that is, the total number of slave-nodes in HCN($n,h$)). In order to construct BCN($\alpha,\beta,h,h$), we take $s+1$ disjoint copies of HCN($n,h$), denoted $B_0,B_1,\ldots,B_s$, and build the network $K_{s+1}(B_0,B_1,\ldots,B_s)$, where $K_{s+1}$ is the clique on $s+1$ nodes, as follows: we add additional links to ensure that every slave-node of any $B_i$ is joined to exactly 1 slave-node of some other $B_{i'}$, so that if two slave-nodes of $B_i$ are joined to slave-nodes in $B_{i'}$ and $B_{i''}$ then $i' \neq i''$.

There are various ways to implement the above construction and two were highlighted in [8]. Consider some slave-node $(i_h,i_{h-1},\ldots,i_1,y)$ in $B_u$ and identify it as $(u,v)$, where $u \in \{0,1,\ldots,s\}$ and $v = \mathrm{id}'((i_h,i_{h-1},\ldots,i_1,y)) \in \{1,2,\ldots,s\}$ (recall that $\mathrm{id}'$ was defined in Section II-B).
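The naming functions of Section II-B, on which the pair $(u,v)$ above relies, can be sketched in Python as follows. This is an illustrative sketch only: the function names `node_id` and `slave_id` are ours, and the formulas are taken to be $\mathrm{id}((i_h,\ldots,i_0)) = \sum_{l=0}^{h}(i_l-1)\alpha^l+1$ and $\mathrm{id}'((i_h,\ldots,i_1,y)) = (\mathrm{id}((i_h,\ldots,i_1))-1)\beta+(y-\alpha)$.

```python
from itertools import product


def node_id(index, alpha):
    """Name of a master-/used-node, or of a switch-node or canonical copy.

    index lists components from left (most significant) to right; each
    component lies in 1..alpha. Evaluated by Horner's rule.
    """
    total = 0
    for component in index:
        total = total * alpha + (component - 1)
    return total + 1


def slave_id(index, alpha, beta):
    """Name of the slave-node (i_h, ..., i_1, y), with y in alpha+1..alpha+beta."""
    *switch, y = index
    return (node_id(tuple(switch), alpha) - 1) * beta + (y - alpha)


if __name__ == "__main__":
    alpha, beta, h = 4, 3, 2  # the parameters of HCN(7,2)
    names = {
        slave_id(prefix + (y,), alpha, beta)
        for prefix in product(range(1, alpha + 1), repeat=h)
        for y in range(alpha + 1, alpha + beta + 1)
    }
    # Every name in 1..alpha^h * beta is hit exactly once.
    print(names == set(range(1, alpha**h * beta + 1)))
    print(slave_id((4, 4, 7), alpha, beta))  # the last slave-node
```

For HCN(7,2) this names the slave-nodes $1,2,\ldots,48$, with the three slave-nodes of each switch-node receiving consecutive names.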
The first method from [8] proceeds as follows: we join the slave-node $(u,v)$ of $B_u$ to the slave-node $(v-1,u)$ of $B_{v-1}$, if $u \geq v$, and to the slave-node $(v,u+1)$ of $B_v$, if $u < v$. Call this construction slave-construction-1.

The second method of adding additional links that was highlighted in [8] proceeds as follows. Define the maps $f,g : \{0,1,\ldots,s\} \times \{1,2,\ldots,s\} \to \{0,1,\ldots,s\}$ by $f(u,v) = (u+v) \bmod (s+1)$ and $g(u,v) = s+1-v$. It is not difficult to show that the map $(u,v) \mapsto (f(u,v),g(u,v))$ yields a required set of additional links through joining the slave-node $(u,v)$ of $B_u$ to the slave-node $(f(u,v),g(u,v))$ of $B_{f(u,v)}$. Call this construction slave-construction-2.

Case (c): BCN($\alpha,\beta,h,\gamma$), where $h > \gamma$. We set $s = \alpha^\gamma\beta$ (that is, the number of slave-nodes in a copy of HCN($n,\gamma$)) and again we take $s+1$ disjoint copies of HCN($n,h$), denoted $B_0,B_1,\ldots,B_s$. Note that each $B_u$ contains within it $t = \alpha^{h-\gamma}$ (disjoint) canonical copies of HCN($n,\gamma$), each identified by some unique index $(i_h,i_{h-1},\ldots,i_{\gamma+1}) \in \{1,2,\ldots,\alpha\}^{h-\gamma}$ (see Section II-B). Thus, any HCN($n,\gamma$) can be identified by a pair $(u,v)$, where $u \in \{0,1,\ldots,s\}$ and where $v = \mathrm{id}((i_h,i_{h-1},\ldots,i_{\gamma+1})) \in \{1,2,\ldots,t\}$ (recall that $\mathrm{id}$ was defined in Section II-B). Furthermore, any slave-node of this copy of HCN($n,\gamma$), indexed as $(i_h,i_{h-1},\ldots,i_1,y)$, can be identified by a triple $(u,v,w)$ with $u$ and $v$ as above and where $w = \mathrm{id}'((i_\gamma,i_{\gamma-1},\ldots,i_1,y)) \in \{1,2,\ldots,s\}$.

Figure 2. The network BCN($\alpha,\beta,h,\gamma$) when $h > \gamma$.

Fix $(i_h,i_{h-1},\ldots,i_{\gamma+1}) \in \{1,2,\ldots,\alpha\}^{h-\gamma}$ so as to define $v = \mathrm{id}((i_h,i_{h-1},\ldots,i_{\gamma+1})) \in \{1,2,\ldots,t\}$. In every $B_u$, there is one copy of HCN($n,\gamma$) identified by $(u,v)$. Denote this copy of HCN($n,\gamma$) by $B_u^v$. Build the network $K_{s+1}(B_0^v,B_1^v,\ldots,B_s^v)$ as we did above in Case (b). Moreover, do this for every $v \in \{1,2,\ldots,t\}$. What results is the DCN BCN($\alpha,\beta,h,\gamma$).
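The two ways of adding bridge-links in Case (b) can be checked mechanically. The sketch below is illustrative only and its function names are ours. It assumes that slave-construction-1 joins $(u,v)$ to $(v-1,u)$ when $u \geq v$ and to $(v,u+1)$ when $u < v$, and that slave-construction-2 joins $(u,v)$ to $((u+v) \bmod (s+1),\ s+1-v)$. The checker verifies the required property: each slave-node gets exactly one bridge-link, and the $s$ slave-nodes of any $B_u$ reach $s$ distinct other copies.

```python
def slave_construction_1(u, v, s):
    """Partner of slave-node (u, v), 0 <= u <= s, 1 <= v <= s (assumed rule)."""
    if u >= v:
        return (v - 1, u)
    return (v, u + 1)


def slave_construction_2(u, v, s):
    """Partner of slave-node (u, v) under the maps f and g."""
    return ((u + v) % (s + 1), s + 1 - v)


def is_valid_construction(rule, s):
    """Check that rule defines a clique interconnection of B_0, ..., B_s."""
    for u in range(s + 1):
        reached = set()
        for v in range(1, s + 1):
            u2, v2 = rule(u, v, s)
            in_range = 0 <= u2 <= s and 1 <= v2 <= s
            # The partner lies in another copy and points straight back.
            if not in_range or u2 == u or rule(u2, v2, s) != (u, v):
                return False
            reached.add(u2)
        if len(reached) != s:  # every other copy is reached exactly once
            return False
    return True


if __name__ == "__main__":
    s = 48  # alpha = 4, beta = 3, h = 2
    print(is_valid_construction(slave_construction_1, s))
    print(is_valid_construction(slave_construction_2, s))
    print(slave_construction_1(0, 48, s), slave_construction_2(0, 48, s))
```

Both rules are involutions without fixed points, which is why a single function per construction suffices to find the far end of any bridge-link.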
The structure of BCN($\alpha,\beta,h,\gamma$) can be visualized in Fig. 2, where: the lighter grey denotes the construction of some HCN($n,h$); and the darker grey denotes a compound construction using a clique interconnection of some $K_{s+1}(B_0^v,B_1^v,\ldots,B_s^v)$. The superscript $v$ in $B_u^v$ denotes the row (from $\{1,2,\ldots,t\}$) that the copy of HCN($n,\gamma$) lies in and the subscript $u$ the column (from $\{0,1,\ldots,s\}$). In order to traverse a column the links of some HCN($n,h$) are used, and in order to traverse a row the links of some $K_{s+1}(B_0^v,B_1^v,\ldots,B_s^v)$ are used.

IV. ONE-TO-ONE ROUTING IN THE DCNS HCN

In this section we describe the one-to-one routing algorithm for HCN($n,h$) called FdimRouting that was derived in [8], before describing an improved one-to-one, minimal routing algorithm called NewFdimRouting. In essence, the algorithm FdimRouting is that obtained in [, Section 3.] and the algorithm NewFdimRouting is actually that obtained in [, Section 3.]. Henceforth, we regard every server-node as a master-node or a slave-node according to its origin; so, we no longer have used-nodes.

A. Routing with FdimRouting

The one-to-one routing algorithm for HCN($n,h$) from [8], named FdimRouting, proceeds as follows. Given a source node $(i_h,i_{h-1},\ldots,i_0) \in \{1,2,\ldots,\alpha\}^{h+1}$ and a destination node $(i'_h,i'_{h-1},\ldots,i'_0) \in \{1,2,\ldots,\alpha\}^{h+1}$, let $j$ be such that $i_h = i'_h, i_{h-1} = i'_{h-1}, \ldots, i_{j+1} = i'_{j+1}, i_j \neq i'_j$: if such a $j$ does not exist then the source and the destination coincide; and if $j = 0$ then the source and the destination are adjacent to the same switch-node. For the moment, we assume that source and destination nodes are master-nodes.

If $j > 0$ then a path is obtained from the source to the node $(i_h,i_{h-1},\ldots,i_{j+1},i_j,\underbrace{i'_j,\ldots,i'_j}_{j \text{ times}})$ recursively, working entirely within the canonical copy of HCN($n,j-1$) within HCN($n,h$) indexed by $(i_h,i_{h-1},\ldots,i_{j+1},i_j)$. This path is then extended by a link at level $j$ to the node $(i_h,i_{h-1},\ldots,i_{j+1},i'_j,\underbrace{i_j,\ldots,i_j}_{j \text{ times}})$. Thus, we need a path from $(\underbrace{i_j,\ldots,i_j}_{j \text{ times}})$ to $(i'_{j-1},i'_{j-2},\ldots,i'_0)$ in the canonical copy of HCN($n,j-1$) indexed by $(i_h,i_{h-1},\ldots,i_{j+1},i'_j)$, which is obtained by proceeding recursively. What we have just described is the routing algorithm for WK($n,h$) from [, Section 3.].

It was shown in Theorem 4 in [8] that FdimRouting yields a path joining any two master-nodes of HCN($n,h$) of length at most $2^{h+1}-1$. So, the length of a shortest path between any two master-nodes of HCN($n,h$) is at most $2^{h+1}-1$. Lemma. of [] yields that there exist two nodes for which a shortest path between them has length exactly $2^{h+1}-1$.

It is trivial to implement the algorithm FdimRouting as a source-routing algorithm so that it has $O(h2^h)$ time complexity (and not $O(2^h)$ as was stated in [], [8]: even writing the route takes $O(h2^h)$ time, where we assume that $n = O(1)$). Also, it is not difficult to see that FdimRouting can be implemented as a distributed-routing algorithm so that the time taken for each interim node to compute the next node on the route is $O(h)$.

B. Routing with NewFdimRouting

As was noted in [, Section 3.], the routing algorithm FdimRouting is not a minimal routing algorithm and can be improved. Consider applying the routing algorithm FdimRouting to the source node $(a,b,b)$ and the destination node $(3,b,b)$ of HCN(4,2), where $\{a,b\} = \{1,2\}$. The resulting path is:
$(a,b,b) \to (a,b,3) \to (a,3,b) \to (a,3,3) \to (3,a,a) \to (3,a,b) \to (3,b,a) \to (3,b,b)$.
However, the path
$(a,b,b) \to (b,a,a) \to (b,a,3) \to (b,3,a) \to (b,3,3) \to (3,b,b)$
is shorter.

The following algorithm, which we call GetShortest, was proven in [, Section 3.] to result in a minimal routing algorithm for WK($n,h$) (and so for HCN($n,h$) with master-nodes as the source and destination). We write $u = (u_h,u_{h-1},\ldots,u_1,u_0) \in \{1,2,\ldots,\alpha\}^{h+1}$ and $v = (v_h,v_{h-1},\ldots,v_1,v_0) \in \{1,2,\ldots,\alpha\}^{h+1}$.

Algorithm: GetShortest
    input: source $u$, destination $v$ with $u \neq v$;
    compute the length $l$ of the path obtained by executing FdimRouting with source $u$ and destination $v$;
    let $i$ be s.t. $u_i \neq v_i$ and $u_j = v_j$, for all $j \in \{h,h-1,\ldots,i+1\}$;
    for each $z \in \{1,2,\ldots,\alpha\}$ s.t. $u_i \neq z \neq v_i$:
        compute the length $l_z^1$ of the path obtained by executing FdimRouting with source $u$ and destination $(u_h,u_{h-1},\ldots,u_{i+1},u_i,\underbrace{z,\ldots,z}_{i \text{ times}})$;
        compute the length $l_z^2$ of the path obtained by executing FdimRouting with source $(v_h,v_{h-1},\ldots,v_{i+1},v_i,\underbrace{z,\ldots,z}_{i \text{ times}})$ and destination $v$;
        set $l_z = l_z^1 + l_z^2 + 2^i + 1$;
    if $l \leq \min\{l_z : u_i \neq z \neq v_i\}$:
        output 0;
    else:
        output $z$ where $l_z = \min\{l_z : u_i \neq z \neq v_i\}$;

The value output by the algorithm GetShortest yields a shortest-path algorithm that we call NewFdimRouting. If the output is 0 then the shortest path from $u$ to $v$ is obtained by executing FdimRouting on $u$ and $v$. If the output is $z \neq 0$ then the shortest path from $u$ to $v$ is obtained by executing FdimRouting:
- with source $u$ and destination $(u_h,u_{h-1},\ldots,u_{i+1},u_i,\underbrace{z,\ldots,z}_{i \text{ times}})$;
- with source $(u_h,u_{h-1},\ldots,u_{i+1},z,\underbrace{u_i,\ldots,u_i}_{i \text{ times}})$ and destination $(u_h,u_{h-1},\ldots,u_{i+1},z,\underbrace{v_i,\ldots,v_i}_{i \text{ times}})$; and
- with source $(u_h,u_{h-1},\ldots,u_{i+1},v_i,\underbrace{z,\ldots,z}_{i \text{ times}})$ and destination $v$,
before concatenating the resulting paths of nodes.
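As a concrete illustration of the two routing schemes between master-nodes, the following Python sketch builds the routes explicitly. It is not the implementation from [8]: the function names are ours, indices are tuples $(u_h,\ldots,u_0)$, the middle term in $l_z$ is taken to be $2^i+1$, and ties between the direct route and a detour are resolved in favour of the direct route (output 0).

```python
def first_difference(u, v):
    """Return (p, i): leftmost differing position p and its level i."""
    p = next(k for k in range(len(u)) if u[k] != v[k])
    return p, len(u) - 1 - p


def fdim_route(u, v):
    """Route from u to v as a list of indices (the FdimRouting scheme)."""
    if u == v:
        return [u]
    p, j = first_difference(u, v)
    if j == 0:
        return [u, v]  # same switch-node: one hop
    prefix = u[:p]
    exit_node = prefix + (u[p],) + (v[p],) * j   # corner of the source copy
    entry_node = prefix + (v[p],) + (u[p],) * j  # far end of the level-j link
    return fdim_route(u, exit_node) + fdim_route(entry_node, v)


def get_shortest(u, v, alpha):
    """Return 0 if the direct route is shortest, else the detour copy z."""
    p, i = first_difference(u, v)
    best_len, best_z = len(fdim_route(u, v)) - 1, 0
    for z in range(1, alpha + 1):
        if z in (u[p], v[p]):
            continue
        l1 = len(fdim_route(u, u[: p + 1] + (z,) * i)) - 1
        l2 = len(fdim_route(v[: p + 1] + (z,) * i, v)) - 1
        lz = l1 + l2 + 2**i + 1
        if lz < best_len:
            best_len, best_z = lz, z
    return best_z


def new_fdim_route(u, v, alpha):
    """Route from u to v following the NewFdimRouting scheme."""
    if u == v:
        return [u]
    z = get_shortest(u, v, alpha)
    if z == 0:
        return fdim_route(u, v)
    p, i = first_difference(u, v)
    prefix, a, b = u[:p], u[p], v[p]
    return (
        fdim_route(u, prefix + (a,) + (z,) * i)
        + fdim_route(prefix + (z,) + (a,) * i, prefix + (z,) + (b,) * i)
        + fdim_route(prefix + (b,) + (z,) * i, v)
    )


if __name__ == "__main__":
    src, dst = (1, 2, 2), (3, 2, 2)  # one instance in HCN(4,2)
    print(fdim_route(src, dst))          # 7 hops
    print(new_fdim_route(src, dst, 4))   # 5 hops, via the copy z = 2
```

This version recomputes sub-routes only to obtain their lengths, which is wasteful; it mirrors the algorithm as stated rather than an efficient implementation.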
One might think that FdimRouting has to be (repeatedly) executed during an execution of GetShortest. However, by [, Lemma 3.3] the following is true.

Theorem 1. Let $(z,z,\ldots,z)$ (with $h+1$ components) and $(u_h,u_{h-1},\ldots,u_1,u_0)$ be two nodes of WK($n,h$). The length of a shortest path joining these two nodes is $\sum_i 2^i$, where $i$ ranges over $\{i : i = 0,1,\ldots,h,\ u_i \neq z\}$.

Consequently, we can calculate the length of a shortest path, along with which route it takes, without computing any actual path; a simple numeric calculation suffices. Once we have this information, we can build the actual path.

Let us now return to the case when at least one of our source and destination nodes in HCN($n,h$) is a slave-node (this was left vague in [8]). W.l.o.g. suppose that our source is a slave-node. We calculate the length of a shortest path between every master-node adjacent to the same switch-node as the source and: the destination node, if the destination is a master-node; or every master-node adjacent to the same switch-node as the destination, if the destination is a slave-node. We take the resulting path of minimal length as our shortest path.
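The numeric calculation can be sketched as follows for two master-nodes. This is an illustrative Python sketch with hypothetical function names. It reads the theorem above as saying that the distance from $(u_h,\ldots,u_0)$ to the node $(z,\ldots,z)$ is the sum of $2^i$ over those $i$ with $u_i \neq z$, and it uses that formula for each leg of the direct route and of each detour.

```python
def corner_distance(u, z):
    """Distance from u = (u_k, ..., u_0) to the all-z node of the same length."""
    k = len(u) - 1
    return sum(2**i for i in range(k + 1) if u[k - i] != z)


def shortest_length(u, v, alpha):
    """Return (length, z): shortest-path length and the detour copy (0 if none)."""
    if u == v:
        return 0, 0
    p = next(k for k in range(len(u)) if u[k] != v[k])
    i = len(u) - 1 - p                   # level of the first difference
    tail_u, tail_v = u[p + 1:], v[p + 1:]
    # Direct route: reach the corner facing v's copy, cross, then descend to v.
    direct = corner_distance(tail_u, v[p]) + 1 + corner_distance(tail_v, u[p])
    best = (direct, 0)
    for z in range(1, alpha + 1):
        if z in (u[p], v[p]):
            continue
        # Detour through copy z: two corner legs plus 2^i + 1 hops in between.
        via = corner_distance(tail_u, z) + corner_distance(tail_v, z) + 2**i + 1
        if via < best[0]:
            best = (via, z)
    return best


if __name__ == "__main__":
    print(shortest_length((1, 2, 2), (3, 2, 2), 4))  # detour through copy 2
    print(shortest_length((1, 1, 1), (2, 2, 2), 4))  # direct route, length 7
```

Each call performs $O(\alpha h)$ arithmetic operations and never materialises a path. For a slave-node endpoint, one would take the minimum over the master-nodes attached to the same switch-node and add the hop to the slave-node, as described above.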

Theorem 1 assists significantly with this computation.

As we noted above, FdimRouting can be implemented as both a source-routing and a distributed-routing algorithm. This is also true for NewFdimRouting. When implemented as a source-routing algorithm, the repeated numeric computations take $O(h)$ time (recall, $n$ is assumed to be $O(1)$); so, the complexity remains at $O(h2^h)$. When implemented as a distributed-routing algorithm, as well as carrying the source and the destination within the packet header, the value $z$, output from GetShortest, must also be carried. When it is, the time taken for each interim node to compute the next node on the route remains at $O(h)$.

V. ONE-TO-ONE ROUTING IN THE DCNS BCN

In this section we describe the routing algorithms for the DCNs BCN derived in [8]. We show that these routing algorithms do not necessarily result in shortest paths, before explaining how to improve them.

A. Routing in BCN

Consider BCN($\alpha,\beta,h,\gamma$) where $h \geq \gamma$. With reference to Fig. 2, where $s = \alpha^\gamma\beta$ and $t = \alpha^{h-\gamma}$, there are two cases to consider: when $h = \gamma$ (and $t = 1$); and when $h > \gamma$. As remarked earlier, the canonical copies of HCN($n,\gamma$) in BCN($\alpha,\beta,h,\gamma$) (denoted $B_u^v$ in Fig. 2) are identified by the pairs $(u,v) \in \{0,1,\ldots,s\} \times \{1,2,\ldots,t\}$.

Suppose that $h = \gamma$. The routing algorithm BdimRouting from [8] proceeds as follows. If the source and destination both reside in $B_u$, for some $u \in \{0,1,\ldots,s\}$, then use FdimRouting within $B_u$ to find a route. If the source and destination reside in $B_u$ and $B_{u'}$, where $u \neq u'$, then: find the unique bridge-link $(x,x')$ joining a slave-node $x$ in $B_u$ to a slave-node $x'$ in $B_{u'}$; and build the route from the source to $x$ using FdimRouting within $B_u$, concatenated with the link $(x,x')$ and concatenated with the route from $x'$ to the destination built using FdimRouting in $B_{u'}$.

Suppose that $h > \gamma$. Two routing algorithms were proposed in [8]; the second is simply a symmetric version of the first and so we ignore it.
The main routing algorithm from [8] is called BdimRouting and proceeds as follows. If the source and destination both reside in $B_u$, for some $u \in \{0,1,\ldots,s\}$, then use FdimRouting within $B_u$ to find a route. If the source and destination reside in $B_u^v$ and $B_{u'}^{v'}$, where $u \neq u'$, then: find the unique bridge-link $(x,x')$ joining a slave-node $x$ in $B_u^v$ to a slave-node $x'$ in $B_{u'}^v$; and build the route from the source to $x$ using FdimRouting within $B_u^v$, concatenated with the link $(x,x')$ and concatenated with the route from $x'$ to the destination built using FdimRouting within $B_{u'}$.

Of course, we can immediately improve this algorithm by using the algorithm NewFdimRouting instead of the algorithm FdimRouting. However, irrespective of whether we use FdimRouting or NewFdimRouting, the algorithm outlined above does not necessarily yield shortest paths within BCN($\alpha,\beta,h,\gamma$). For example, consider BCN(4,3,2,2) (we adopt the nomenclature of Case (b) of Section III). We have that $s = \alpha^h\beta = 48$.

Suppose that we adopt slave-construction-1 and the source is the slave-node $(0,48) \in \{0,1,\ldots,48\} \times \{1,2,\ldots,48\}$ with the destination the slave-node $(1,48) \in \{0,1,\ldots,48\} \times \{1,2,\ldots,48\}$. According to the routing algorithm BdimRouting, we first compute the shortest path from $(0,48)$ to $(0,1)$ within $B_0$. Denoting the master-nodes of $B_0$ as $\{1,2,3,4\}^3$, the following is such a path:
$(0,48), (4,4,1), (4,1,4), (4,1,1), (1,4,4), (1,4,1), (1,1,4), (0,1)$.
We then concatenate on the link $((0,1),(1,1))$, and a shortest path from $(1,1)$ to $(1,48)$ in $B_1$. Denoting the master-nodes of $B_1$ as $\{1,2,3,4\}^3$, the following is such a shortest path:
$(1,1), (1,1,4), (1,4,1), (1,4,4), (4,1,1), (4,1,4), (4,4,1), (1,48)$.
This results in a path of length 15. However, the following is a path of length 3:
$(0,48), (48,1), (48,2), (1,48)$.

Suppose that we adopt slave-construction-2 and the source is the slave-node $(0,1) \in \{0,1,\ldots,48\} \times \{1,2,\ldots,48\}$ with the destination the slave-node $(48,48) \in \{0,1,\ldots,48\} \times \{1,2,\ldots,48\}$.
According to the algorithm BdimRouting, we first compute the shortest path from (0,) to (0,48) within B_0. Denoting the master-nodes of B_0 as {1,2,3,4}^3, the following is such a path: (0,), (,,4), (,4,), (,4,4), (4,,), (4,,4), (4,4,), (0,48). We then concatenate on the link ((0,48),(48,)) and a shortest path from (48,) to (48,48) in B_48. Denoting the master-nodes of B_48 as {1,2,3,4}^3, the following is such a shortest path: (48,), (,,4), (,4,), (,4,4), (4,,), (4,,4), (4,4,), (48,48). This results in a path of length 15 (7 hops in B_0, the bridge-link, and 7 hops in B_48). However, the following is a path of length 10: (0,), (,48), (,47), (48,), (,,4), (,4,), (,4,4), (4,,), (4,,4), (4,4,), (48,48) (here, the master-nodes are master-nodes within B_48).

B. Improved routing in BCN

The routing algorithm BdimRouting for BCN(α,β,h,γ), where h = γ or h > γ as appropriate, from [8], outlined above, is such that if the source is in B_u and the destination is in B_u', where u ≠ u', then the route derived remains entirely within B_u and B_u'. The shorter paths in the examples given above do not have this property. Our improved routing algorithm in BCN(α,β,h,γ) is as follows.

First, suppose that: h = γ; the source is in B_u; and the destination is in B_u', where u ≠ u'.

Algorithm: NewBdimRouting
  for every u'' ∈ {0,1,...,s}\{u,u'}:
    find the unique bridge-links (x,x') and (y,y') from B_u to B_u'' and from B_u'' to B_u', respectively;
    piece together shortest paths joining the source to x in B_u, x' to y in B_u'' and y' to the destination in B_u' to get the path ρ_u'' from the source to the destination;
  build the path ρ using BdimRouting;
  choose the path ρ from all these paths so that its length is minimal;

Now suppose that: h > γ; the source is in B_u^v; and the destination is in B_u'^v, where u ≠ u'.

Algorithm: NewBdimRouting
  for every u'' ∈ {0,1,...,s}\{u,u'}:
    find the unique bridge-links (x,x') and (y,y') from B_u^v to B_u''^v and from B_u''^v to B_u'^v, respectively;
    piece together shortest paths joining the source to x in B_u^v, x' to y in B_u''^v and y' to the destination in B_u'^v to get the path ρ_u'' from the source to the destination;
  build the path ρ using BdimRouting;
  choose the path ρ from all these paths so that its length is minimal;

Of course, Theorem makes the implementation of NewBdimRouting trivial. When implemented as a source-routing algorithm, and given our comments earlier as regards the implementation of NewFdimRouting, NewBdimRouting has time complexity O(h h ); for it is essentially 3 repetitions of NewFdimRouting. As regards the implementation of NewBdimRouting as a distributed-routing algorithm, again the time complexity is O(h). However, the packet header must also carry the 3 different z's corresponding to the 3 executions of NewFdimRouting as well as a parameter detailing which B_u'' NewBdimRouting transits through.

VI. AN EMPIRICAL EVALUATION

In this section we undertake an empirical evaluation so as to ascertain both the breadth and extent of the savings to be made by employing our new routing algorithms.

A. Our experiments

In what follows we describe an experiment for a particular DCN BCN(α,β,h,γ) (experiments for a DCN HCN(n,h) are analogous and more straightforward). We choose the parameters α, β, h and γ as well as the construction method (that is, slave-construction-1 or slave-construction-2). We have chosen h = 3,4, γ ≤ h and (α,β) ∈ {(7,2),(6,3),(5,4),(4,5),(3,6),(2,7)} to get practically reasonable DCN sizes. Next, we decide upon the number of iterations to be undertaken (we choose 000) and in each iteration we randomly generate a source server-node and a destination server-node (so that they are distinct; these server-nodes can be either master- or slave-nodes) before employing the algorithm BdimRouting from [8] and the algorithm NewBdimRouting so as to find a path from the source to the destination. We derive the cumulative lengths of these paths in terms of the number of (server-node to server-node) hops. We also count the number of times savings have been made.

Our results can be visualized in Figs. 3-6 for BCN (due to space limitations we do not detail the graphs for BCN(α,β,3,γ) and HCN but just report these results). In Fig. 4, for example, h = 4 and γ = 2 with n = 9. For each instantiation of (α,β) from {(7,2),(6,3),(5,4),(4,5),(3,6),(2,7)} and under slave-construction-1 and slave-construction-2: the total saving in path-length, expressed as a percentage of the total path-length given by BdimRouting, is detailed via the columns; and the total number of iterations leading to a reduction in path-length when we employ NewBdimRouting rather than BdimRouting, expressed as a percentage of the number of iterations (namely 000), is detailed via the lines. We also give the number of server-nodes in BCN(α,β,4,2).

Figure 3. Experimental results for BCN(α,β,4,1).

Figure 4. Experimental results for BCN(α,β,4,2).

B. Our evaluation

Our evaluations of our experimental results for the DCNs HCN(n,h) and BCN(α,β,h,γ) are as follows.
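Before the individual observations, the two percentages reported in the figures can be made concrete with a minimal Python sketch of the experimental procedure of Section VI-A. The sketch runs on a toy topology of our own choosing and not on HCN or BCN: there are s + 1 clusters, each a line of s + 1 nodes, and the single bridge-link between clusters u and w is assumed to join node (u,w) to node (w,u). The names intra, direct_length, improved_length and run_experiment are hypothetical. Only the idea is mirrored, namely comparing a route confined to the source and destination clusters with the best route through one intermediate cluster.

```python
import random


def intra(a, b):
    """Hops between positions a and b inside one cluster (toy: a line)."""
    return abs(a - b)


def direct_length(src, dst):
    """Baseline: the route stays within the source and destination clusters."""
    (u, a), (v, b) = src, dst
    if u == v:
        return intra(a, b)
    # Assumed toy bridge-link: node (u, v) is joined to node (v, u).
    return intra(a, v) + 1 + intra(u, b)


def improved_length(src, dst, s):
    """Also try every intermediate cluster w and keep the shortest route."""
    (u, a), (v, b) = src, dst
    best = direct_length(src, dst)
    if u == v:
        return best
    for w in range(s + 1):
        if w in (u, v):
            continue
        # source -> (u, w), bridge to (w, u), -> (w, v), bridge to (v, w), -> dst
        via = intra(a, w) + 1 + intra(u, v) + 1 + intra(w, b)
        best = min(best, via)
    return best


def run_experiment(s, iterations, seed=0):
    """Return (percentage saving in total hops, percentage of improved pairs)."""
    rng = random.Random(seed)
    nodes = [(u, j) for u in range(s + 1) for j in range(s + 1)]
    base_total = 0
    new_total = 0
    improved = 0
    for _ in range(iterations):
        src, dst = rng.sample(nodes, 2)  # two distinct server-nodes
        base = direct_length(src, dst)
        new = improved_length(src, dst, s)
        base_total += base
        new_total += new
        if new < base:
            improved += 1
    length_saving = 100.0 * (base_total - new_total) / base_total
    count_saving = 100.0 * improved / iterations
    return length_saving, count_saving
```

For instance, with s = 9 the pair (0,9), (1,9) has direct_length 18 but improved_length 3, via cluster 9. The first value returned by run_experiment corresponds to the columns of the figures and the second to the lines; the numbers it produces say nothing about BCN itself.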
• The percentage savings in terms of hop-lengths made by employing NewBdimRouting rather than BdimRouting in BCN(α,β,h,γ) are very similar in all figures in that as the value of α in (α,β) decreases, the percentage savings made increase (but only marginally); moreover, the closer γ is to h, the better the savings made. These savings can be substantial, e.g., over 0% in BCN(4,3,3,6).

• The percentage savings in terms of the number of iterations where path lengths are reduced by employing NewBdimRouting rather than BdimRouting in BCN(α,β,h,γ) are very similar in all figures in that as the value of α in (α,β) decreases, the extent to which savings are made increases, but there is a decline from (α,β) = (3,6) to (α,β) = (2,7). These savings can be substantial, e.g., over in every source-destination pairs in BCN(4,4,5,4) results in a reduced path length.

• There appears to be no advantage in using the method slave-construction-1 over slave-construction-2 in the DCNs BCN, and vice versa.

• The percentage savings in terms of hop-lengths made by employing NewFdimRouting rather than FdimRouting in HCN(n,h) are relatively modest, as are the percentage savings in terms of the number of iterations where path lengths are reduced.

In summary, there are real gains to be made in employing NewBdimRouting in BCN(α,β,h,γ) rather than BdimRouting.

C. Significance

As ever, there are trade-offs to be made in that when the value of s = α^γ β is large, there are s alternative routes in BCN(α,β,h,γ) to try within NewBdimRouting and it can be computationally expensive to execute the algorithm. However, if the actual route to be computed were to have some permanence or were to be used to transmit a significant amount of data, the transferral of which had some cost attached, then it might be worthwhile expending resource in computing a shorter path. Of course, we cannot be sure that a shorter path than that computed using BdimRouting would result but as we have seen this can be the case in over in paths in practice.

Figure 5. Experimental results for BCN(α,β,4,3).

Figure 6. Experimental results for BCN(α,β,4,4).

REFERENCES

[1] M. Al-Fares, A. Loukissas and A. Vahdat, A Scalable, Commodity Data Center Network Architecture, Proc. of ACM SIGCOMM, pp. 63-74, 2008.
[2] C.-H. Chen and D.-R. Duh, Topological Properties, Communication, and Computation on WK-recursive Networks, Networks, vol. 24, no. 6, pp. 303-317, 1994.
[3] K. Chen, C. Hu, Z. Xin, K. Zheng, Y. Chen and A.V. Vasilakos, Survey on Routing in Data Centers: Insights and Future Directions, IEEE Network, vol. 25, no. 4, pp. 6-10, 2011.
[4] G. Della Vecchia and C. Sanges, Recursively Scalable Networks for Message Passing Architectures, Proc. of Int. Conf. on Parallel Processing and Applications, pp. 33-40, 1987.
[5] A. Greenberg, J.R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D.A. Maltz, P. Patel and S. Sengupta, VL2: A Scalable and Flexible Data Center Network, ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 51-62, 2009.
[6] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang and Y. Shi, BCube: A High Performance, Server-centric Network Architecture for Modular Data Centers, ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 63-74, 2009.
[7] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang and S. Lu, DCell: A Scalable and Fault-tolerant Network Structure for Data Centers, ACM SIGCOMM Comput. Commun. Rev., vol. 38, no. 4, pp. 75-86, 2008.
[8] D. Guo, T. Chen, D. Li, M. Li, Y. Liu and G. Chen, Expandible and Cost-effective Network Structures for Data Centers using Dual-port Servers, IEEE Trans. Comput., vol. 62, no. 7, pp. 1303-1317, 2013.
[9] D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, S. Lu and J. Wu, Scalable and Cost-effective Interconnection of Datacenter Servers using Dual Server Ports, IEEE/ACM Trans. Network., vol. 19, no. 1, pp. 102-114, 2011.
[10] Y. Liu, J.K. Muppala, M. Veeraraghavan, D. Lin and J. Katz, Data Centre Networks: Topologies, Architectures and Fault-Tolerance Characteristics, Springer, 2013.
[11] R.N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya and A. Vahdat, PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric, ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 39-50, 2009.
[12] H. Wu, G. Lu, D. Li, C. Guo and Y. Zhang, MDCube: A High Performance Network Structure for Modular Data Center Interconnection, Proc. of 5th Int. Conf. on Emerging Networking Experiments and Technologies, pp. 25-36, 2009.