Fault-Tolerant Routing Algorithm for BSN-Hypercube Using Unsafety Vectors

Journal of Computational Information Systems 7:2 (2011) 623-630
Available at http://www.jofcis.com

Wenhong WEI 1, Yong LI 2
1 School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, China
2 Department of Computer Science, Dongguan University of Technology, Dongguan 523808, China

Abstract

Biswapped network (BSN) is a recently proposed network model for parallel computing. It is built of 2N copies of an N-node basic network, which may be a hypercube, a mesh, or another network. Several algorithms have been developed for it, such as basic communication operations, matrix multiplication, and parallel sorting. In this paper, we propose a fault-tolerant routing algorithm on BSN-Hypercube using unsafety vectors. We first show how each node calculates numeric unsafety vectors, and then use them to achieve efficient fault-tolerant routing. Simulation results show that the proposed algorithm achieves a significant improvement in efficiency and stability.

Keywords: Biswapped Network; Fault-routing Algorithm; Unsafety Vectors

1. Introduction

Biswapped network (BSN) is a new class of hierarchical network that is closely related to the OTIS network [1]. BSN is more regular than the OTIS network. It is built of 2N copies of an N-node basic network using a simple rule for connectivity that ensures its regularity, modularity, fault tolerance, and algorithmic efficiency. In particular, if the basic network is a Cayley digraph then so is the BSN. This gives a systematic construction of large, scalable, modular, and robust parallel architectures, while maintaining many desirable attributes of the underlying basic network that comprises its clusters.
Similar to the OTIS network, some topological properties of BSN have been investigated [2], and many algorithms on the BSN, such as basic communication operations, matrix multiplication, parallel sorting, and fault tolerance, have been developed [3-6]. In the real world, a massively parallel computer system cannot avoid having failed components, so a realistic data communication scheme should be fault tolerant. A BSN-Hypercube is a BSN whose basic network is a hypercube. Several fault-tolerant communication schemes for the hypercube have been proposed in previous work, based on either local information [7] or global information [8]. Local-information-based routing yields sub-optimal routes (if not routing failure) because the routing decisions are made on insufficient information. Global-information-based routing can achieve optimal or near-optimal routing, but often at the expense of high communication overhead to maintain up-to-date network-wide fault information. In [9], Lan presented a fault-tolerant routing algorithm based on local information that guarantees optimal or near-optimal routing. Lee and Hayes used the concept of unsafe nodes to design a fault-tolerant routing strategy [10]. Al-Sadi et al. used unsafety vectors to propose a new fault-tolerant routing algorithm for the binary n-cube [11]. Based on [11], in this paper we develop a fault-tolerant routing algorithm on BSN-Hypercube using unsafety vectors.

The paper is organized as follows. Section 2 describes the structure of BSN and some concepts used by unsafety vectors. Section 3 presents the proposed fault-tolerant routing algorithm for the BSN-Hypercube. Section 4 gives simulation results that demonstrate the effectiveness of our work.

Corresponding author. Email address: hquwwh@tom.com (Wenhong WEI). 1553-9105/ Copyright 2011 Binary Information Press, February 2011

2. Preliminaries

Definition 1. Let $\Omega$ be a graph with the vertex set $V(\Omega) = \{h_1, h_2, \ldots, h_n\}$ and the arc set $E(\Omega)$. The biswapped network $\Sigma(\Omega) = \Sigma = (V(\Sigma), E(\Sigma))$ is a graph defined as follows [1]:

$$V(\Sigma) = \{\langle g, p, 0\rangle, \langle g, p, 1\rangle \mid g, p \in V(\Omega)\}$$

$$E(\Sigma) = \{(\langle g, p_1, 0\rangle, \langle g, p_2, 0\rangle), (\langle g, p_1, 1\rangle, \langle g, p_2, 1\rangle) \mid (p_1, p_2) \in E(\Omega),\ g \in V(\Omega)\} \cup \{(\langle g, p, 0\rangle, \langle p, g, 1\rangle), (\langle g, p, 1\rangle, \langle p, g, 0\rangle) \mid g, p \in V(\Omega)\}$$

Intuitively, if we regard each copy of the basic network as a group, the definition postulates 2N groups, each group being an $\Omega$ digraph. N groups, with nodes numbered <group#, processor#, 0>, form part 0 of the bipartite graph, and N groups constitute part 1, with associated node numbers <group#, processor#, 1>. Each group p in either part of $\Sigma$ has the same internal connectivity as $\Omega$ (intra-group edges, forming the first set in the definition of $E(\Sigma)$). In addition, node g of group p in part 0/1 is connected to node p in group g of part 1/0 (inter-group or swap edges, forming the second set in the definition of $E(\Sigma)$).
The name biswapped network (BSN) arises from two defining properties of the network just introduced: when groups are viewed as super-nodes, the resulting graph of super-nodes is a complete 2N-node bipartite graph, and the inter-group links connect nodes in which the group number and the node number within the group are interchanged, or swapped. When $\Omega = H_4$ is a hypercube having 4 nodes, an example of the network $\Sigma(H_4)$ is shown in Figure 1.

Fig.1 An Example of BSN Whose Basic Network is $H_4$
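Definition 1 can be made concrete with a few lines of code. The following is a minimal sketch, not taken from the paper, that builds the adjacency map of a BSN whose basic network is an n-dimensional hypercube. The function names (`hypercube_neighbors`, `bsn_neighbors`, `build_bsn`) and the encoding of a node as a Python tuple `(g, p, part)` with integer addresses are illustrative assumptions.

```python
from itertools import product


def hypercube_neighbors(p, n):
    """Neighbors of node p in an n-dimensional hypercube (flip one address bit)."""
    return [p ^ (1 << i) for i in range(n)]


def bsn_neighbors(node, n):
    """Neighbors of <g, p, part> in a BSN whose basic network is an n-cube."""
    g, p, part = node
    # Intra-group links: same group and part, processor moves along a cube edge.
    intra = [(g, q, part) for q in hypercube_neighbors(p, n)]
    # Inter-group (swap) link: group and processor numbers are interchanged
    # and the part flips, as in the second edge set of Definition 1.
    swap = (p, g, 1 - part)
    return intra + [swap]


def build_bsn(n):
    """Adjacency map of the whole BSN-Hypercube (hypothetical helper)."""
    size = 1 << n  # N, the number of nodes in the basic network
    nodes = product(range(size), range(size), (0, 1))
    return {v: bsn_neighbors(v, n) for v in nodes}


bsn = build_bsn(2)            # basic network H_4: a 2-dimensional hypercube
print(len(bsn))               # 32, i.e. 2 * N^2 nodes for N = 4
print(bsn[(0b00, 0b01, 0)])   # [(0, 0, 0), (0, 3, 0), (1, 0, 1)]
```

Every node has n intra-group neighbors plus one swap neighbor, so its degree is n+1. Because the swap link always flips the part, the groups of part 0 are linked only to groups of part 1, which is the bipartite super-node structure described above.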

Similar to the swapped network (or OTIS), links between vertices of the same group are regarded as intra-group links, and links between vertices of two different groups follow the swapping strategy and are regarded as inter-group links.

In a hypercube, the neighbor of a node $C$ along the $i$-th dimension is denoted $C^{(i)}$. With respect to a given destination node $D$, a neighbor $C^{(i)}$ of node $C$ is called a preferred neighbor for the routing from $C$ to $D$ if the $i$-th bit of $C \oplus D$ is 1. We say in this case that $i$ is a preferred dimension. Neighbors other than preferred neighbors are called spare neighbors. Routing through a spare neighbor increases the routing distance by two over the minimum distance. An optimal path can be obtained by routing through all preferred dimensions in some order. A node $T$ is called a $(C, D)$-preferred transit node if any preferred dimension for the routing from $C$ to $T$ is also a preferred dimension for the routing from $C$ to $D$.

3. Algorithm

3.1. The Unsafety Sets Calculations

Definition 2. The first-level unsafety set $S_1^C$ of a node $C$ is defined as

$$S_1^C = \bigcup_{1 \le i \le n} f_i^C \qquad (1)$$

where $f_i^C$ is given by

$$f_i^C = \begin{cases} \{C^{(i)}\} & \text{if } C^{(i)} \text{ is faulty} \\ \phi & \text{otherwise} \end{cases} \qquad (2)$$

In the formula, $n$ is the number of dimensions of the hypercube.

Definition 3. An isolated node is associated with a first-level unsafety set containing the $n+1$ addresses of its faulty neighbors ($n$ intra-group neighbors and one swap neighbor), i.e., $|S_1^C| = n+1$.

Definition 4. If for some node $C$, $|S_1^C| = n$, then node $C$ is called a dead-end node.

Each node $C$ uses the unsafety set $S_1^C$ to determine the faulty set $F^C$, which comprises those nodes that are either faulty or unreachable from $C$ due to faulty nodes or links. This is achieved by performing $m-1$ exchanges with the reachable neighbors. After determining $F^C$, node $C$ calculates $m$ unsafety sets denoted $S_1^C, S_2^C, \ldots, S_m^C$, where $m$ is an adjustable parameter between 1 and $2n+2$.

Definition 5. The k-level unsafety set $S_k^C$, $1 \le k \le m$, for node $C$ is given by

$$S_k^C = \{B \in F^C \mid d(B, C) = k\}$$

The k-level unsafety set $S_k^C$ represents node $C$'s view of the set of nodes at distance $k$ from $C$ that are faulty or unreachable from $C$ due to faulty nodes and links. Notice that if the network is disconnected due to faulty nodes and links, $C$'s view of the unreachable nodes may not be accurate. In this case, message unreachability may occur. Figure 2 gives an outline of the Find_Unsafety_Sets algorithm that node $C$ uses to determine its faulty and unsafety sets.

Example 1. Consider a two-dimensional hypercube with 4 nodes as the basic network; the resulting BSN-Hypercube has 32 nodes. Now assume that there are 8 faulty nodes (represented as black nodes), as shown in Figure 3. Table 1 shows the corresponding first-level unsafety set, $S_1^C$, associated with each node. The Find_Unsafety_Sets algorithm calculates the sets $S_k^C$ for all $1 \le k \le m$ after calculating $F^C$. To achieve this, $(m-1)$ exchanges of fault information are performed among neighboring nodes.

    Find_Unsafety_Sets(<g_c, p_c>)  /* called by node C to determine its faulty set F^C */
    {
        S_1^C = set of faulty or unreachable immediate neighbors;
        F^C = S_1^C;
        for (k = 2; k <= m; k++) {
            for (i = 1; i <= n; i++) {
                if (p_c^(i) ∉ F^C) {
                    send F^C to p_c^(i);
                    receive F^(i) from p_c^(i);
                    F^C = F^C ∪ F^(i);
                    send F^C to <p_c, g_c>;
                    receive F^<p_c, g_c> from <p_c, g_c>;
                    F^C = F^C ∪ F^<p_c, g_c>;
                }
            }
        }
        for (k = 1; k <= m; k++)
            S_k^C = { <g_b, p_b> ∈ F^C | dist(<g_c, p_c>, <g_b, p_b>) = k }
    }

Fig.2 The Find_Unsafety_Sets Algorithm that Determines the Faulty Set for Node C

Fig.3 BSN-Hypercube with 8 Faulty Nodes

Let $m = 2n+2$ and, for the sake of a specific illustration, let us compute the unsafety sets associated with node $C$ = <00, 00, 0>. First, the node assigns the addresses of its immediate faulty neighbours to its faulty set $F^C$. Then each node performs $2n+1$ exchanges of the new elements of its faulty set with its immediate non-faulty neighbors. After determining $F^C$, node $C$ calculates $m$ unsafety sets according to the distance between node $C$ and each element of $F^C$. So the faulty set for node $C$ in our example, given in binary representation, is

$F^C$ = {<00, 01, 0>, <01, 11, 0>, <10, 10, 0>, <11, 10, 0>, <00, 00, 1>, <00, 11, 1>, <01, 01, 1>, <11, 11, 1>},

and the unsafety sets are

$S_1^C$ = {<00, 01, 0>, <00, 00, 1>},
$S_2^C$ = { },
$S_3^C$ = {<00, 11, 1>, <01, 01, 1>},
$S_4^C$ = {<10, 10, 0>},
$S_5^C$ = {<01, 11, 0>, <11, 10, 0>, <11, 11, 1>},
$S_6^C$ = { }.

Table 1 The Unsafety Sets of Nodes in BSN-Hypercube with 8 Faulty Nodes

    Node         First-level unsafety set
    <00,00,0>    {<00,01,0>, <00,00,1>}
    <00,01,0>    faulty
    <00,10,0>    {<00,01,0>}
    <00,11,0>    { }
    <01,00,0>    { }
    <01,01,0>    {<01,01,1>, <01,10,1>}
    <01,10,0>    faulty
    <01,11,0>    {<01,11,0>}
    <10,00,0>    {<10,10,0>}
    <10,01,0>    { }
    <10,10,0>    faulty
    <10,11,0>    {<10,10,0>}
    <11,00,0>    {<11,10,0>, <00,11,1>}
    <11,01,0>    { }
    <11,10,0>    faulty
    <11,11,0>    {<11,10,0>, <11,11,1>}
    <00,00,1>    faulty
    <00,01,1>    {<00,00,1>, <00,11,1>}
    <00,10,1>    {<00,00,1>, <00,11,1>}
    <00,11,1>    faulty
    <01,00,1>    {<00,01,0>, <01,01,1>}
    <01,01,1>    faulty
    <01,10,1>    { }
    <01,11,1>    {<01,01,1>}
    <10,00,1>    { }
    <10,01,1>    { }
    <10,10,1>    {<10,10,0>}
    <10,11,1>    {<11,10,0>}
    <11,00,1>    { }
    <11,01,1>    {<01,11,0>, <11,11,1>}
    <11,10,1>    {<11,11,1>}
    <11,11,1>    faulty

3.2. Fault-Tolerant Algorithm

Definition 6. For a given source node $C$ (denoted <g_c, p_c, i_1>) and destination node $D$ (denoted <g_d, p_d, i_2>) in BSN-Hypercube, we define the $(C, D)$-unsafety vector

$$U^{C,D} = (u_1^{C,D}, \ldots, u_k^{C,D}, \ldots, u_m^{C,D})$$

where its $k$-th element is given by

$$u_k^{C,D} = |\{T \in S_k^C \text{ such that } T \text{ is a } (C, D)\text{-preferred transit node}\}|$$

In other words, $u_k^{C,D}$ is the number of faulty or unreachable $(C, D)$-preferred transit nodes at distance $k$ from <g_c, p_c, i_1>.
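The k-level unsafety sets of Example 1, on which the vector elements above are based, can be reproduced with a short script. This is a minimal sketch under stated assumptions rather than the distributed Find_Unsafety_Sets procedure: it takes the final faulty set of node <00, 00, 0> as given instead of exchanging messages, and it assumes that d(B, C) is the hop distance in the fault-free BSN, computed here by breadth-first search. The helper names (`neighbors`, `distances_from`, `unsafety_sets`) are hypothetical.

```python
from collections import deque

N_DIM = 2  # Example 1 uses a two-dimensional hypercube as the basic network


def neighbors(node, n=N_DIM):
    """n intra-group hypercube neighbors plus the single swap neighbor."""
    g, p, part = node
    return [(g, p ^ (1 << i), part) for i in range(n)] + [(p, g, 1 - part)]


def distances_from(source, n=N_DIM):
    """Hop distance from source to every node of the fault-free BSN (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors(v, n):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def unsafety_sets(c, faulty_set, m, n=N_DIM):
    """Group the members of the faulty set by their distance k = 1..m from c."""
    dist = distances_from(c, n)
    return {k: {b for b in faulty_set if dist[b] == k} for k in range(1, m + 1)}


# Faulty set of node <00, 00, 0> as listed in Example 1.
F = {
    (0b00, 0b01, 0), (0b01, 0b11, 0), (0b10, 0b10, 0), (0b11, 0b10, 0),
    (0b00, 0b00, 1), (0b00, 0b11, 1), (0b01, 0b01, 1), (0b11, 0b11, 1),
}

S = unsafety_sets((0b00, 0b00, 0), F, m=2 * N_DIM + 2)
for k, members in S.items():
    print(k, sorted(members))
```

With these assumptions the script groups the eight faulty nodes into the same six sets as listed in Example 1, for instance two nodes at distance 1 and three nodes at distance 5.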
$u_k^{C,D}$ can be viewed as a measure of routing unsafety at distance $k$ from <g_c, p_c, i_1>, hence the name unsafety vector for $U^{C,D}$. The algorithm for fault-tolerant routing on BSN-Hypercube is as follows:

    Nexthop_Fault-Tolerant_Routing(<g_c, p_c, i_1>, <g_d, p_d, i_2>)
    {
        if (i_1 = 0 and i_2 = 0)
            if (g_c = g_d and p_c = p_d) return null;  // reach destination
            if (p_c = p_d)
                if (the swap neighbor of C is not faulty) return the swap neighbor of C;
                if (∃ a non-faulty neighbor C^(i) and C^(i) is not dead-end) return C^(i);
            return <Nexthop_UVH(g_c, g_d), p_c, 0>;
        if (i_1 = 1 and i_2 = 0)
            if (g_c = g_d and p_c = p_d)
                if (the node <g_c, p_c, 0> is not faulty) return <g_c, p_c, 0>;
                if (∃ a non-faulty neighbor C^(i) and C^(i) is not dead-end) return C^(i);
            if (p_c = g_d)
                if (the swap neighbor of C is not faulty) return the swap neighbor of C;
                if (∃ a non-faulty neighbor C^(i) and C^(i) is not dead-end) return C^(i);
            return <Nexthop_UVH(g_c, p_d), p_c, 1>;
        if (i_1 = 0 and i_2 = 1)
            if (g_c = g_d and p_c = p_d)
                if (the node <g_c, p_c, 1> is not faulty) return <g_c, p_c, 1>;
                if (∃ a non-faulty neighbor C^(i) and C^(i) is not dead-end) return C^(i);
            if (p_c = g_d)
                if (the swap neighbor of C is not faulty) return the swap neighbor of C;
            return <Nexthop_UVH(g_c, p_d), p_c, 0>;
        if (i_1 = 1 and i_2 = 1)
            if (g_c = g_d and p_c = p_d) return null;  // reach destination
            if (p_c = p_d)
                if (the swap neighbor of C is not faulty) return the swap neighbor of C;
                if (∃ a non-faulty neighbor C^(i) and C^(i) is not dead-end) return C^(i);
            return <Nexthop_UVH(g_c, g_d), p_c, 1>;
    }

    Nexthop_UVH()  /* Nexthop routing function with unsafety vectors in hypercube */
    {
        if (p_c = p_d) return null;
        if (∃ a preferred non-faulty neighbor C^(i) with least (C^(i), D)-unsafety vector U^(C^(i), D), and C^(i) is not dead-end)
            return C^(i);
        if (∃ a spare non-faulty neighbor C^(j) with least (C^(j), D)-unsafety vector U^(C^(j), D), and C^(j) is not dead-end)
            return C^(j);
    }

Fig.4 The Fault-tolerant Routing Algorithm

4. Experiments and Performance Evaluation

This section presents a performance evaluation of the proposed unsafety vectors approach. To this end, a simulation study has been carried out over a 512-node BSN-Hypercube, whose basic network is a 4-dimensional hypercube, with different random distributions of faulty nodes. We started with a non-faulty BSN-Hypercube and then increased the number of faulty nodes gradually, up to 70% of the network size, with random fault distribution. A total of 256 source-destination pairs were selected at each run, with sources and destinations drawn from all nodes in the network.

We compute the probability of routing failure and the average length of successful routing when different percentages of nodes become faulty in BSN-Hypercube. A percentage of faulty nodes is randomly selected before the simulation, and then these two metrics are measured by running the routing algorithm with fault tolerance. The probability that a node fails is varied from 0% to 70%, and 256 tests are run for each failure probability. Figure 5 plots the probability that the routing ends in failure as a function of the probability of node failure. Note that our tests include the cases in which the destination of a routing is faulty. This means that when the probability of node failure is 20%, about 20% of the routing failures cannot be avoided because the destination is faulty.
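The experimental setup described above can be approximated with a small Monte Carlo script. This is only a sketch under stated assumptions and not the simulator used for the reported results: it replaces the unsafety-vector routing by a breadth-first search over the surviving nodes, so the failure rate it reports is the best case that any routing algorithm could reach on the same fault pattern, and the path length is the shortest fault-free length. The function names (`all_nodes`, `shortest_path_length`, `run_trial`), the seed, and the choice to draw sources from non-faulty nodes only are assumptions.

```python
import random
from collections import deque


def neighbors(node, n):
    """n intra-group hypercube neighbors plus the single swap neighbor."""
    g, p, part = node
    return [(g, p ^ (1 << i), part) for i in range(n)] + [(p, g, 1 - part)]


def all_nodes(n):
    """All 2 * N^2 node addresses <g, p, part> for N = 2^n."""
    size = 1 << n
    return [(g, p, part) for g in range(size) for p in range(size) for part in (0, 1)]


def shortest_path_length(src, dst, faulty, n):
    """Hops on a shortest path that avoids faulty nodes, or None if none exists."""
    if src in faulty or dst in faulty:
        return None
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return dist[v]
        for w in neighbors(v, n):
            if w not in faulty and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


def run_trial(n, fail_prob, pairs, rng):
    """Return (failure rate, average length of the successful routings)."""
    nodes = all_nodes(n)
    faulty = {v for v in nodes if rng.random() < fail_prob}
    healthy = [v for v in nodes if v not in faulty]
    if not healthy:
        return 1.0, float("nan")
    failures, lengths = 0, []
    for _ in range(pairs):
        src = rng.choice(healthy)
        dst = rng.choice(nodes)  # the destination may itself be faulty
        hops = shortest_path_length(src, dst, faulty, n)
        if hops is None:
            failures += 1
        else:
            lengths.append(hops)
    average = sum(lengths) / len(lengths) if lengths else float("nan")
    return failures / pairs, average


rng = random.Random(2011)  # arbitrary seed, for repeatability only
for prob in (0.0, 0.2, 0.4, 0.7):
    rate, average = run_trial(4, prob, 256, rng)
    print(f"node failure {prob:.0%}: failure rate {rate:.2f}, average hops {average:.2f}")
```

Because destinations are drawn from all nodes, a share of the failures roughly equal to the node failure probability comes from faulty destinations alone, which is the unavoidable floor mentioned in the text.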
Even when the probability of node failure is 70%, the fault-tolerant routing algorithm is still capable of delivering messages. The routing length inevitably increases when a routing encounters a faulty node on its way to the destination. Figure 6 plots the average number of hops required for successful routing. The flat curve in this figure indicates that the routing length is rarely influenced by faulty nodes when the probability of node failure is not high; in this example, when it is no more than 40%.

Fig.5 The Probability that Routing Ends in Failure (x-axis: the probability that nodes fail, 0% to 70%; y-axis: the probability that routing fails, 0 to 1; series: Unreachability)

Fig.6 The Average Length of Successful Routing (x-axis: the probability that nodes fail, 0% to 70%; y-axis: the average length of successful routing, 5 to 15; series: Routing Length)

5. Conclusions

In this paper, we have proposed an algorithm for fault-tolerant routing on the BSN-Hypercube network based on the concept of unsafety vectors. As a first step in this algorithm, each node determines its view of the faulty set F of nodes that are either faulty or unreachable from it. Equipped with the resulting unsafety sets, each node calculates unsafety vectors and uses them to achieve fault-tolerant routing in the BSN-Hypercube. Performance results show that the new algorithm performs well in terms of routing distance and percentage of reachability, even when the probability of faulty nodes reaches 70%.

Acknowledgement

This work is supported by the Natural Science Foundation of China (No. 60973150) and the Young Natural Science Foundation of Dongguan University of Technology (No. 2010QZ21).

References

[1] Wenjun Xiao, Weidong Chen, Mingxin He, Wenhong Wei and B. Parhami. Biswapped Network and Their Topological Properties. Proceedings Eighth ACIS International Conference on Software Eng., Artific. Intelligence, Networking, and Parallel/Distributed Computing, 2007, pp.193-198.
[2] Wenhong Wei and Wenjun Xiao. Matrix Multiplication on the Biswapped-Mesh Network. Proceedings Eighth ACIS International Conference on Software Eng., Artific. Intelligence, Networking, and Parallel/Distributed Computing, 2007, pp.211-215.
[3] Wenhong Wei, Wenjun Xiao. Fault Tolerance in the Biswapped Network. The 8th International Conference on Algorithms and Architectures for Parallel Processing, 2008, 5022: 79-82.
[4] Wenhong Wei, Wenjun Xiao. Algorithms of Basic Communication Operation on the Biswapped Network. The 8th International Conference on Computational Science, 2008, 5101: 347-354.
[5] Wenhong Wei, Wenjun Xiao. Efficient Parallel Algorithm for Sorting on the Biswapped Network. Journal of Computational Information Systems, 2008, 4(4): 1365-1370.
[6] Yulian Yu, Wenhong Wei. Load balancing on the biswapped network. Proceedings of the 2nd International Conference on Intelligent Networks and Intelligent Systems, 2009, pp.146-149.
[7] M.S. Chen, K.G. Shin. Adaptive fault-tolerant routing in hypercube multicomputers. IEEE Trans. Computers, 1990, 39(12): 1406-1416.
[8] J.-P. Sheu, M.-Y. Su. A multicast algorithm for hypercube multiprocessors. Proc. Int. Conf. Parallel Processing, 1992, pp.18-22.
[9] Y. Lan. An adaptive fault-tolerant routing algorithm for hypercube multicomputers. IEEE Trans. Parallel Distributed Syst., 1998, 6(11): 1147-1152.
[10] T.C. Lee, J.P. Hayes. A fault-tolerant communication scheme for hypercube computers. IEEE Trans. Computers, 1992, 41(10): 1242-1256.
[11] J. Al-Sadi, K. Day, M. Ould-Khaoua. Unsafety vectors: A new fault-tolerant routing for the binary n-cube. Journal of Systems Architecture, 2002, 47(9): 783-793.