An Optimization Model of Load Balancing in P2P SIP Architecture

1 Kai Shuang, 2 Liying Chen
*1, First Author, Corresponding Author, Beijing University of Posts and Telecommunications, shuangk@bupt.edu.cn
2, Beijing University of Posts and Telecommunications, clybuptbjc@gmail.com

Abstract

Based on a study of existing load balancing strategies, this article presents an algorithm called AHBA (Adaptive Heterogeneous Balancing Algorithm), designed for the mobile hot spot problem, that adapts to heterogeneity in both node address space and node capacity. The strategy takes into account the heterogeneity of the nodes and the rational allocation of address space in order to reallocate the load on super nodes and thereby improve the performance of the P2P network. The AHBA algorithm can be divided into two steps: initial distribution and adjustment. It first assigns the node ID of each serving node according to the load on that node and its capacity. When the network load becomes uneven, either a new node is added to the network or the ID of an existing node is changed. The new or changed node ID is computed by the AHBA adjustment algorithm.

Keywords: P2P, Load Balancing, Node Weights, Node Identification

1. Introduction

Distributed VoIP networks are a hot topic in both academia and industry. A distributed VoIP network is decentralized, scalable and robust. It usually uses P2P as the overlay architecture and SIP as the signaling protocol. A further advantage of a distributed VoIP network is that it can solve existing problems such as the single point of failure caused by the traditional client-server architecture of SIP systems. It can also remove the performance bottleneck of that architecture.
A well-designed distributed VoIP system must gracefully handle the uncertainty and instability caused by day-to-day communication activities. These activities can produce a surge in the number of online users and different user densities in different time intervals, and therefore an unpredictable load on the VoIP system. This article proposes a load balancing algorithm, the adaptive heterogeneous network load balancing strategy AHBA (Adaptive Heterogeneous Balancing Algorithm), to eliminate the uneven sharing of load across network resources and to solve the resource allocation problem. Our main method is to allocate node identifiers according to history records, rather than randomly, when the network is first established. After the system has run for a while, the load in the network may become uneven; when it does, the algorithm adjusts node identifiers according to the real-time status of the network. Here, "the network" means a distributed VoIP network that uses a DHT algorithm in the P2P overlay.

In detail, the AHBA algorithm can be divided into two steps: initial distribution and adjustment. It first assigns the node ID of each serving node according to the load on that node and its capacity. When the network load becomes uneven, either a new node is added to the network or the ID of an existing node is changed. The new or changed node ID is computed by the AHBA adjustment algorithm.

2. Related Work

A load balancing algorithm avoids load imbalance by distributing the application load among the nodes in proportion to their capacities. A good load balancing algorithm should balance the utilization rate of each node. This article focuses on load balancing algorithms for P2P overlays that use a Distributed Hash Table (DHT).
International Journal of Digital Content Technology and its Applications (JDCTA), Volume 7, Number 11, July 2013, doi: 10.4156/jdcta.vol7.issue11.3

In a DHT system, each node obtains an independent and uniform ID from a consistent hash function or a random function. The article [1] has confirmed that
the node IDs generated in this way result in a maximum interval of $O(\log N / N)$ and a minimum interval of $O(N^{-2})$, i.e. $B_n = O(N \log N)$. To solve this imbalance in the address space carried by each node, the key is to design an improved hash ID generation algorithm. One improved hash function or ID generation mechanism uses a local detection scheme to dynamically adjust the node ID of each physical node [2-12]. When a new node is added, a consistent hash function or random function first generates a node ID; the load state of the neighboring regions is then detected and the node ID is adjusted accordingly, which achieves partial load balancing. When the system becomes unstable or the load distribution changes, the IDs of some nodes are adjusted dynamically to rebalance the load.

The article [13] proposes a mechanism to compute the node weight using a representative group of node parameters and a so-called service profile. The novelty of Node Weight Computation (NWC) is that it takes several parameters into account in the weight computation and creates the service profile by applying factorial design and simulation-based multi-objective optimization. Moreover, NWC can also be used by other clustering algorithms for node weight computation in MANETs. However, that work only focuses on the heterogeneity of node capacity and ignores the allocation of the address space, which largely influences the load distribution.

According to the previous research, most researchers focus on either the address space strategy or the heterogeneous node balancing strategy alone. However, a real P2P overlay network is uneven both in its address space and in the heterogeneity of its nodes. Besides, all of these works focus on the general properties of P2P networks without considering the specifics of the business service.
In this article, we propose an optimization model of load balancing in P2P SIP architecture that considers both node heterogeneity and address space distribution.

3. Definition Introduction in AHBA Algorithm

Before introducing the AHBA algorithm, some definitions and formulas need to be explained. At the node level, the definitions are the node capacity $C_i$, the node load $L_i$, the node utilization $U_i$, the node ideal load $P_i$ and the load deviation rate of a single node $e_i$. At the network level, the definitions are the network capacity utilization $U$ and the network load deviation $E$.

3.1. Compute node capacity Ci

User activities in a DHT system put load on every resource of a serving node, such as storage space, CPU and transmission bandwidth. This article treats every factor that affects node capacity, such as CPU speed, storage capacity, memory, I/O response rate and bandwidth, as a resource. The node capacity $C_i$ represents the maximum carrying capacity of node $i$. This article establishes a dynamic weighted capacity assessment model to obtain the node capacity $C_i$ [13-15]:

$$C_i = P(\mathrm{CPU})\,w_1 + P(\mathrm{Space})\,w_2 + P(\mathrm{Memory})\,w_3 + P(\mathrm{IO})\,w_4 + P(\mathrm{BandWidth})\,w_5$$

In this formula, the capacity of each serving node is determined by its resources, such as CPU capacity, transmission bandwidth and memory capacity, each weighted by how much it influences the capacity. The weights $w_1, \dots, w_5$ are the weights of the individual factors, and $P(X)$ is the measured ability of resource $X$. Different application systems configure the factor weights differently. For example, a server cluster places high I/O and bandwidth requirements on each node, so the weights of I/O and bandwidth can be set to larger values. An empirical model can also be used to set the weights.
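As an illustration, the weighted capacity model above can be sketched in Python. This is a minimal sketch rather than the authors' implementation: the resource keys, the weight values and the resource scores are hypothetical, since the paper does not give concrete numbers.

```python
from typing import Mapping

# Hypothetical resource keys standing for CPU, Space, Memory, IO and BandWidth.
RESOURCES = ("cpu", "space", "memory", "io", "bandwidth")


def node_capacity(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return C_i as the sum over resources X of P(X) * w_X."""
    missing = [r for r in RESOURCES if r not in scores or r not in weights]
    if missing:
        raise KeyError(f"missing resources: {missing}")
    return sum(scores[r] * weights[r] for r in RESOURCES)


# Assumed I/O- and bandwidth-heavy weight profile, as in the server-cluster example.
weights = {"cpu": 0.1, "space": 0.1, "memory": 0.1, "io": 0.35, "bandwidth": 0.35}
# Assumed measured abilities P(X) of one node.
scores = {"cpu": 40.0, "space": 20.0, "memory": 30.0, "io": 50.0, "bandwidth": 10.0}

capacity = node_capacity(scores, weights)
print(f"C_i = {capacity:.2f}")
```

Raising the I/O and bandwidth weights makes nodes with strong I/O and bandwidth scores count as higher-capacity nodes, which is the effect the server-cluster example describes.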
By collecting physical parameters and historical data, we can obtain the correspondence between each resource and the capacity it provides. From this correspondence table, we obtain the final node capacity $C_i$, which reflects the maximum number of objects that the node can serve.

3.2. Compute the node load Li
In a distributed network based on a DHT, the load of a node includes the occupied storage space, the CPU overhead of processing tasks, and the bandwidth consumed by data transfer and application-layer routing maintenance. In general, these loads depend on the number of objects that the node serves, and the relationship is usually linear:

$$L_i = N_i \cdot Q$$

$L_i$ is the load of node $i$; $N_i$ is the number of objects that node $i$ serves; $Q$ is a constant based on historical experience.

3.3. Compute the node capacity utilization Ui

The node capacity utilization $U_i$ is the ratio of the node load to the node capacity. For node $i$:

$$U_i = \frac{L_i}{C_i}$$

3.4. Compute the network capacity utilization U

The capacity utilization of the system is the ratio of the total load of the system to the total capacity of the system:

$$U = \frac{\sum_i L_i}{\sum_i C_i}$$

3.5. Compute node ideal load Pi

The "ideal load" of a single super node (SN) is the product of the system capacity utilization and the node capacity:

$$P_i = U \cdot C_i$$

3.6. Calculate the rate of deviation of a single node load ei

The load deviation rate $e_i$ of node $i$ is the ratio of the difference between the node's ideal load and its actual load to the node capacity:

$$e_i = \frac{P_i - L_i}{C_i}$$

3.7. The network load deviation E

The network load deviation $E$ is defined as the sum of the deviation rates of all nodes:

$$E = \sum_i e_i$$

4. AHBA Algorithm Introduction

The AHBA algorithm addresses the uncertain load caused by user activities by applying an ID manipulation strategy. Generally speaking, the AHBA algorithm first assigns the node ID of each serving node according to the load on that node and its capacity. In this scenario, each physical node initially runs a single ID.
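The quantities defined in Section 3 are what these decisions are based on. A minimal Python sketch of them follows, assuming the formulas read $L_i = N_i Q$, $U_i = L_i / C_i$, $U = \sum_i L_i / \sum_i C_i$, $P_i = U C_i$, $e_i = (P_i - L_i) / C_i$ and $E = \sum_i e_i$. The `Node` class, the function names and the sample values are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Union


@dataclass
class Node:
    capacity: float  # C_i, maximum carrying capacity
    objects: int     # N_i, number of objects the node serves


def node_load(node: Node, q: float) -> float:
    """L_i = N_i * Q, with Q a constant taken from historical experience."""
    return node.objects * q


def network_metrics(nodes: Sequence[Node], q: float) -> Dict[str, Union[float, List[float]]]:
    """Compute U, U_i, P_i, e_i and E for a list of nodes."""
    loads = [node_load(n, q) for n in nodes]
    total_capacity = sum(n.capacity for n in nodes)
    u = sum(loads) / total_capacity                      # network utilization U
    util = [l / n.capacity for l, n in zip(loads, nodes)]  # U_i
    ideal = [u * n.capacity for n in nodes]               # P_i
    dev = [(p - l) / n.capacity for p, l, n in zip(ideal, loads, nodes)]  # e_i
    return {"U": u, "U_i": util, "P_i": ideal, "e_i": dev, "E": sum(dev)}


# Assumed sample: three heterogeneous nodes and Q = 1.
nodes = [Node(capacity=10, objects=8), Node(capacity=20, objects=10), Node(capacity=50, objects=10)]
metrics = network_metrics(nodes, q=1.0)
print(metrics)
```

Under this reading, $e_i = U - U_i$, so a node that is loaded above the network average has a negative deviation rate and a node that is loaded below the average has a positive one.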
When the network load becomes uneven, either a new node is added to the network or the ID of an existing node is changed. The new or changed node ID is computed by the AHBA adjustment algorithm. In detail, the AHBA algorithm can be divided into three parts:
i. Distribute the initial node IDs based on the history records of the traditional communication system;
ii. If part of the network is overloaded, apply the adjustment algorithm. In this step, either a new node is added or the ID of an existing node is changed;
iii. If the network as a whole is overloaded, add new nodes using the expansion algorithm.

Figure 1 (AHBA Algorithm Flow Process Chart) outlines the AHBA process. First, the AHBA algorithm runs the initial distribution module to assign node IDs to the super nodes. When the network utilization rate $U$ is above threshold low_network (the low threshold level for the network) and below threshold high_network (the high threshold level for the network), the AHBA algorithm runs the adjustment module to adjust some node IDs and balance the load across the whole network. When $U$ is above threshold high_network, the AHBA algorithm runs the expansion module to add new nodes to the original overlay and expand the network scale. The parameters threshold low_network and threshold high_network can be set according to past experience and should vary from system to system.

Figure 1. AHBA Algorithm Flow Process Chart

4.1. Initial Distribution

Based on our survey, the traffic in a VoIP system is quite similar to the traffic in a traditional communication system. Accordingly, the initial distribution of the server nodes should not be random. In the AHBA algorithm, we first analyze the traffic in the original communication system and treat the initial load of a node in the VoIP system as a constant ratio of the original one. The initial ID of node $i$ is therefore calculated as follows:

$$Id_i = Id_{i-1} + \frac{K_i}{\sum_j K_j} \cdot S$$

$Id_i$ is the node ID of node $i$; $K_i$ is the traditional traffic of the users in the same region as node $i$; $S$ is the whole address space of the network.
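The initial distribution can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' code: it assumes the initial-ID rule reads $Id_i = Id_{i-1} + (K_i / \sum_j K_j) \cdot S$, that the ring starts at $Id_0 = 0$, and that IDs are rounded to integers and wrapped modulo $S$. The function name and the traffic values are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence


def initial_ids(traffic: Sequence[float], space: int) -> List[int]:
    """Assign node IDs so that node i ends an arc proportional to K_i.

    traffic[i] is K_i, the historical traffic of the region of node i;
    space is S, the size of the ID space.
    Id_0 = 0 is an assumption; the paper does not state the starting point.
    """
    total = sum(traffic)
    if total <= 0:
        raise ValueError("total traffic must be positive")
    # The running sum of K_1..K_i, scaled to the ring, gives Id_i directly.
    return [round(cumulative * space / total) % space for cumulative in accumulate(traffic)]


# Assumed sample: four regions with unequal historical traffic, S = 2^16.
ids = initial_ids([10, 30, 20, 40], 2 ** 16)
print(ids)  # [6554, 26214, 39322, 0]
```

In a Chord-style ring, a node is responsible for the keys between its predecessor's ID and its own, so with these IDs node $i$ covers a share of the address space proportional to $K_i$. The last node wraps to ID 0 because its arc closes the ring.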
4.2. Adjust and Expansion Algorithm

If the network utilization rate $U$ is above threshold low_network (the low threshold level for the network) and below threshold high_network (the high threshold level for the network), but the utilization rate $U_i$ of some node is above threshold high_node (the high threshold per node), the network should be adjusted. The adjustment algorithm consists of the following steps:

i. Order the nodes in the network according to the node deviation $e_i$;
ii. Execute the following steps until no node's utilization rate is above threshold high_node:
   a) If the utilization rate of the previous node i-1 is below threshold low_network, assign a new ID to node i according to the following formula, where $Id_i$ is the ID of node i, $Id_{i-1}$ is the ID of node i-1, and $P_i$ is the ideal load of node i:

      $$Id_i = Id_{i-1} + P_i$$

   b) If the utilization rate of the previous node i-1 is above threshold low_network and the utilization rate of node i-2 is also above threshold low_network, activate a new serving node:
      i. Recalculate the network utilization rate $U$;
      ii. Recalculate the ideal load $P_i$ for node i and $P_{i-1}$ for node i-1;
      iii. Recalculate the ideal load $P_{new}$ for the newly added node;
      iv. Recalculate the IDs of node i, node i-1 and the new node:

      $$Id_i = Id_{i-1} + P_i$$

If the network utilization rate $U$ is above threshold high_network:

i. Order the nodes in the network according to the node deviation $e_i$;
ii. Execute the following steps until no node's utilization rate is above threshold high_node:
   a) If the utilization rate of the previous node i-1 is above threshold low_network and the utilization rate of node i-2 is also above threshold low_network, activate a new serving node:
      i. Recalculate the network utilization rate $U$;
      ii. Recalculate the ideal load $P_i$ for node i and $P_{i-1}$ for node i-1;
      iii. Recalculate the ideal load $P_{new}$ for the newly added node;
      iv. Recalculate the IDs of node i, node i-1 and the new node:

      $$Id_i = Id_{i-1} + P_i$$

4.
Simulation experiments and results

In this architecture, the underlying overlay uses the Chord protocol as the P2P algorithm. In the simulation experiment, the resources and traffic simulate a real situation. The simulation parameters are listed in Table 1 (Simulation Experiment Configuration). In the real world, the network scale is usually very large, in the thousands or tens of thousands of nodes; in this simulation we choose one thousand. In a real deployment, the data transfer cost of executing the AHBA algorithm would also be very large. For this reason, this article limits its aim to comparing the traditional algorithm and the AHBA algorithm in terms of the balancing effect.

Table 1. Simulation Experiment Configuration

Parameters    Description          Value
L             The space for ID     $2^{16}$
N             Network scale        $10^3$
Cmax          Node Max Capacity    50
C             Node Capacity        [1,50]

This article compares the AHBA algorithm and the traditional random distribution algorithm based on the probability distribution of the degree of network load balance. According to Figure 2 (Comparison between AHBA Algorithm and Random Algorithm), the AHBA algorithm largely improves the degree of balance in the VoIP network. As the network scale expands, the network load deviation of the AHBA Algorithm increases much more slowly than that of the Random ID Generation Algorithm. In the figure, the horizontal axis represents the network scale and the vertical axis represents the network load deviation; the solid line represents the AHBA Algorithm and the dotted line represents the Random ID Generation Algorithm. The network load deviations of the Random ID Generation Algorithm and the AHBA Algorithm do not differ much when the network scale is below 150. However, the network load deviation of the Random ID Generation Algorithm largely exceeds that of the AHBA Algorithm when the network scale is beyond 197. That means the AHBA Algorithm balances the network much better than the Random ID Generation Algorithm when the network scale is over 197.

Figure 2. Comparison between AHBA Algorithm and Random Algorithm (horizontal axis: network scale; vertical axis: network load deviation)

5. Conclusions

This paper proposes a load balancing algorithm for integrated P2P SIP VoIP networks, mainly to solve the mobile hot spot problem caused by user behavior. The algorithm combines load balancing based on the heterogeneous characteristics of the network nodes with load balancing based on the allocation of the network address space. The AHBA algorithm first assigns the node ID of each serving node according to the load on that node and its capacity. In this scenario, each physical node initially runs a single ID.
When the network load becomes uneven, either a new node is added to the network or the ID of an existing node is changed. The new or changed node ID is computed by the AHBA adjustment algorithm. In future work, the authors will study and measure the cost of migration in the network, in order to find a balance between the costs and benefits of load balancing and to give network managers direct guidance.

6. Acknowledgements

This work was supported by the Important National Science & Technology Specific Projects: Next-generation broadband wireless mobile communications network (2011zx03002-002-01), the Innovative Research Groups of the National Natural Science Foundation of China (61121061), and the National Key Basic Research Program of China (973 Program) (2009CB320504).
7. Reference

[1] Jian Liang, Naoum Naoumov, Keith W, "The Index Poisoning Attack in P2P File Sharing Systems", IEEE INFOCOM, vol.6, 2006.
[2] B. Mitra, S. Ghose, N. Ganguly, "Effect of dynamicity on peer to peer networks", Proceedings of the 14th international conference on High performance computing, pp.452-463, 2007.
[3] B. Mitra, S. Ghose, N. Ganguly, "Stability analysis of peer-to-peer networks against churn", Pramana, vol.71, no.2, pp.263-273, 2008.
[4] S. Marti, H. Garcia-Molina, "Taxonomy of trust: Categorizing P2P reputation systems", Computer Networks, vol.50, no.4, pp.472-484, 2006.
[5] Godfrey B, Lakshminarayanan K, "Load balancing in dynamic structured P2P systems", IEEE INFOCOM, vol.04, pp.2253-2262, 2004.
[6] Wang Bin, Shen Qingguo, "ID management and allocation algorithm for P2P load balancing", Communication Technology (ICCT), pp.1232-1235, 2010.
[7] Cheng Jun, "Research of Load Balancing Algorithm in DHT Based P2P Systems", Internet Technology and Applications, pp.1-4, 2010.
[8] Atsushi Takeda, Takuma Oide, "Simple dynamic load balancing mechanism for structured P2P network and its evaluation", International Journal of Grid and Utility Computing, pp.126-135, 2012.
[9] G.T. Chavan, M.A. Mahajan, "Load Balancing in P2P networks using DHT based systems and Ant based systems: A Comparison", International Journal of Engineering Research and Applications (IJERA), vol.3, pp.356-359, 2013.
[10] Zhang Bingquan, "An optimization model of load balancing in Peer to Peer (P2P) Network", Computer Science and Service System (CSSS), pp.2064-2067, 2011.
[11] Wang Bin, Shen Qing-guo, "An Effective Algorithm for Hierarchical P2P Load Balancing", JCIT, vol.6, no.5, pp.231-236, 2011.
[12] Wei Mi, Chunhong Zhang, Xiaofeng Qiu, "Ant-based Load Balancing Algorithm in Structured P2P Systems", JCIT, vol.7, no.6, pp.332-340, 2012.
[13] Karoly Farkas, Theus Hossmann, "Node Weight Computation in MANETs", Computer Communications and Networks, 2007.
[14] Ong, Ivy, "Dynamic Load Balancing and Network Monitoring in iata Protocol for Mobile Appliances", Multimedia and Ubiquitous Engineering, pp.1-2, 2010.
[15] Xiangbin Yan, Li Zhai, "C-index: A weighted network node centrality measure for collaboration competence", Journal of Informetrics, pp.223-239, 2013.