A QoS-driven Resource Allocation Algorithm with Load Balancing for Device Management

1 Lanlan Rui, 2 Yi Zhou, 3 Shaoyong Guo
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
llrui@bupt.edu.cn, yizhou.work@gmail.com, syguo@bupt.edu.cn

Abstract

Currently, load balancing is one of the core issues in server performance improvement. However, previous studies have paid little attention to resource allocation for Device Management (DM) servers, which must process large quantities of DM commands simultaneously. In this paper, we propose a novel resource allocation algorithm that combines load balancing and service scheduling with admission control in order to achieve optimized resource allocation for the DM server. Integrating the characteristics of the DM service and the DM server, our algorithm is driven jointly by service QoS and server load. In addition, a new mathematical model is proposed to measure the load of a DM server. The simulation results show that our algorithm can improve service QoS and achieve effective optimization of resource allocation.

Keywords: Service Scheduling, Load Balancing, Resource Allocation, Device Management

1. Introduction

With the rapid development and enrichment of mobile communication services, the requirements that mobile services place on device capacity are continually increasing. Mobile devices, as the carriers of these services, continue to develop at high speed and have become an indispensable part of operators' service systems. The concept of device management was proposed to meet this challenge. Device management pushes the development of network management in both a user-oriented and a service-oriented direction. At present, device management has become a hot topic in network management research. By defining an open service framework [1], OMA [2] achieves interconnected service for different devices, and all kinds of services can co-exist within the OMA framework.
For mobile services and terminal devices, both of which are developing rapidly in quantity and variety, the processing capacity of the DM server is increasingly important. The capacity of a DM server depends on hardware performance and on the deployment of the DM servers. However, OMA does not provide a solution for the deployment of DM servers, so how to achieve optimized deployment of DM servers becomes a key issue for device management. Most traditional solutions apply load balancing algorithms to improve the utilization of every server in the system. Much related research addresses load balancing on web servers, combining the characteristics of the web with load balancing algorithms, but it overlooks DM servers. Unlike previous research, we aim to propose a new load balancing algorithm that is closely combined with OMA DM. Furthermore, it is urgent to distinguish priority levels among mobile services, but network-level QoS mechanisms cannot solve the problem of QoS control in device management. As an integral part of device management, the DM server should support QoS mechanisms and policies. Therefore, how to optimize the allocation of server resources and provide QoS-based services becomes another key issue for device management. The purpose of traditional load balancing strategies [3] is to achieve a balanced load allocation across servers; however, they ignore the impact of different service characteristics on load distribution. Thus, we propose a new resource allocation algorithm that is driven jointly by service QoS and DM server load. The remainder of this paper is organized as follows. Section 2 presents related work. Section 3 proposes our resource allocation algorithm and its architecture, and then describes the service scheduling algorithm and the load balancing algorithm separately. Section 4 describes the experimental results. Finally, concluding remarks are drawn in Section 5.
Advances in Information Sciences and Service Sciences (AISS), Volume 4, Number 9, May 2012. doi: 10.4156/AISS.vol4.issue9.18
2. Background

2.1. OMA DM

The OMA DM specifications are an important standard released by OMA to achieve mobile device management. They define how to organize management information in the form of a DM tree [4] and how to manage devices remotely through OMA DM. The specifications focus on the service framework, workflow and data organization. Unfortunately, during the continuous improvement of OMA DM, OMA has ignored potential deployment problems such as resource allocation. According to OMA DM, a management session consists of a number of DM commands. Note that different DM commands can result in different resource consumption on the DM server: Alert, for example, may consume little resource, while Add may consume much more. As a consequence, the load of a DM server changes dynamically, and we take this into consideration when handling load allocation. Based on the above analysis, we divide DM commands into two types: High-Cost Commands and Low-Cost Commands. High-Cost Commands comprise Add, Replace, Atomic, Result and Sequence, because commands of this kind either need to transport Management Objects (MOs) or execute through a set of commands. Low-Cost Commands, including Get, Alert, Delete, Exec and Copy, need not transport MOs and therefore consume less resource.

2.2. Load Balancing Algorithm

Numerous dispatching algorithms [5, 6, 7] have been proposed for server clusters. We can classify them into static and dynamic algorithms [8, 9]. Static algorithms fail to consider real-time load status information. In contrast, dynamic algorithms monitor load information and dispatch tasks based on real-time load information; they are therefore more complex than static algorithms, but provide better load balancing performance. Much research has been done on load balancing algorithms. Round-Robin [10] is a simple and frequently used load balancing algorithm, but it is non-adaptive.
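As a minimal illustration (the server and request names here are hypothetical, not part of OMA DM), the fixed-order dispatch of Round-Robin can be sketched in Python:

```python
from itertools import cycle

class RoundRobinDispatcher:
    """Non-adaptive dispatcher: requests are assigned to servers in a
    fixed rotation, with no feedback about each server's real-time load."""

    def __init__(self, servers):
        self._ring = cycle(servers)

    def dispatch(self, request):
        # The request content is ignored: the next server in the fixed
        # order is chosen regardless of its current load.
        return next(self._ring)

dispatcher = RoundRobinDispatcher(["dm1", "dm2", "dm3"])
print([dispatcher.dispatch(f"req{i}") for i in range(6)])
```

Because the rotation never consults load information, two heavy requests can land on the same server back to back, which is exactly the weakness that dynamic algorithms address.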
Servers handle requests from clients in turn, allocating service resources to users in a fixed order. Least-Loaded [11] is another frequently used dynamic load balancing algorithm: one server handles every request from clients until its load reaches a threshold; once the threshold is reached, the server with the minimum load takes over handling requests, forming a cycle. Ant colony optimization [12, 13] has recently become an effective method for load balancing; it typically defines several kinds of ants with different behaviors to simulate the working process of the servers. Although there is much research on load balancing, little attention has been devoted to a load balancing algorithm designed specifically for device management servers. This paper extends the load balancing algorithm by integrating the characteristics of device management.

2.3. Load Descriptor

Almost all dynamic load balancing algorithms evaluate server load through periodic sampling. An important topic is how to build a mathematical model to evaluate the load status of the server. The load descriptor is the metric that embodies this model. Many load balancing algorithms collect hardware usage as the load descriptor, such as CPU usage, memory usage and the number of network connections [14, 15]. These parameters directly describe the usage of a server and can be applied to measure a general-purpose server. However, such a description of server load does not consider the impact of DM services and DM sessions. Consequently, a dispatching design based only on direct resource measurements could be risky. In order to avoid this risk, we take the characteristics of DM and DM services into consideration and put forward a suitable method to describe the load of a DM server.

(This work was supported by the Funds for Creative Research Groups of China (60821001), NSFC (60973108, 60802035), National S&T Major Project (2011ZX03005-004-02) and Chinese Universities Scientific Fund (BUPT2009RC0504).)

2.4. Queuing Models

Queuing models were originally used to estimate system behavior for static design and capacity planning. Early work on queuing networks focuses on providing an analytical scheme for capacity planning at design time, while ignoring its use for load balancing and service scheduling. Through a combination of Markov chains and queuing models [16], we can easily analyze the remaining resource of a system. We first use a queuing model to estimate the remaining capacity of a DM server, and then build the data model that measures the load of the DM servers with the help of the queuing model. Traditional network modeling and analysis often uses the memoryless Markov model. At present, however, a large number of research results indicate that user request arrivals exhibit long-range dependence (LRD), self-similarity and fractal behavior, with heavy-tailed distributions. A typical heavy-tailed distribution used to model Internet services is the Pareto distribution.

Definition 1: Pareto distribution
Cumulative distribution function: F(x) = P(X < x) = 1 - (p/x)^α, x ≥ p (1)
Probability density function: f(x) = α·p^α / x^(α+1), x ≥ p (2)

3. Overview

In this section, we propose a new resource allocation algorithm for device management servers, called QDRA.

3.1. Architecture

Currently, both centralized and distributed frameworks are commonly used. The advantage of the centralized framework is that the load of each server can easily be summarized, so the central server can achieve uniform resource allocation. The disadvantage is that the central server must have high performance, and the whole system is affected once an error occurs on it. In the distributed framework, it is convenient to transfer load from one server to another in real time.
However, the network structure can become very complicated, and it brings extra overhead to every server in the system. We combine the advantages of both frameworks and propose a hybrid framework to achieve service scheduling and load balancing. As shown in Figure 1, the architecture of QDRA is a hybrid framework.
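As a rough sketch of this hybrid arrangement (the class and method names are our own, and server selection is simplified to a least-loaded rule), the centralized side can be modeled as:

```python
from collections import deque

class Balancer:
    """Simplified central Balancer: per-QoS-class request queues plus a
    table of server loads filled in by periodic monitor reports."""

    def __init__(self, server_ids, num_classes=3):
        self.queues = [deque() for _ in range(num_classes)]
        self.loads = {sid: 0.0 for sid in server_ids}

    def classify(self, request, qos_class):
        # Centralized part: every request first enters a class queue.
        self.queues[qos_class].append(request)

    def report_load(self, server_id, load):
        # Called periodically by each DM server's monitor module.
        self.loads[server_id] = load

    def dispatch(self):
        # Take the first waiting request (highest class first) and send
        # it to the currently least-loaded DM server.
        for queue in self.queues:
            if queue:
                request = queue.popleft()
                target = min(self.loads, key=self.loads.get)
                return request, target
        return None

balancer = Balancer(["dm1", "dm2", "dm3"])
balancer.report_load("dm1", 0.8)
balancer.report_load("dm2", 0.2)
balancer.report_load("dm3", 0.5)
balancer.classify("session-start", qos_class=0)
print(balancer.dispatch())  # ('session-start', 'dm2')
```

The distributed side, in which servers exchange load information and commands among themselves, would sit alongside this; that division is why the hybrid form avoids making the Balancer a single point of fine-grained control.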
Figure 1. Architecture of QDRA

The Balancer is the central server for the DM servers, forming the centralized part of the framework. Terminal devices send service requests to the Balancer, which classifies them and puts them into the corresponding request queue. Based on our service scheduling algorithm, the Balancer decides whether each request should be handled. At the same time, the Balancer monitors the load information of the DM servers in real time in order to decide, based on our load balancing algorithm, which DM server should handle a request. Using the distributed part of the architecture, the DM servers can exchange load information and user requests among themselves.

3.2. Algorithm Description

In order to describe the algorithm, the specific design is shown in Figure 2.

Figure 2. Components in QDRA

When a service request from a device arrives, the service classification module in the Balancer classifies the request according to its service QoS requirement and pushes it into the corresponding request queue. We divide the classification mechanism into two types: user-based and service-based. The user-based method classifies requests by properties of the device user; by setting up priority user groups, it can provide better QoS guarantees. The service-based method classifies requests by features of the service; because different services require different amounts of resource from the DM server, this method can also improve QoS guarantees. After pushing service requests into the queues, the decision module executes admission control for every request queue in turn. Based on the load information of the DM servers, admission control decides whether the request from a queue should be handled or should wait. The workflow described above constitutes the service scheduling algorithm in QDRA. Next, we introduce the load balancing algorithm in QDRA. The decision module also collects the load information of every DM server, so that it can select a suitable DM server to handle each request after admission control. The monitor module in each DM server
is responsible for measuring load information and reporting it to the decision module periodically. The decision module therefore obtains real-time load information for every DM server, and based on it the Balancer can select the most suitable DM server for a service request and send the request to that server. In addition, load information changes dynamically; in order to keep the load balanced, DM commands can be transferred from one DM server to another based on the load information of the others. The transfer module in each DM server is responsible for exchanging load information and transferring DM commands. From the above, it is clear that admission control and the load descriptor are the keys to QDRA; they are introduced in detail below.

3.3. Load Descriptor

The load descriptor is the mathematical model used to measure the load of a DM server. We divide load information into two parts: remaining resource and load trend. Remaining resource means the remaining capacity of the DM server; load trend means a prediction of how the server load will change. We first introduce the method used to measure the load trend of every DM server. Each DM server maintains two command queues: Q_H is used to hold High-Cost Commands, and, since Low-Cost Commands (Get, Alert, Delete, Exec and Copy) need not transport MOs and consume less resource, Q_L is used to hold Low-Cost Commands. Based on the above analysis, we define:

T = N_L / N_H (3)

where N_H refers to the number of High-Cost Commands in Q_H and N_L refers to the number of Low-Cost Commands in Q_L. Because a DM session consists of many DM commands, that is, a combination of High-Cost Commands and Low-Cost Commands, the two types of commands are used alternately within a session. Given this feature, we assume that the number of clients is fixed.
When T is high, the number of Low-Cost Commands in Q_L is high, which means that in the short term the load is likely to rise; when T is low, the number of High-Cost Commands in Q_H is high, which means that in the short term the load is likely to decline. We therefore use

LT = T = N_L / N_H (4)

as the load trend; it indicates the direction of load change in the short term. In order to quantify the remaining resource of a DM server, we assume that request service times follow a bounded Pareto distribution on [p, q]. The mean and second moment of this distribution are:

E[X] = (α·p / (α - 1)) · (1 - (p/q)^(α-1)) / (1 - (p/q)^α), α ≠ 1; E[X] = (p·q / (q - p)) · ln(q/p), α = 1 (5)
E[X²] = (α·p² / (α - 2)) · (1 - (p/q)^(α-2)) / (1 - (p/q)^α), α ≠ 2 (6)

where α is the shape parameter, p is the shortest possible service time and q is the longest possible service time, so p ≤ X ≤ q. Based on the Pollaczek-Khinchine formula, the expected response time R is:

R = E[X] + w = E[X] + λ·E[X²] / (2(1 - λ·E[X])) (7)

where λ is the request arrival rate and w is the queuing delay of a request. According to this formula, the expected response time of our system is:
R_ij = E[X_ij] + λ_ij·E[X_ij²] / (2(1 - λ_ij·E[X_ij])) (8)

where i is the index of the DM server and j (1 or 2) is the index of the command queue on that server. The values of λ_ij, E[X_ij] and E[X_ij²] can be obtained from the DM server. If we set the maximum allowable response time of each queue as R_ij^max, we can calculate the maximum allowable arrival rate λ_ij^max from (8). Then we have the remaining capacity of every queue, and of every DM server:

RC_ij = 1 - λ_ij / λ_ij^max, 1 ≤ j ≤ 2; RC_i = RC_i1 + RC_i2 (9)

3.4. Admission Control

Admission control means that the decision module decides whether a request should be handled. When the system is lightly loaded, the throughput of the system is proportional to the system load; when the system is heavily loaded, however, throughput drops rapidly and the system is unable to meet QoS requirements. It is therefore necessary to use admission control to defer service requests when the system load is high; in this way the throughput is guaranteed and QoS is improved. Every request queue k is assigned a weight W_k and a counter C_k. W_k indicates how many requests from queue k may be sent to DM servers in one round of request handling. At the beginning of a round, C_k is set to W_k; each time a request from queue k is sent to a DM server, C_k is decreased by one. Within a round, a request is taken from a random queue whose counter is not zero, and the decision module decides whether to send it based on the handling probability P_k, defined below. For every request queue k we set a minimum threshold of remaining capacity RC_k^min, meaning that when RC is lower than RC_k^min the DM server is overloaded for this queue, and a maximum threshold RC_k^max, meaning that when RC is higher than RC_k^max the DM server is lightly loaded for this queue. For an arriving request there are three cases: if RC is higher than RC_k^max, P_k is 1, which means the request should be accepted because the system is lightly loaded.
If RC is lower than RC_k^min, P_k is 0, which means the request should be rejected because the system is overloaded. If RC lies between RC_k^min and RC_k^max, we calculate:

P_k = (RC - RC_k^min) / (RC_k^max - RC_k^min) (10)

For different request queues, we set different RC_k^min and RC_k^max based on their QoS requirements.

4. Evaluation

Because of the limitations of our experimental conditions, it is difficult to deploy the resource allocation algorithm on real DM servers, so we use the network simulator ns-2 to simulate the algorithm and test its performance. In ns-2 we build the server topology based on Figure 2, with five DM servers connected to the Balancer, and we modify the queue management algorithm (Queue/RED) according to our resource allocation algorithm. We design two types of simulation: one for service scheduling and one for load balancing. In order to test the service scheduling algorithm, we build three request
queues in the Balancer. As mentioned in the previous chapter, we set different RC_k^min and RC_k^max for these queues: RC_1^max = RC_1^min = 0; RC_2^max = 0.9, RC_2^min = 0.3; RC_3^max = 0.6, RC_3^min = 0.2. The resulting handling probabilities P_k, computed by equation (10), are shown in Figure 3.

Figure 3. P_k of the Three Queues

Figure 3 shows that requests from queue 1 are always sent to a DM server. Requests from queue 2 can be held back when the server load is high in order to guarantee the throughput of the DM server, and the slope of the curve of queue 3 is steeper. We set up these three queues to simulate services that have different QoS requirements in practice; each of the three request classes contributes one third of the total requests. The simulation results for the three queues are shown in the following figures:

Figure 4. Response Time Results

Figure 4 indicates that the response time of requests from queue 1 is stable and shows good performance. Furthermore, the effect of admission control is not obvious when the load is low; as the load rises, however, the effect becomes more and more obvious, which means that services with high QoS requirements receive good QoS guarantees. In addition, services with low QoS requirements, such as requests from queues 2 and 3, see increased response times in order to guarantee system throughput. The simulation results also show that without admission control, response time rises sharply when the DM server is about to become overloaded. The results demonstrate that our service scheduling algorithm with admission control can significantly improve QoS, and that service scheduling for DM services is indeed necessary.
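The piecewise handling probability used in this experiment can be reproduced in a few lines (the function name is ours; the threshold pairs (0, 0), (0.3, 0.9) and (0.2, 0.6) are the per-queue values configured above):

```python
def handling_probability(rc, rc_min, rc_max):
    """Handling probability: accept with probability 1 above rc_max,
    reject below rc_min, and interpolate linearly in between."""
    if rc_max == rc_min:
        # Degenerate thresholds (queue 1): every request is accepted
        # as long as any capacity remains.
        return 1.0 if rc >= rc_max else 0.0
    if rc >= rc_max:
        return 1.0
    if rc <= rc_min:
        return 0.0
    return (rc - rc_min) / (rc_max - rc_min)

# (rc_min, rc_max) per request queue, as in the simulation setup
thresholds = {1: (0.0, 0.0), 2: (0.3, 0.9), 3: (0.2, 0.6)}
for k, (lo, hi) in thresholds.items():
    probs = [round(handling_probability(rc, lo, hi), 2) for rc in (0.1, 0.5, 0.8)]
    print(f"queue {k}: {probs}")
```

This reproduces the qualitative behavior described above: queue 1 is always admitted, while queues 2 and 3 are throttled progressively as remaining capacity shrinks, with queue 3 reaching full admission at a lower remaining capacity because of its lower upper threshold.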
5. Conclusion

This paper introduces a new resource allocation algorithm for device management servers. The algorithm combines service scheduling and load balancing: we use admission control to achieve service scheduling, and design a mathematical model suited to device management to achieve load balancing. We evaluate the algorithm and show its performance. The simulation results demonstrate that the algorithm achieves significant improvements in service scheduling, response time and throughput compared with two well-known load balancing algorithms.

6. References

[1] N. Ayachitula, S.P. Chang, Collaborative end-point service modulation system (COSMOS), Proc. 6th International Workshop on Web Information Systems Engineering, pp.660-668, 2005.
[2] OMA, Enabler Release Definition for OMA Device Management, Candidate Version 1.2 08, Open Mobile Alliance Ltd, USA, 2006.
[3] LIN Zhang, LI Xiao-ping, SU Yuan, A Content-based Dynamic Load Balancing Algorithm for Heterogeneous Web Server Cluster, Computer Science and Information Systems, vol.7, no.1, pp.153-162, 2010.
[4] OMA, OMA Device Management Tree and Description, Candidate Version 1.2 15, Open Mobile Alliance Ltd, USA, 2005.
[5] K. Christodoulopoulos, V. Sourlas, A comparison of centralized and distributed meta-scheduling architectures for computation and communication tasks in Grid networks, Computer Communications, vol.32, no.7-10, pp.1172-1184, 2009.
[6] ZHANG Qi, A. Riska, W. Sun, E. Smirni, G. Ciardo, Workload-aware load balancing for clustered web servers, IEEE Transactions on Parallel and Distributed Systems, vol.16, no.3, pp.219-233, 2005.
[7] E. Casalicchio, S. Tucci, Static and dynamic scheduling algorithms for scalable web server farm, Proc. 9th Euromicro Workshop on Parallel and Distributed Processing, pp.369-376, 2001.
[8] H.M. Chaskar, U. Madhow, Fair scheduling with tunable latency: A round-robin approach, IEEE Global Communications Conference (GLOBECOM), vol.2, pp.1328-1333, 1999.
[9] FAN De-ming, An adaptive and dynamic load balancing algorithm for structured P2P systems, Advances in Information Sciences and Service Sciences, vol.3, no.11, pp.350-356, 2011.
[10] A. Czumaj, C. Riley, C. Scheideler, Perfectly balanced allocation, Proc. 6th Int'l Workshop on Approximation Algorithms for Combinatorial Optimization Problems, vol.2764, pp.171-181, 2003.
[11] T.F. Abdelzaher, K.G. Shin, N. Bhatti, Performance guarantees for web server end-systems: A control-theoretical approach, IEEE Transactions on Parallel and Distributed Systems, vol.13, no.1, pp.80-96, 2002.
[12] LI Xin, ZHANG Hu-yin, WU Di, WANG Jing, Global load balancing based on heuristic ant colony algorithm for structured P2P systems, Advances in Information Sciences and Service Sciences, vol.3, no.8, pp.162-169, 2011.
[13] MI Wei, ZHANG Chun-hong, QIU Xiao-feng, A security-aware load balancing algorithm for structured P2P systems based on ant colony optimization, Advances in Information Sciences and Service Sciences, vol.3, no.9, pp.183-190, 2011.
[14] V. Cardellini, E. Casalicchio, M. Colajanni, P.S. Yu, The state of the art in locally distributed web-server systems, ACM Computing Surveys (CSUR), vol.34, no.2, pp.263-311, 2002.
[15] D.A. Menasce, M.N. Bennani, Analytic performance models for single class and multiple class multithreaded software servers, Proc. Computer Measurement Group Conference, pp.155-161, 2006.
[16] M. Harchol-Balter, Task assignment with unknown duration, Journal of the ACM (JACM), vol.49, no.2, 2002.