Comparison between two approaches to overload control in a Real Server: local or hybrid solutions?

S. Montagna and M. Pignolo
Research and Development, Italtel S.p.A., Settimo Milanese, ITALY

Abstract
This work analyzes the performance of two candidate algorithms for overload control within a server. One algorithm works according to a local strategy, based on measuring the processor occupancy. The other algorithm takes advantage of measuring both the processor occupancy and the call losses. The latter belongs to the hybrid family, because it exploits a further element (the Load Balancer) to reduce the load on the CPU of the server under protection. This work takes the throughput, i.e. the number of calls managed within one second, as the unit of comparison.

Index Terms
Load Balancer, Load control, SIP, H.248, overload protection, queuing.

I. INTRODUCTION
Telecommunication network switches are engineered to carry a certain number of active calls while guaranteeing a given quality of service (QoS). However, the offered traffic may, for a short period of time, exceed the engineered capacity. Overload controls are necessary to maintain the throughput and the QoS at acceptable levels. Overload mechanisms are classified in two different categories: local or remote [1]. In the latter, a switch may require collaboration from neighbour elements to recover quickly from its overloaded state. This help isn't required when the control strategy belongs to the local family, where each switch adopts a protection of its own. Local and remote choices are the boundary cases, between which it is possible to exploit some hybrid solutions. For example, the behaviour of the overload mechanism can be local for one type of traffic and non-local (remote) for other types. In this work we study two different overload control mechanisms: one, labelled Overload Control Load, that belongs to the local family, and another one, labelled Overload Control Load Loss, that belongs to the hybrid family.

In the literature some algorithms for overload control have been proposed and evaluated.
Some of these use a single measure, typically the processor occupancy [1]; others use two different measures [2]: throughput and processor capacity. Different strategies use time priority [3] or buffer thresholds [4,5] that allow different values of QoS to be defined. With the advent of the Session Initiation Protocol (SIP) [6], these overload control mechanisms can react in different ways. In the simplest case, a SIP request (an INVITE or REGISTER) rejected by the mechanism can trigger retransmissions, as described in [6]. It is worth highlighting that overload conditions may have different origins, such as overload caused by user behaviour (i.e. user-to-server overload) or an overload condition between servers. This means that in the first case the behaviour of the user is very important, while it isn't in the second scenario. As shown in [7] for server-to-server overload, the use of a threshold mechanism with retransmission gives rise to a sharp drop of throughput when the offered load approaches the server capacity. Also in [7], three window-based feedback algorithms for overload control are proposed and evaluated. That analysis considers only SIP INVITE calls. The transition to SIP signalling is under way, but generic servers need to manage different types of signalling protocols at the same time.

In this paper, we investigate the performance of the two different overload control methods, both with homogeneous signalling traffic (only SIP or only H.248) and with heterogeneous traffic (SIP and H.248). The characteristics of these protocols are different, and this affects both the processing cost of a successful call and the cost of refusing the same call. From real measurements we obtained that the cost of refusing a basic call that uses the SIP protocol (INVITE, 100 Trying, 180 Ringing, 200 OK, BYE and 200 OK) lies in the range 20% - 30% of the cost of managing the complete call. One complete call with the H.248 protocol takes around 40 ms and the treatment of a refusal is 10% of that cost.

978-1-4244-5794-6/10/$26.00 ©2010 IEEE

This
means a waste of CPU occupancy, thus increasing the refused traffic and reducing the throughput. The performance of the system will be investigated by using the throughput parameter, that is, the total number of calls managed by the system within a time unit.

II. REFERENCE MODEL
Two types of traffic sources are assumed: H.248 and SIP; SIP is modelled with the INVITE and REGISTER methods only. For both sources the inter-arrival time distribution is assumed to be negative exponential; H.248 calls have average rate λ_H, while the INVITE and REGISTER methods have average rates λ_I and λ_R respectively. The following figure represents a complete system with a load balancer with Round Robin service discipline and a real server pool (RS_i) to process the incoming calls. The subsystem studied in this paper for the evaluation of the performance of the Overload Control Load and Overload Control Load Loss algorithms is represented in the dotted box.

[Figure 1. Scheme for the evaluation of the two algorithms. Elements shown: SIP source, H.248 sources, Load Balancer with Round Robin (RR) discipline, Real Servers (RS) RS_1 ... RS_i ... RS_n, sink, feedback path.]

The traffic is modelled at call level, since the main goal of this work is to evaluate the system throughput; modelling at message level is worthwhile only when the goal is to evaluate the message-processing delay. For instance, in the SIP protocol, the basic call setup procedure, without advanced call features, consists of six basic messages following the INVITE {100 Trying, 180 Ringing, 200 OK, ACK, BYE and 200 OK}. Due to the call-level modelling, the cost of processing an INVITE is assumed equal to the cost of processing the INVITE message itself plus the other six messages. Both proposed algorithms can exploit the call retransmission mechanism described in [7].

A. Load based Overload Control
The Overload Control Load algorithm controls the load of the local server on which it is running; the status of the server is represented by the processor occupancy ρ_meas, which is expressed as the percentage of time, within a given probe interval ζ, in which the processor is busy processing calls.
The algorithm uses the parameter ρ_meas, measured in the previous probe interval ζ, to estimate the number of calls that can be serviced. During the current probe interval, the calls exceeding the estimate K⁺ made in the previous probe interval are rejected. The algorithm uses two different thresholds, ρ_min and ρ_max, to check the rate of the output traffic. In both simulations and measurements the following parameter values have been assumed: {ζ = 1 sec.; ρ_min = 0.7; ρ_max = 0.85}. The processing cost of a call, C_est, is estimated as the ratio between the measured ρ_meas and the number of calls serviced during a probe interval ζ. The estimate K⁺ of the number of calls that can be accepted in the following probe period ζ is calculated as:

\[ K^{+} = \frac{\rho^{+}}{C_{est}} \qquad \{1\} \]

The parameter ρ⁺ takes the value ρ_min when the measured processor occupancy ρ_meas is greater than the threshold ρ_max; otherwise ρ⁺ takes the value ρ_max. In the case where a single threshold φ_target is used (ρ_min = ρ_max = φ_target), the system throughput T can be evaluated by means of the following analytical expression:

\[ T = \frac{\varphi_{target} - \lambda p}{1 - p} \qquad \{2\} \]

where p is the cost to reject a call, while λ is the mean rate of the input traffic. The Overload Control Load algorithm is part of the family of the so-called local overload control algorithms [2].

B. Load & Loss based Overload Control
The Overload Control Load Loss method controls the load of the local real server on which it is running by measuring both the local real server load and the call loss. The system load is measured as in the Overload Control Load algorithm (see formula {1}). The algorithm running on real server RS_i receives from the load balancer (LB) the number of SIP calls that have been discarded by the load balancer itself. The loss probability is then calculated for H.248 and for SIP traffic. For each traffic type we then evaluate the number of calls that the system could handle in the next probe interval as K = λ_meas (1 − P_loss), obtaining K_SIP for SIP calls and K_H248 for H.248 calls. The sum of K_SIP and K_H248, named K_S, is compared to K⁺ (formula {1}).
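Returning to the load-based estimate of Section II.A, formula {1} and its threshold rule can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names `admission_budget` and `accept`, the argument names and the handling of an idle probe interval are assumptions, and ρ_meas is taken as a fraction of the probe interval so that K⁺ comes out in calls per interval.

```python
def admission_budget(rho_meas, served_calls, rho_min=0.7, rho_max=0.85):
    """Estimate K+, the number of calls to accept in the next probe interval.

    rho_meas     -- measured processor occupancy in the last interval (0..1)
    served_calls -- calls actually serviced in the last interval
    Names are illustrative; the default thresholds are the values quoted
    in the text.
    """
    if served_calls <= 0 or rho_meas <= 0:
        # Assumption: without served calls (or measured load) the per-call
        # cost cannot be estimated, so the caller keeps its previous budget.
        raise ValueError("C_est is undefined for an idle probe interval")
    c_est = rho_meas / served_calls      # estimated cost of one call
    # rho+ drops to rho_min once the measured occupancy exceeds rho_max.
    rho_plus = rho_min if rho_meas > rho_max else rho_max
    return rho_plus / c_est              # formula {1}


def accept(calls_already_accepted, k_plus):
    """Accept a new call only while the interval budget K+ is not exceeded."""
    return calls_already_accepted + 1 <= k_plus
```

For example, a measured occupancy of 0.9 with 60 served calls gives C_est = 0.015 and, since 0.9 exceeds ρ_max, a budget of 0.7 / 0.015 ≈ 46.7 calls for the next interval.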
If the value of K_S is greater than K⁺, the overall load has to be reduced; otherwise it can be increased. In the first case the reduction is performed according to the following strategy:
a) We evaluate K = K_S − K⁺.
b) We calculate the SIP calls:
K_SIP = [...] and the H.248 calls: K_H248 = [...].
c) In the case that the total probability K_S is equal to zero, the accepted SIP calls are evaluated by K_SIP = 0.5 [...], while the accepted H.248 calls are estimated by K_H248 = 0.5 [...].
d) If [...] is zero then [...] = K⁺, while if [...] is zero then [...] = K⁺.

If K is positive then:
a) We evaluate K = K⁺ − K_S.
b) We calculate the accepted SIP calls by K_SIP = [...] and the accepted H.248 calls by K_H248 = [...].
c) In the case of a zero total probability K_S, the SIP calls are estimated by K_SIP = [...] + 0.5 [...] and the H.248 calls by K_H248 = [...] + 0.5 [...].
d) If the SIP calls that arrived in the previous probe interval are fewer than K_old, then a quota p of the value K is subtracted and added to [...]. The same approach is used for H.248 calls.

The value K_SIP is forwarded to the LB. The traffic filtered by the LB reaches the RS and cannot be rejected.

III. RESULTS
We examine the performance of the two algorithms in the case of homogeneous traffic, with the retransmission mechanism on in one case and off in the other. In fig. 2 we show how the throughput characteristics change for different values of the offered load. The offered load, labelled σ, is obtained by the formula:

\[ \sigma = \frac{h_I \lambda_I + h_r \lambda_R + h_H \lambda_H}{h_I} \qquad \{3\} \]

where h_I is the cost of the INVITE call, h_r is the cost of a Register for a user and h_H is the cost of an H.248 call. Initially we assume that λ_H = λ_R = 0. The results are obtained by simulation for the Overload Control Load Loss algorithm, while for the Overload Control Load algorithm we used both simulations and analytical solutions (see formula {2}) with φ = 0.78. The Overload Control Load algorithm has also been implemented, which allowed us to take real measurements and compare them with the simulation and analytical results. We assume that on the RS the cost h_I is equal to 15 ms while the cost to reject a call is equal to 3 ms (p_I = 3 ms), and that on the LB the costs are 0.5 ms to accept a call and 0.3 ms to refuse it.

[Figure 2. Throughput vs Offered Load (Invite/s). Curves: simulation, measures, analytical.]
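A worked check of the analytical curve may help here. The sketch below assumes that formula {2} follows from an occupancy balance at the target threshold, φ = T·h_I + (λ − T)·p_I, which gives T = (φ − λ·p_I) / (h_I − p_I) once the offered rate exceeds φ / h_I. This balance, the function name `analytic_throughput` and the clamping at zero are assumptions made for illustration, not statements from the paper; the numeric defaults are the values quoted in the text (15 ms, 3 ms, φ = 0.78).

```python
def analytic_throughput(offered_rate, phi=0.78, cost=0.015, reject_cost=0.003):
    """Served calls per second under a single occupancy threshold phi.

    offered_rate -- offered calls per second (lambda)
    cost         -- CPU seconds needed to serve one call
    reject_cost  -- CPU seconds needed to reject one call
    Assumes every offered call is either served or rejected by the server.
    """
    if offered_rate * cost <= phi:
        return offered_rate              # below the threshold nothing is rejected
    served = (phi - offered_rate * reject_cost) / (cost - reject_cost)
    return max(served, 0.0)              # rejections alone can saturate the CPU


if __name__ == "__main__":
    for rate in (30, 60, 100, 200, 300):
        print(rate, round(analytic_throughput(rate), 1))
```

Under these assumptions the served rate is about 50 calls/s at an offered rate of 60 calls/s and about 40 calls/s at 100 calls/s, so the throughput keeps falling as rejections consume more CPU. Setting `phi` to ρ_max = 0.85 places the knee at 0.85 / 0.015 ≈ 56.7 calls/s, close to the 56 calls/s overload boundary mentioned in the results.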
The results show that the Overload Control Load Loss algorithm works better than the Overload Control Load algorithm when the system is in the overload region (offered load bigger than 56 calls/s). In this scenario, we highlight that the retransmission mechanism is active only for the [...] algorithm. Another interesting point concerns accuracy: the simulation and analytical results are very similar to the real measurements, even though we model at the call level and not at the message level as in the real behaviour.

The next figure shows how the Register throughput changes in time. The analysis of this behaviour is very important because, after an RS goes down, the restoration of users has to be accomplished in a short time. We assume that the cost to process a Register on the RS is equal to 5 ms, while the cost to reject a call is equal to 0.5 ms. The costs on the LB are similar to the previous case.
[Figure 3. Throughput (Reg./sec) vs time (sec), with λ_r = 6 reg./s; the transient phase is marked.]

The throughput is the number of Register requests processed by the system every ζ (ζ = 1 s). The results show that, with the [...] algorithm, the Register rate is equal to 17 Reg./s (17 Reg. every 1 s), while it is equal to 46 Reg./s (46 Reg. every 1 s) with the [...] method. The input rate, λ_R, is 6 Reg./s. It is important to observe that the throughput of the Overload Control Load Loss strategy doesn't depend on the input rate (obviously when the system is in the overload region), while the throughput of the Overload Control Load strategy depends on the input rate, according to formula {2}. Both algorithms show a transient phase; the length of this period depends on the parameters {ρ_min, ρ_max, λ_r, h_r} for the Overload Control Load Loss solution, and on the parameters {ρ_min, ρ_max, λ_r, h_r, p_r} for the Overload Control Load solution, where h_r is the cost to process a Register request while p_r is the cost for the RS to reject a Register request.

In the next figure we investigate the effect on the total throughput (Invite plus H.248) in two different cases:
a) 90% of the offered load is generated by H.248 traffic and 10% by Invite traffic;
b) 10% of the offered load is due to H.248 traffic and 90% is SIP traffic.
The cost to process an H.248 call is equal to 40 ms, while the cost to reject an H.248 call is equal to 4 ms.

[Figure 4. Throughput (cps) vs. Offered Load, with σ given by the SIP component plus σ_H. Curves: SIP component 0.9 σ with σ_H = 0.1 σ; SIP component 0.1 σ with σ_H = 0.9 σ.]

The results show that the gain in throughput for the Overload Control Load Loss mechanism is bigger when the offered load increases and the percentage of the load due to SIP traffic is higher.

In the last figure we investigate how the total throughput (Invite plus H.248) changes when the load offered to the system changes over time. We offered to the system a load equal to 0.8 for a period of 6 s; then the offered load is kept at 2.0 for 3 s before returning to the initial value σ = 0.8.
[Figure 5. Throughput vs time (sec). Offered load σ = 0.8 (components 0.4 and 0.4) before Start Disturbance and after End Disturbance; σ = 2.0 (components 1.6 and 0.4) during the disturbance.]

The results of Fig. 5 highlight the effectiveness of the Overload Control Load Loss algorithm compared to the performance of the Overload Control Load mechanism. The difference in throughput is evident when the offered load is over the limit value. At the end of the disturbance period the [...] mechanism maintains a higher throughput because it tends to handle the calls that were in the retransmission phase.
IV. CONCLUSIONS
In this work we proposed and evaluated the performance of two overload control mechanisms that can be implemented in a Real Server for signalling traffic management. The algorithm named Overload Control Load uses the processor occupancy as a control parameter, while the method named Overload Control Load Loss also exploits the measurement of the loss probability for different classes of traffic. The Overload Control Load Loss mechanism evaluates the number of calls that can be served by the RS, differentiating SIP from non-SIP traffic. Unlike in the Overload Control Load method, SIP calls can be refused only by the Load Balancer, while H.248 calls are rejected by the Real Server. The results are:
- The Overload Control Load Loss algorithm achieves better performance than the Overload Control Load method.
- The gain obtained by the Overload Control Load Loss mechanism is maximum in the case of homogeneous SIP traffic.
- The throughput, in particular for the Overload Control Load method, is limited by the cost to reject calls. When these values increase, the aggregate throughput (Invite and H.248) decreases quickly as the offered load increases.

REFERENCES
[1] L. Cyr, J. Kaufman and P. T. Lee, Load Balancing and overload control in a distributed processing telecommunications system. United States Patent No. 4,974,256, 1990.
[2] S. Kasera, J. Pinheiro, C. Loader, M. Karaul, A. Hari and T. LaPorta, Fast and robust signalling overload control. Network Protocols 2001, Ninth International Conference on, 11-14 Nov. 2001.
[3] M. Ohta, Overload Protection in a SIP Signaling Network. In: International Conference on Internet Surveillance and Protection (ICISP'06), 2006.
[4] S. Montagna and M. Pignolo, Performance evaluation of Load Control Techniques in SIP Signaling Servers. Proceedings of the Third International Conference on Systems, ICONS 08; April 15, 2008.
[5] S. Montagna and M. Pignolo, Load Control techniques in SIP signaling servers using multiple thresholds. NETWORKS 2008.
[6] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley and E. Schooler, SIP: Session Initiation Protocol. IETF, RFC 3261, June 2002.
[7] E. C. Noel and C. R.
Johnson, Initial simulation results that analyze SIP based VoIP networks under overload. Book: Managing traffic performance in converged networks; pp. 54-64. LNCS - Springer; September 2007.