Autonomic management of scalable load-balancing for ubiquitous networks

Toshio TONOUCHI and Yasuyuki BEPPU
Internet Systems Laboratories, NEC Corporation
{tonouchi@cw, y-beppu@ak}.jp.nec.com

Abstract. In ubiquitous networks, many sensors and RFID readers will be connected to the network alongside PCs and mobile phones, and these devices are expected to issue huge numbers of transactions. Because the load caused by these transactions increases gradually, it is difficult for a system manager to estimate the required performance. A scalable load-balancing method is therefore needed. This report proposes a scalable load-balancing method in which servers communicate with each other and manage themselves for load balancing. Because this method does not need a center server, it avoids the center-server bottleneck. Such a system, however, is prone to diverge: in certain configurations the system keeps balancing load repeatedly and never stops its load-balancing operations. In this paper, we clarify a sufficient condition for convergence and use it to prevent the system from diverging.

Keywords: load balancing, autonomic algorithm, ubiquitous network

1. Introduction

A lot of traffic is expected to be issued by a huge number of PCs and cellular phones as well as by RFID readers and sensors in ubiquitous networks. For example, all merchandise and products will carry RFID tags, and they may be traced through whole supply chains by RFID readers installed in warehouses, trucks, smart shelves, and shopping carts. Another example is sensor systems for secure societies. Sensors and surveillance cameras, like CCTV in the UK, are installed everywhere in cities. Sensor logs and the videos taken by the cameras travel through ubiquitous networks.

We assume that in the near future from a million to ten million sensors and RFID readers will be deployed everywhere. These sensors and readers are assumed to sense targets once a second and, as a result, 1 [request/sec] × 1 million [sensors] = 1 million [request/sec] are issued. The traffic is 1 GB/sec if the average packet size is about 1 kB. Fig. 1 shows the performance of local load balancers [1].
This graph shows that they cannot handle 1 GB/sec of packets whose size is 1 kB. New load balancing methods are, therefore, required for the ubiquitous era.
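The traffic estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch; the constant and function names are illustrative, and "about 1 kB" is taken as exactly 1000 bytes, which is an assumption.

```python
# Back-of-the-envelope traffic estimate from the introduction.
SENSORS = 1_000_000          # lower bound of the assumed number of sensors and readers
REQUESTS_PER_SENSOR = 1      # each device senses once per second
PACKET_BYTES = 1_000         # "about 1 kB", assumed here to be exactly 1000 bytes


def aggregate_traffic(sensors, requests_per_sensor, packet_bytes):
    """Return (requests per second, bytes per second) for the whole network."""
    requests_per_sec = sensors * requests_per_sensor
    return requests_per_sec, requests_per_sec * packet_bytes


requests_per_sec, bytes_per_sec = aggregate_traffic(
    SENSORS, REQUESTS_PER_SENSOR, PACKET_BYTES
)
print(requests_per_sec, "requests/sec")
print(bytes_per_sec / 1e9, "GB/sec")
```

With ten million devices instead of one million, the same function gives ten times the traffic, which is the upper end of the range assumed in the text.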
[Figure 1: L4 connections per second plotted against requested file size]
Fig. 1. Performance of local load balancers [1]

Our target system is a web system composed of clients, AP servers, and a database system. We make the following assumptions about the target system.

AP servers should be stateless. In other words, the states of applications running on the AP servers are stored in the database system. For example, AP servers retrieve the application state from the database system with a key that is encoded in cookies held by the clients.

The database system supports a transaction mechanism. AP servers can use the mechanism to store and retrieve data atomically. In other words, a store or retrieval operation of an AP server caused by a request from a client either succeeds or fails. When it fails, the operation is canceled and has no effect on the database. In this paper, we only discuss the robustness of the AP servers; the robustness of the database servers is left as future work.

The protocol between servers and clients (clients may be edge servers connected to RFID readers or sensors) supports a redirection mechanism. The mechanism enables a server to tell a client to access another server. An example of a protocol supporting redirection is HTTP [2]. In HTTP, when a server replies to a client with a 302 response code, the client redirects its request to the server suggested in the 302 response.

The AP servers, the database system, and the clients are distributed over the whole network, and they are connected.

In Section 2, previous load balancing methods are reviewed. We show our algorithm in Section 3. Section 3 also includes preliminary experiments, which reveal a problem of the proposed method. We clarify the mechanism causing the problem and show how to prevent it. Section 4 shows by experimental evaluations that the prevention succeeds. The contributions of this paper are as follows:
We propose a distributed load balancing mechanism without a center server. We can, therefore, avoid the bottleneck of the center server.

Mechanisms without a center server, like the proposed method, are inherently prone to divergence. For example, servers keep trying to distribute their load to other servers, but the distribution never stops and, in addition, some server may receive more load than ever. We show the divergence in experiments, and we clarify the divergence mechanism with a mathematical model.

The model clarifies a sufficient condition under which the method converges. The system is guaranteed to converge if it is designed to obey the sufficient condition.

2. Related work

A lot of valuable work has been done on load balancing techniques. This work falls into two types: local load balancing techniques and global load balancing techniques.

Many vendors provide many kinds of local load balancers. A local load balancer has a virtual IP address which is open to clients. The load balancer dispatches incoming requests from the virtual IP address to several back-end servers. Local load balancers are easy to introduce because an existing server can be replaced with a load balancer and its back-end servers. In addition, a system with a local load balancer can handle back-end server failures because a load balancer can easily monitor its back-end servers. However, a load balancer itself can become a bottleneck under huge traffic, because a local load balancer has only one virtual IP address and must handle all the traffic. As a result, the local load balancing technique is limited in scalability.

A typical global load balancing technique is the DNS round-robin method, in which a special DNS server is placed in the network (Fig. 2). In reply to a DNS request from a client, the DNS server answers with the IP address of a server that is not busy or of the server nearest to the client. The client can therefore access a non-busy server, which is expected to reply quickly, or the nearest server, which can communicate with the client with a small network delay. However, caches of DNS answers result in inaccurate load balancing [3]. DNS servers form tree structures whose roots are the thirteen root DNS servers in the world. A local DNS server may hold caches of the answers from its parent DNS server.
The local DNS server may not notice a change in the load balancing situation because an active cache is used instead of the information held by its parent DNS server. To avoid the cache problem, a single DNS server could gather all the information, and all the clients could access that DNS server. With this method, however, the DNS server becomes a bottleneck. It is said that a DNS server can handle on the order of thousands of requests per second. Because each server access by a client is accompanied by a DNS request, the DNS server would be expected to handle 1 million to 10 million [request/sec], which may be impossible.

The f5 3DNS system [4], which is now called Global Traffic Manager, tries to solve the cache problem. 3DNS servers are deployed as local DNS servers, and they continually refresh load-balancing information with a proprietary protocol. Fig. 3 shows the mechanism of the 3DNS system. Each local network is equipped with a 3DNS server. When a user belonging to a local network accesses a server with a URL, e.g. http://newyork.myfirm.com/, his/her client host asks the 3DNS server in the local network for the IP address corresponding to the URL (Step 1 in Fig. 3). The local DNS server receives the query, and it in turn asks a 3DNS server (Step 2). The 3DNS server asks the backend servers distributed among the network systems for their load and for the network distances between the 3DNS server and the backend servers (Step 3). It is assumed that each local network is equipped with f5 load balancers. The load information and the network distances are acquired with the proprietary protocol between a 3DNS server and the f5 load balancers. The 3DNS server answers the client with the IP address of an adequate server, considering the load information and the network distances (Step 4).

Because only the clients in each local network access a given 3DNS server, the performance of a single 3DNS server need not be excellent, and the 3DNS system as a whole can be scalable. In addition, the proprietary protocol can reduce the cache problem. However, the 3DNS system assumes that each local network is equipped with a 3DNS server and f5 load balancers. The system is therefore difficult to deploy.

[Figure 2: root DNS and local DNS servers with FQDN-to-IP-address caches, a DNS load balancer, and a GW, connected through the Internet; labeled steps include health check, 1. DNS reply, 2. DNS request, 3. access]
Fig. 2. DNS round-robin technique
[Figure 3: 1. Client queries local DNS; 2. Local DNS queries DNS; 3. 3DNS asks each load balancer to choose a best server; 4. Local DNS responds with the best server. Each site on the Internet has a load balancer and a 3DNS server.]
Fig. 3. Mechanism of 3DNS [4]

There are some activities in which the distributed hash table (DHT) technique is used as a global load balancing mechanism. In short, jobs or contents are assigned to servers in global networks by the DHT. A hash function is applied to each job or content, and the job or content is assigned to the server whose ID corresponds to the resulting hash value. This may result in load balancing because the assignment by the hash function appears to be random. Byers points out that this mechanism cannot achieve fair load balancing [5]. Fig. 4 shows the problem when Chord [6] is used. Chord is not guaranteed to place servers on the ring at equal distances. A server whose ring distance from the previous server is long may be assigned more jobs.

[Figure 4: Chord hash ring. Where the arc is short, the number of accesses is small; where the arc is long, the number of accesses is large.]
Fig. 4. Unfair load balancing with Chord
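The unevenness of arcs on a hash ring can be illustrated with a short sketch. It is not Chord itself: it assumes a 32-bit ring instead of Chord's full SHA-1 identifier space, hypothetical server names, and the successor rule in which a server owns the arc from its predecessor (exclusive) to itself (inclusive).

```python
import hashlib

RING_BITS = 32                 # illustrative ring size (assumption), smaller than Chord's
RING_SIZE = 2 ** RING_BITS


def ring_id(name):
    """Map a name onto the ring using SHA-1 truncated to RING_BITS bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[: RING_BITS // 8], "big")


def arc_lengths(server_names):
    """Return {server: length of the arc of keys that the server is responsible for}."""
    placed = sorted((ring_id(n), n) for n in server_names)
    if len(placed) == 1:
        return {placed[0][1]: RING_SIZE}      # a single server owns the whole ring
    arcs = {}
    for k, (pos, name) in enumerate(placed):
        prev_pos = placed[k - 1][0]           # k == 0 wraps around to the last server
        arcs[name] = (pos - prev_pos) % RING_SIZE
    return arcs


servers = ["server-%d" % i for i in range(8)]  # hypothetical server names
arcs = arc_lengths(servers)
fair_share = RING_SIZE / len(servers)
for name, arc in sorted(arcs.items(), key=lambda kv: kv[1]):
    print(name, round(arc / fair_share, 2), "x fair share")
```

If keys are hashed uniformly onto the ring, the expected number of jobs a server receives is proportional to its arc length, so the ratios printed here indicate how far each server is from an equal share.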
3. Approach: Load exchange method

We provide a naïve scalable load balancing method called the load exchange method. It is scalable because no center management server exists, so we can avoid the so-called center-server bottleneck. Approaches without a center usually risk divergence: the algorithm does not stop and results in a worse situation. Our preliminary experiments show that, under suitable parameter settings, our algorithm converges: it stops and results in an almost equally load-balanced situation.

3.1 Algorithm

Fig. 5 shows the intuitive image of the load exchange method. We assume that each server knows some other servers (we call them neighbors), and this defines a server graph $G = (V, E)$, where $V$ is the set of servers and $E \subseteq V \times V$ is the set of pairs of a server and its neighbor. We call $E$ the set of edges. We assume that the number of edges connected to a server is less than a constant (e.g. 5) even if the number of servers becomes large. In the algorithm described below, each server communicates with its neighbors, and a large number of neighbors would become an overhead.

[Figure 5: servers S and S' exchange load information, and S redirects the load l (S.load - S'.load) from its clients (edge servers with RFID readers) to S'. Result of the exchange: S.load -> (1-l) S.load + l S'.load and S'.load -> l S.load + (1-l) S'.load.]
Fig. 5. Overview of the load exchange method

S.load means the load of $S \in V$. A load may be the number of requests per second, the CPU load, the response time, and so on. S and its neighbor S' tell each other their load information periodically. The number of neighbor servers is less than a given small constant. This means that the overhead and delay caused by this communication are small, because each server communicates with only a small number of neighbors.
Server S gives S' some jobs when S.load / S'.load > D, where D is a given load exchange threshold ($D \ge 1$). In this case, S gives its jobs to S' so that the load of S becomes

S.load - l (S.load - S'.load) = (1-l) S.load + l S'.load

and the load of S' becomes

S'.load + l (S.load - S'.load) = (1-l) S'.load + l S.load,

where l is a given load exchange factor ($0 < l < 1$). The load exchange is realized with the redirection mechanism. For example, in the HTTP case, S issues a 302 response with the URL of S'. The client re-issues the request to S' when it receives the response. S keeps answering with 302 responses until its load becomes (1-l) S.load + l S'.load.

3.2 Problem of the load exchange method

Fig. 6 is a preliminary evaluation result of the load exchange method. We implemented the method on four TOMCAT [7] AP servers. Fig. 6 plots the load balancing among these four servers. The x-axis shows time, and the y-axis shows the loads of the four servers. The left graph is evaluated under l = 0.6, and the right one under l = 0.8. On the one hand, the left graph shows that the loads of all the servers converge and become almost equal. On the other hand, the right graph shows divergence; the loads do not settle into a stable state. We call this the divergence problem.

[Figure 6: two panels plotting the load (call/sec) of servers spi4, spi5, spi6, and spi7 against time. Left panel (convergence), conditions: complete graph of four nodes, D = 2, l = 0.6. Right panel (divergence), conditions: complete graph of four nodes, D = 1.2, l = 0.8.]
Fig. 6. Preliminary evaluation results of the load exchange method

Fig. 7 shows the reason for the divergence.
When a server with a light load (at the center of the figure) is connected to servers with a heavy load (Step 1), the server with the light load receives load from its connected servers. As a result, it must handle a heavy load (Step 2). It then gives its load to another connected server (e.g. the server on the right side) (Step 3). The right-side server will, in turn, try to give its load to another server the next time (Step 4).
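The overshoot described in these steps can be reproduced with a small sketch of the pairwise exchange rule. It assumes that all servers act in the same round on the load values observed at the start of that round, which is one possible reading of the mechanism rather than the authors' implementation; the function name, server names, and load values are hypothetical.

```python
def exchange_round(loads, edges, threshold, factor):
    """One synchronous round of the pairwise load exchange rule.

    loads:     dict mapping server -> load (e.g. calls/sec)
    edges:     iterable of undirected neighbor pairs (s, t)
    threshold: load exchange threshold D (D >= 1)
    factor:    load exchange factor l (0 < l < 1)

    Every transfer is computed from the loads at the start of the round, so a
    server does not see what its other neighbors send to it in the same round.
    """
    new_loads = dict(loads)
    for s, t in edges:
        for giver, taker in ((s, t), (t, s)):
            # Same test as giver.load / taker.load > D, written without a
            # division so that a taker with zero load is handled.
            if loads[giver] > threshold * loads[taker]:
                moved = factor * (loads[giver] - loads[taker])
                new_loads[giver] -= moved
                new_loads[taker] += moved
    return new_loads


# Hypothetical scenario: a lightly loaded server "c" with three heavily loaded neighbors.
loads = {"c": 10.0, "n1": 100.0, "n2": 100.0, "n3": 100.0}
edges = [("c", "n1"), ("c", "n2"), ("c", "n3")]

after = exchange_round(loads, edges, threshold=1.2, factor=0.6)
print(after)
```

In this scenario each neighbor hands over 0.6 × 90 = 54 units, so the formerly light server ends the round with about 172 units while each neighbor drops to about 46. The light server has become the heaviest one and will push load back in the next round, which corresponds to Steps 2 to 4.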
[Figure 7: four panels, Step 1 to Step 4]
Fig. 7. Mechanism for the divergence problem

3.3 Mathematical models and sufficient condition of convergence

It is important to guarantee that the load exchange algorithm converges. We clarify the behavior of the load exchange algorithm with a mathematical model and find a sufficient condition. For simplicity, the model ignores the load balancing threshold $D$; in other words, we consider $D$ to be 1. The algorithm converges under $D > 1$ if it converges under $D = 1$. Therefore, the convergence condition under $D = 1$ is a sufficient condition for convergence under $D \ge 1$.

We have $n$ servers $0, \ldots, n-1$. Let $x_{i,t}$ be the load of server $i$ at time $t$, and let $x_t \stackrel{\mathrm{def}}{=} (x_{0,t}, \ldots, x_{n-1,t})^{T}$, where $T$ means the transpose. The load exchange is modeled by

\[ x_{i,t+1} = x_{i,t} - l \sum_{j \in N(i)} (x_{i,t} - x_{j,t}) \tag{1} \]

where $N(i)$ is the set of servers connected to server $i$; in short, $N(i) \stackrel{\mathrm{def}}{=} \{\, j \in \mathbb{N} \mid (i, j) \in E \,\}$ where $G = (V, E)$. This is because $i$ gives $j$ the load $l (x_{i,t} - x_{j,t})$ when $x_{i,t} > x_{j,t}$, and receives the corresponding amount from $j$ when $x_{j,t} > x_{i,t}$.

We define the adjacency matrix

\[ A \stackrel{\mathrm{def}}{=} \begin{pmatrix} a_{0,0} & \cdots & a_{0,n-1} \\ \vdots & \ddots & \vdots \\ a_{n-1,0} & \cdots & a_{n-1,n-1} \end{pmatrix}, \qquad a_{i,j} \stackrel{\mathrm{def}}{=} \begin{cases} |N(i)| & \text{if } i = j \\ -1 & \text{if } j \in N(i) \\ 0 & \text{if } j \notin N(i) \text{ and } i \ne j. \end{cases} \]

Notice that

\[ \sum_{j=0}^{n-1} a_{i,j} = 0 \tag{2} \]

because the number of neighbors equals the number of edges connected to the server. We can express (1) as

\[ x_{t+1} = (-lA + E)\, x_t \tag{3} \]

where $E$ is the unit matrix.

It is known that $\lim_{t \to \infty} x_{i,t}$ converges if all eigenvalues of $H = (-lA + E)$ are larger than $-1$ and at most $1$. An eigenvalue $\alpha$ of $H$ satisfies $\det(H - \alpha E) = 0$. One of the eigenvalues of $H$ is 1, because for $\alpha = 1$

\[ \det(H - \alpha E) = \det(H - E) = \det(-lA) = (-l)^{n} \det(A), \]

and $\det(A) = 0$ follows easily from (2). In addition, the eigenvector corresponding to the eigenvalue $\alpha = 1$ is $v = (1, \ldots, 1)^{T}$ because $H v = v$. This eigenvector means that all loads are eventually equalized.

In short, the sufficient condition for convergence is that all the eigenvalues $\alpha$ of $H$ satisfy $-1 < \alpha \le 1$. All loads converge to equal loads if the sufficient condition is satisfied.

4. Experimental evaluations

In order to check the correctness of the mathematical model, we compare the simulation results based on the model with the experimental evaluations. We give burst requests to a server, and Fig. 8 shows the loads of the servers after the burst requests. The upper three graphs are the results of the simulations, and the lower three graphs are the experimental results. Each pair of graphs in the same column is evaluated under the same condition. The left pair is evaluated under l = 0.4, the center pair under l = 0.6, and the right pair under l = 0.8. In all evaluations, each of the four servers is connected to all the other servers. In short, the servers in each evaluation form a complete graph.
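The behavior of the model on a complete graph of four servers can be explored with a short simulation of the update rule (1). This is a minimal sketch, not the authors' simulator: the burst vector, the number of steps, and all function names are assumptions, and the numbers it prints follow from rule (1) alone and are not meant to reproduce the values reported in Fig. 8.

```python
def complete_graph(n):
    """Neighbor lists N(i) of a complete graph on servers 0..n-1."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def step(x, neighbors, factor):
    """One update of rule (1): x_i <- x_i - l * sum over j in N(i) of (x_i - x_j)."""
    return [
        x[i] - factor * sum(x[i] - x[j] for j in neighbors[i])
        for i in range(len(x))
    ]


def spread_after(x0, neighbors, factor, steps):
    """Return max(x) - min(x) after the given number of updates."""
    x = list(x0)
    for _ in range(steps):
        x = step(x, neighbors, factor)
    return max(x) - min(x)


neighbors = complete_graph(4)
burst = [1.0, 0.0, 0.0, 0.0]      # hypothetical burst of load on server 0
for factor in (0.4, 0.6, 0.8):    # the three exchange factors used in the evaluation
    print(factor, spread_after(burst, neighbors, factor, steps=20))
```

For a complete graph of $n$ servers, rule (1) multiplies each server's deviation from the mean load by $1 - n l$ per step, so in this sketch the spread shrinks when $|1 - 4l| < 1$ and grows otherwise. Qualitatively, the sketch converges for l = 0.4 and diverges for l = 0.6 and l = 0.8.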
[Figure 8: six panels for complete graphs with four nodes (n = 4); x-axis: time, y-axis: load (call/sec). Upper row, simulation (nodes 0 to 3): l = 0.4, max eigenvalue = -0.84, convergence; l = 0.6, max eigenvalue = -1.8, divergence; l = 0.8, max eigenvalue = -2.7, divergence. Lower row, evaluation (servers spi4, spi5, spi6, spi7): l = 0.4, convergence; l = 0.6, convergence; l = 0.8, divergence.]
Fig. 8. Simulations based on the mathematical model and experimental evaluations
Both graphs in the left pair converge because the eigenvalue with the largest absolute value, excluding 1, is -0.84 in the left pair. Both graphs in the right pair diverge because the eigenvalue with the largest absolute value is -2.7 in the right pair. These two results show that the model agrees with the experiments. Notice that the lower graph in the center pair converges while the upper one diverges. We think this is because the convergence condition given in Section 3.3 is only a sufficient condition. Therefore, there are some cases where the system converges while the corresponding model diverges.

5. Conclusion

We propose a naïve global load balancing method in this paper. The method is prone to diverge, but we model the method and clarify a sufficient condition under which the load balancing converges. The correctness of the model is confirmed by the experimental results. This algorithm assumes that the graph G is given. We are now developing autonomic graph-making protocols. We aim at maintenance-free and fault-tolerant graph construction.

Acknowledgement

This research was sponsored by the Ministry of Internal Affairs and Communications in Japan.

References

1 F5, Broadband-Testing: Application Traffic Management, Jan., 2005
2 T. Berners-Lee, et al., Hypertext Transfer Protocol -- HTTP/1.0, RFC 1945, May 1996
3 Tony Bourke, Load Balancing, O'Reilly, 2001, ISBN 0-596-00050-2
4 f5, BIG-IP Global Traffic Manager, http://www.f5.com/products/bigip/gtm/
5 Byers, J., et al., Simple Load Balancing for Distributed Hash Tables, in Proceedings of 2nd International Workshop on Peer-to-Peer Systems (IPTPS '03), pp. 80-87
6 Stoica, I., et al., Chord: A scalable peer-to-peer lookup service for internet applications, in ACM SIGCOMM 2001, pp. 149-160
7 The Apache Software Foundation, Apache Tomcat, http://tomcat.apache.org/