Resource Management and Organization in CROWN Grid

Jinpeng Huai, Tianyu Wo, Yunhao Liu
Dept. of Computer Science and Technology, Beihang University
Dept. of Computer Science, Hong Kong University of Science & Technology

Abstract
The main goal of the key project, CROWN Grid, is to empower in-depth integration of resources and cooperation of researchers nationwide and worldwide. The CROWN project was started in late 2003, and we successfully released CROWN 2.0 in November 2005. In this paper, we introduce resource management and organization in the CROWN Grid.

Keywords: CROWN, grid computing, service club, trust, incentive

1. Introduction
Grid computing enables users to share resources, including CPU, storage, memory, bandwidth, etc., across multiple domains [7, 3, 8, 12, 11] over a wide-area network. The main goal of our key project, CROWN (China R&D Environment Over Wide-area Network), is to empower in-depth integration of resources and cooperation of researchers nationwide and worldwide. CROWN was started in late 2003. Tens of universities and institutes, such as Tsinghua University, Peking University, the Chinese Academy of Sciences, Hong Kong University of Science and Technology, and Beihang University, have joined CROWN. Through the Computer Network Information Centre of CAS, CROWN is connected to well-known international grid testbeds such as GLORIAD and PRAGMA. Many applications in different domains have been deployed on the CROWN Grid, such as gene comparison in bioinformatics and climate pattern prediction in environment monitoring. As the size of a grid system grows, resource management and organization become a challenging issue. Obviously, a centralized mechanism suffers greatly from a single point of failure. Hence, an efficient resource management mechanism for a highly heterogeneous, dynamic distributed environment is of great importance, and there has indeed been much research on resource management approaches, such as Buyya et al. [17, 19, 2].
In this paper, we focus on the issue of maximizing resource utilization and providing dynamic sharing and collaboration mechanisms in the CROWN Grid. The rest of this paper is organized as follows. We introduce the architecture of the CROWN system in Section 2. Sections 3 and 4 discuss the resource organization and management mechanisms in static and dynamic grid environments, respectively. Section 5 concludes the paper.
2. Architecture of CROWN
2.1 Overview
CROWN employs a multi-layered structure of resource organization and management, as illustrated in Fig. 1, based on the characteristics of e-science applications and the resource subordination relationship. It includes four major entities:
Node: All kinds of heterogeneous resources are encapsulated into CROWN nodes, and services are deployed on these nodes to provide a homogeneous view for upper middleware to access the resources.
Domain: Nodes are organized into different domains according to their subordination or resource catalogs. A Resource Locating and Description Service (RLDS) is deployed in each domain as the center of grid information management. Domains may contain several sub-domains, so that all RLDS instances form a tree-like topology.
Fig 1: CROWN resource organization and management
Region: For flexibility, the concept of region is introduced to coordinate different domain trees. A region switch is deployed in each region so that flexible topologies and information sharing mechanisms can be applied among regions.
Gateway: To connect with other grid systems, we deploy several gateways on the edge of the CROWN system.
2.2 CROWN Middleware
CROWN middleware is built on top of the physical resource layer, which consists of existing heterogeneous resources. A rich set of software components and tools is provided in the middleware layer to support the development and running of grid services in the environment. An application layer is built on the middleware layer of CROWN, providing grid applications for multi-discipline e-science research, as shown in Fig. 2. CROWN is a service-oriented grid application supporting environment. Resource sharing and collaboration are achieved through grid services. A grid service container, called NodeServer, is deployed on each of the grid hosts. All the services are encapsulated as certain types of resources, with some of them providing application-specific functions and others providing general services, such as grid information services (GIS). In CROWN, before a computer becomes a NodeServer (NS), it must be installed with CROWN middleware. The service container is the core component in CROWN middleware, providing a runtime environment for various services. Each NS usually belongs to a security domain. Every domain has at least one RLDS to provide information services, and the RLDS maintains dynamic information about available services.
Fig 2: Architecture of CROWN middleware
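The node/domain hierarchy described above can be sketched as a small data model. This is a hypothetical illustration only, not CROWN's actual implementation; the class names, fields, and the recursive lookup are our own assumptions about how an RLDS tree could be modeled.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A CROWN node: heterogeneous resources wrapped as services (sketch)."""
    name: str
    services: set = field(default_factory=set)

@dataclass
class Domain:
    """A domain with its RLDS; sub-domains form a tree of RLDS instances."""
    name: str
    nodes: list = field(default_factory=list)
    subdomains: list = field(default_factory=list)

    def lookup(self, service_type):
        # RLDS-style lookup: check local nodes, then recurse into sub-domains,
        # mirroring the tree-like RLDS topology described in the text.
        hits = [n.name for n in self.nodes if service_type in n.services]
        for sub in self.subdomains:
            hits.extend(sub.lookup(service_type))
        return hits

# Hypothetical example: a root domain with one sub-domain.
root = Domain("cs.example", nodes=[Node("ns1", {"GIS"})],
              subdomains=[Domain("bio", nodes=[Node("ns2", {"gene-cmp"})])])
```

A query at the root then reaches services registered anywhere in the subtree, e.g. `root.lookup("gene-cmp")` finds the node in the `bio` sub-domain.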
By querying the RLDS for information about available grid services, the Scheduler can select proper resources to execute jobs and solve problems on behalf of the grid user. CROWN Designer is an IDE for service developers. CROWN Monitor traces the system status in a global view and helps to analyze the system's runtime behavior. A full-fledged grid security solution is also provided in CROWN, which includes message-level security, authentication and authorization, credential management, identity mapping for heterogeneous security infrastructures, and trust negotiation.
2.3 Testbed and Applications
A CROWN testbed has been established on existing network infrastructure (CERNET and CSTNET). Several pilot applications are running in the CROWN environment.
AREM: The Advanced Regional Eta-coordinate numerical prediction Model meets the requirements of meteorology and atmospheric physics researchers to analyze weather, visualize the results, and refine prediction models.
MDP: Combining a speech recognition engine with grid technology, a massive Multimedia Data Processing platform supports parallel multi-way speech recognition.
gviz: A visualization tool for simulation data developed by the University of Leeds. After encapsulating it as a grid service and deploying it in both CROWN and the White Rose Grid in the UK, a large number of users can use this function simultaneously.
DSSR: The Digital Sky Survey Retrieval application is a tool for searching and retrieving a whole star graph according to certain regional parameters. It helps astronomical researchers carry out further observation, analysis, and study in an efficient way.
3. S-Club: Resource Organization in Static Environments
Early computational grids employed a layered architecture with an hourglass model [8]. Recently, with the evolution of Web services, the service-oriented architecture has become a significant trend in grid computing, with OGSA/WSRF as the de facto standards [9, 10]. CROWN has adopted the service-oriented architecture, connecting a large number of services deployed in universities and institutes.
Many efforts have been made to improve information services for grid computing. Some centralized approaches, such as MDS-1 [7], Hawkeye [1] and R-GMA [2], were designed for small-scale grid systems; in these approaches, a single centralized server provides the information service. To provide information services for large-scale grid systems, several solutions, such as MDS-2 [6] and GAIS [15], have adopted a hierarchical architecture to organize GISs. To search for services over the GISs, however, we have to traverse part of the hierarchical architecture, which can cause long latency as well as heavy search traffic. CROWN employs a service club mechanism, called S-Club, for efficient resource organization and service discovery.
3.1 Design Rationale
The basic idea of S-Club is to build an overlay on the existing GIS mesh network. In such an overlay, GISs providing the same type of service are organized into a service club. An example of such a club overlay is shown in Fig. 3, where nodes C, D, E, and G form a club. A search request can be forwarded to the corresponding club first, so that search response time and overhead are reduced if the desired result is available in the club.
Fig 3: An example of service club
Fig 4: Average response time vs. simulation time (MDS-like and S-Club, p = 0.2, 0.4, 0.8)
Intuitively, setting up a club requires information exchange, and clubs need to be maintained dynamically because new GISs may join and some existing GISs may leave. It is also possible that some types of services become less popular after a club is built. Therefore, we have to be careful about the trade-off between the potential benefit and the cost incurred. In general, the usage of services is not uniformly distributed: some types of services can be very popular while others are not. When and how clubs are constructed and destroyed are the key issues in the S-Club scheme. We assume that any search request is first sent to a GIS close to the user. On receiving a search request for a specific service type, the GIS checks locally whether a club exists for this type. If so, the GIS forwards the request to the club, and it is flooded within the club only. If there is no club for this type, the GIS floods the request throughout the mesh network. When a new GIS joins the GIS network, it does not know what clubs exist. But since it has at least one neighbour in the underlying mesh network, it can ask one of its neighbours for information about existing clubs; that is, it simply copies the club information from its neighbour. For a more detailed discussion, please refer to [13].
3.2 Performance
To evaluate S-Club, we use the following three performance metrics: average response time, total traffic overhead, and optimization ratio.
Average Response Time of a query is one of the parameters that concern end users. We define the response time of a query as the period from when the query is received by the first GIS node until that node has collected all the response results from other GISs and met the end user's demands.
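The forwarding rule just described can be captured in a few lines. This is a minimal sketch; the dictionary-based club table and function names are assumptions for illustration, not CROWN's actual API.

```python
def sclub_search(clubs, mesh, service_type):
    """Return the set of GISs a request for `service_type` is flooded to.

    clubs: mapping from service type to the set of GISs in that club.
    mesh:  the set of all GISs in the underlying mesh network.
    """
    if service_type in clubs:
        # A club exists for this type: flood within the club only.
        return clubs[service_type]
    # No club for this type: flood the whole mesh network.
    return mesh

# Hypothetical topology matching Fig. 3: nodes C, D, E, G form a club.
clubs = {"gene-cmp": {"C", "D", "E", "G"}}
mesh = {"A", "B", "C", "D", "E", "F", "G"}
```

With this table, a request for `"gene-cmp"` reaches only the four club members, while a request for a type with no club falls back to flooding all seven GISs.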
Total Traffic Overhead of a query is defined as the length of the messages sent among GISs to answer the query. If S-Club is adopted, the traffic overhead of creating, maintaining, and deleting clubs is also included. We have traced the real messages in the CROWN Grid, in which a GIS is wrapped as an OGSA grid service using SOAP messages. Including the SOAP message headers, the size of all related messages ranges from 150 bytes to 300 bytes.
Optimization Ratio shows the benefit of the S-Club mechanism in comparison with an MDS-like system. The optimization ratio of average response time is defined as
OptimizationRatio = (ART_M - ART_S) / ART_M
where ART_M is the average response time of the MDS-like system and ART_S is the average response time of the S-Club system.
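As a quick sanity check, the ratio can be computed directly from the two measured response times:

```python
def optimization_ratio(art_mds, art_sclub):
    """OptimizationRatio = (ART_M - ART_S) / ART_M.

    art_mds:   average response time of the MDS-like system (ART_M).
    art_sclub: average response time of the S-Club system (ART_S).
    """
    return (art_mds - art_sclub) / art_mds

# Hypothetical numbers: 800 ms under MDS-like vs. 600 ms under S-Club
# yields an optimization ratio of 0.25.
```

A positive ratio means S-Club responded faster; a ratio of 0 means no improvement.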
Fig 5: Total traffic vs. % of requested services
Fig 6: Average response time vs. % of requested services
Fig 7: Total traffic vs. number of requests
Fig 8: Response time vs. number of requests
We compare S-Club with MDS-like [6] systems. In Fig. 4, the parameter p stands for the percentage of requested services. We see that as time elapses, the average response time of the MDS-like systems stays almost constant, while the average response time of the S-Club systems decreases gradually and finally reaches a stable state. Figures 5 and 6 plot the benefit the S-Club mechanism brings as the parameter p changes, with a GIS network of 400 nodes. As p increases, total traffic and average response time increase together. We find that S-Club reduces the total traffic by 46% and the average response time by around 7-26%. The experimental results in Fig. 7 and Fig. 8 show that S-Club reduces total traffic and average response time significantly. For example, when n = 200 and the total number of requests is 600,000, S-Club reduces traffic by 55% and average response time by 27%. Compared with the previous approaches, S-Club adopts a fully decentralized architecture with an unstructured topology. Each GIS keeps the information of the services registered to it. To improve performance, the overlay is constructed dynamically and may change constantly with the self-organization of service clubs.
4. Resource Organization in Dynamic Environments
In S-Club, we assume the grid nodes are not selfish and are willing to share their resources with others.
To tackle the free-riding problem, CROWN combines soft incentive [16, 5, 15] and hard incentive [1, 4, 19] schemes and proposes a trust-incentive framework.
Fig 9: S-Club based two-tier organization
4.1 Basic Idea
We borrow the principles of market pricing when constructing the trust and incentive model. The primary goals are securing shared resources, promoting users to share valuable resources, maintaining the balance of supply and demand in a competitive grid resource market, and finally maximizing aggregate resource utilization. As illustrated in Fig. 9, distributed resource management in the CROWN Grid adopts a two-tier peer-to-peer architecture [18]. The super layer is the backbone consisting of S-Clubs; the child layer includes all clients and resource providers. Nodes may join and leave the collaboration dynamically, or transfer from one club to another. Thus, the super-layer nodes maintain the trust records of the child nodes. In this approach, the more resources a node provides, the higher the degree of trust it has. Obviously, every node is willing to cooperate with a node with a higher trust value.
The allocation mechanism is as follows. In each bidding period, we calculate the per-unit valuation bd_j = V_j / (Q_j × et_j) for each bid in Club_j, where V_j, Q_j, and et_j denote the bidding price, node number, and estimated execution time, respectively. We then use maxbd to denote max{bd_j^1, ..., bd_j^k}. We calculate the evaluation value of each bid by the formula
evlbd_j = α × (bd_j / maxbd) + (1 − α) × trust_cd
where trust_cd is the trust value from Club_j to the child node cd, and α ∈ [0, 1] is the risk degree of Club_j according to the benefit and security factors. If Club_j focuses more on benefit than on security, α is set larger; otherwise, α is set smaller. We sort all bids in descending order of evlbd_j, then go through the sorted bid list and evaluate each single bid. If the resource request of a bid can be fulfilled with the remaining node resources, we allocate the resources to the bid.
4.2 Performance
We conduct an experiment with a set of five peers with varying currency and trust values, bidding for some portion of 200 units of resources.
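The bid-evaluation step can be sketched as follows. The function and variable names are our own; this illustrates the formula above, not CROWN's actual code.

```python
def evaluate_bids(bids, trust, alpha):
    """Rank bids by evlbd_j = alpha * (bd_j / maxbd) + (1 - alpha) * trust_cd.

    bids:  mapping node -> (V, Q, et): bidding price, node number,
           and estimated execution time.
    trust: mapping node -> trust value in [0, 1].
    alpha: risk degree in [0, 1]; larger favors price over trust.
    Returns node names in descending order of evaluation value.
    """
    # Per-unit valuation bd_j = V_j / (Q_j * et_j) for each bid.
    perunit = {n: v / (q * et) for n, (v, q, et) in bids.items()}
    maxbd = max(perunit.values())
    # Evaluation value combines normalized price and trust.
    evl = {n: alpha * bd / maxbd + (1 - alpha) * trust[n]
           for n, bd in perunit.items()}
    return sorted(evl, key=evl.get, reverse=True)
```

For example, with α = 0.2 a low-trust node needs a much higher per-unit price to outrank a high-trust one, matching the trust-incentive behavior discussed in Section 4.2.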
In the experiment, the amount of resources that each bid can obtain is determined by TotalResources × evlbd_j / Σ evlbd, where evlbd_j is the evaluated value of the bid. We adopt this distribution policy to protect light users against starvation by heavy users when demand exceeds supply. We define a metric named allocation_ratio for each bid as follows:
allocation_ratio = allocation_quantity / total_request_quantity
In this experiment, we assume that the total request quantity of each node is 100 units. There are five curves in Figures 10 and 11, where the x-axis represents the bid rounds and the y-axis represents the allocation ratio. We first set the risk degree α = 1 in Fig. 10; we then consider the security factor and set the risk degree α = 0.2 in Fig. 11.
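The proportional-share policy and the allocation_ratio metric can be sketched as below. The names are hypothetical, and the cap at each node's requested quantity is our reading of the starvation-protection intent described above.

```python
def allocate(total, evl, requests):
    """Distribute `total` units in proportion to evaluation values.

    evl:      mapping node -> evaluated bid value (evlbd_j).
    requests: mapping node -> total_request_quantity.
    Returns mapping node -> allocation_ratio
            (= allocation_quantity / total_request_quantity).
    """
    s = sum(evl.values())
    # Proportional share, capped at what the node actually asked for.
    alloc = {n: min(requests[n], total * e / s) for n, e in evl.items()}
    return {n: alloc[n] / requests[n] for n in alloc}
```

When supply meets demand the ratios are all 1.0; once the bids outnumber the supply, a node's ratio scales with its evaluation value, as in Figures 10 and 11.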
Fig 10: Incentive compatible allocation (allocation ratio vs. bid rounds; S = 90%, 30%, 60%, 30%, 90%)
Fig 11: Trust incentive allocation (allocation ratio vs. bid rounds; S = 90%/T = 0.2, S = 30%/T = 0.8, S = 60%/T = 0.6, S = 30%/T = 0.1, S = 90%/T = 0.9)
The symbol S in both figures denotes the ratio of provided resources; for example, the first node provides 90% of its local resources to other nodes. The symbol T in Fig. 11 denotes the trust value of the bidding nodes. In the first three periods, only two nodes request 100 units of resources and the supply meets the demand, so the allocation ratios for the two nodes are both 100%. After the third period, the increasing bids outnumber the supply. As shown in Fig. 10, without the security consideration, nodes providing the same resources obtain nearly the same allocation ratio, and the more resources a node contributes, the higher the allocation ratio it obtains. However, when the trust value is considered in Fig. 11, we can see that the provision ratio of node 3 (60%) is less than that of node 1 (90%), but node 3 has a higher trust value (T = 0.6) and thus obtains more resources than node 1. Similarly, node 2 and node 4 have the same provision ratio of 30%; node 2 obtains more resources than node 4 at the sixth bid in Fig. 11, because the trust value of node 2 (T = 0.8) is far greater than that of node 4 (T = 0.1). Thus, by evaluating the allocation scheme, we see that peers can obtain more resources only by sharing more resources and accumulating a higher degree of trust.
5. Conclusion
To maximize resource utilization as well as provide trust management in grid computing environments, we have discussed two resource organization and management mechanisms adopted in the CROWN Grid. By building a service-club-based overlay on the existing mesh network of GISs, we improve resource locating efficiency. By introducing the pricing strategy, we encourage nodes to share more resources, ensure the balance of supply and demand, and enhance the resource utilization of the grid environment.
We show the effectiveness of our approach through experiments in the CROWN [14] system in our lab.
6. Acknowledgements
This work is partially supported by China National Science Foundation Project No. 9141211, and by the China Ministry of Science and Technology under Contracts 2005CB3218 and 2005AA1191. We would also like to thank Dr. Chunming Hu, Yu Zhang, Hailong Sun, and Zongxia Du at Beihang University for their help on this work.
7. References
[1] A. AuYoung, B. N. Chun, A. C. Snoeren, and A. Vahdat, "Resource Allocation in Federated Distributed Computing Infrastructures," in Proceedings of the 1st Workshop on Operating System and Architectural Support for the On-demand IT InfraStructure, 2004.
[2] R. Buyya, D. Abramson, and S. Venugopal, "The Grid Economy," Special Issue on Grid Computing, Proceedings of the IEEE, Manish Parashar and Craig Lee (editors), vol. 93, pp. 698-714, 2005.
[3] S. J. Chapin, D. Katramatos, J. Karpovich, and A. Grimshaw, "Resource Management in Legion," in Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing, in conjunction with the International Parallel and Distributed Processing Symposium, 1999.
[4] B. N. Chun, J. A. C. Ng, D. C. Parkes, and A. Vahdat, "Computational Resource Exchanges for Distributed Resource Allocation," 2004.
[5] M. Feldman, K. Lai, I. Stoica, and J. Chuang, "Robust Incentive Techniques for Peer-to-Peer Networks," in Proceedings of the ACM E-Commerce Conference (EC'04), 2004.
[6] S. Fitzgerald, I. Foster, C. Kesselman, G. v. Laszewski, W. Smith, and S. Tuecke, "A Directory Service for Configuring High-Performance Distributed Computations," in Proceedings of the 6th IEEE Symposium on High-Performance Distributed Computing, 1997, pp. 365-375.
[7] I. Foster and C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit," International Journal of Supercomputer Applications, vol. 11, 1997, pp. 115-129.
[8] I. Foster, C. Kesselman, and S. Tuecke, "The Anatomy of the Grid," International Journal of Supercomputer Applications, 2001.
[9] I. Foster, C. Kesselman, J. Nick, and S. Tuecke, "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration," http://www.globus.org/research/papers/ogsa.pdf
[10] I. Foster, C. Kesselman, J. Nick, and S. Tuecke, "Grid Services for Distributed System Integration," IEEE Computer Magazine, vol. 35, 2002, pp. 37-46.
[11] I. Foster and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, 2003.
[12] J. Frey and T. Tannenbaum, "Condor-G: A Computation Management Agent for Multi-Institutional Grids," Journal of Cluster Computing, vol. 5, pp. 237, 2002.
[13] C. Hu, Y. Zhu, J. Huai, Y. Liu, and L. M. Ni, "Efficient Information Service Management Using Service Club in CROWN Grid," in Proceedings of the IEEE International Conference on Services Computing, 2005.
[14] J. Huai, Y. Zhang, X. Li, and Y. Liu, "Distributed Access Control in CROWN Groups," in Proceedings of the International Conference on Parallel Processing (ICPP), 2005.
[15] T. B. Ma, S. C. M. Lee, J. C. S. Lui, and D. K. Y. Yau, "A Game Theoretic Approach to Provide Incentive and Service Differentiation in P2P Networks," in Proceedings of ACM SIGMETRICS/PERFORMANCE, 2004.
[16] K. Ranganathan, M. Ripeanu, A. Sarin, and I. Foster, "To Share or Not to Share: An Analysis of Incentives to Contribute in Collaborative File Sharing Environments," in Proceedings of the Workshop on Economics of Peer-to-Peer Systems, 2003.
[17] R. Ranjan, A. Harwood, and R. Buyya, "Grid Federation: An Economy Based, Scalable Distributed Resource Management System for Large-Scale Resource Coupling," Grid Computing and Distributed Systems Laboratory, University of Melbourne, Australia, 2004.
[18] L. Xiao, Z. Zhuang, and Y. Liu, "Dynamic Layer Management in Super-peer Architectures," IEEE Transactions on Parallel and Distributed Systems (TPDS), 2005.
[19] C. S. Yeo and R. Buyya, "Pricing for Utility-driven Resource Management and Allocation in Clusters," in Proceedings of the 12th International Conference on Advanced Computing and Communication, 2004.