International Journal of Grid Distribution Computing
Vol.8, No.1 (2015), pp.25-32
http://dx.doi.org/10.14257/ijgdc.2015.8.1.03

A Novel Architecture Design of Large-Scale Distributed Object Storage System

Shan Ying 1 and YAO Nan-min 2
1 College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China; 2 College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
hanyngc@163.com

Abstract

A novel architecture design of a large-scale distributed object storage system (called DOSS) is proposed. Our design takes into consideration several aspects affecting the overall performance of DOSS, including an improved interaction model based on the traditional interactive mode of object-based storage systems, an MDS (Metadata Server) management scheme, and a load balancing scheme that combines the definition of the maximum load for an OSD (Object Storage Device), a hierarchical model and re-polling in order to solve the problem of access to hot objects. Our experimental results show that the architecture is an effective way to improve the performance of DOSS.

Keywords: distributed object storage system; MDS management; model of data interaction; load balancing

1. Introduction

An object storage system separates the data path, control path and management path in order to provide scalability, high performance, security and cross-platform data sharing for storage services effectively [1~3]. With the expanding scale of DOSS, the diversity of user requirements, growing system complexity and the need for different levels of service, architecture design becomes more and more important for improving the performance of DOSS. As we know, architecture design includes many aspects, and each design decision about structure and level affects the overall performance to some extent. Our design mainly takes the following aspects into consideration. Firstly, an improved model of interaction is proposed based on the traditional interactive mode of object-based storage systems [4, 5]. Secondly, a metadata management scheme is introduced which takes full account of master metadata selection from the viewpoint of both local and global performance improvement.
Thirdly, a load balancing scheme is proposed that combines the definition of the maximum load for an OSD, a hierarchical model and re-polling in order to solve the problem of access to hot objects. Our interaction model of object-based storage and our MDS management scheme are described in [6, 7].

2. Main Idea and Problem Formulation

2.1. DOSS Architecture Model

The architecture model we propose for distributed object storage is shown in Figure 1. The primary parts, the relations between the components and the implementation details are described as follows. In this figure, there are three main components: (1) Client; (2) Storage server; (3) MDS. Local MDSs compose an LRG, and multiple LRGs compose a WLRG. Each LRG connects to the storage servers for object storage.

ISSN: 2005-4262 IJGDC Copyright © 2015 SERSC
Figure 1. Architecture Model for DOSS

Figure 2. Interactive Model of DOSS

2.2. Interactive Mode of Data for DOSS

The interaction model of DOSS is shown in Figure 2. The interaction model has three paths: control path, data path and management path [8~13]. The three paths are separated. The interactive process of DOSS is described as follows:

1. The client sends a request to the MDS only once, and then waits to receive data.
2. The MDS is no longer a transfer station, but a functional entity.
3. Because authorization and authentication of clients are accomplished by the MDS, the OSD trusts the MDS, which means the OSD does not authenticate the requests and commands from the MDS.
4. The OSD sends the response data to the client after accomplishing the OSD storage operation.
5. Throughout the whole process, the MDS acts as the manager of OSD operations.

2.3. MDS Management Scheme

Figure 3. The Topology Structure of Local Group and Global Group

Figure 4. Layer Model

In DOSS, clients from different regions send requests to the most appropriate MDS. MDSs in different regions are jointly responsible for all kinds of metadata operations, and the performance of r/w operations depends heavily on the topology of the MDSs across geographical regions. In our scheme, the MDSs distributed in different regions are first partitioned and then divided into groups. To simplify the discussion of the topology in the following sections, each MDS is considered as a node. When a node in a group communicates with another node, a topology tree is constructed, and the primary node of the local group, that is, the group containing the requesting node, is treated as the root of the tree. Figure 3 shows the topology composed of WLRG and SRG.
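The topology tree described above can be illustrated with a minimal sketch. The paper does not state how the tree is built, so breadth-first search from the root is an assumption made here for illustration only; the function name `topology_tree`, the `links` adjacency map and the node labels are likewise hypothetical.

```python
from collections import deque


def topology_tree(links, root):
    """Build a spanning tree over MDS nodes, rooted at the primary node.

    links: dict mapping each node to an iterable of directly reachable nodes
           (hypothetical representation of the MDS group topology).
    root:  the primary node of the local group.
    Returns a dict mapping every reachable node to its parent (root -> None).
    Assumption: the tree is built breadth-first; the paper does not say how.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, ()):
            if neighbour not in parent:      # first visit fixes the parent
                parent[neighbour] = node
                queue.append(neighbour)
    return parent


# Hypothetical four-node group; N0 plays the role of the primary node.
example_links = {
    "N0": ["N1", "N2", "N3"],
    "N1": ["N0", "N2"],
    "N2": ["N0", "N1", "N3"],
    "N3": ["N0", "N2"],
}
example_tree = topology_tree(example_links, "N0")
```

With breadth-first construction each node is attached along a path with the fewest hops from the root, which is one plausible reading of treating the primary node as the root; a latency-weighted tree would be an equally valid alternative.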
The SRG requested by the client is called the LSRG, and the other SRGs in the WLRG are called FSRGs. The nodes in an SRG are divided into two categories: Master Node (MN) and Secondary Node (SN). The node requested by the client at a certain time becomes the MN; an SN substitutes for the MN when the topology changes or certain operations occur. Since there are two types of SRG, LSRG and FSRG, the MN and SN in an LSRG are called LMN and
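LSN, and the MN and SN in an FSRG are called FMN and FSN. The problem of virtual center node selection is treated as a problem similar to the k-center problem [14, 15].

2.4. Load Balancing Scheme for DOSS

In DOSS, clients from different regions send requests to popular objects, which may increase object heat and cause load imbalance. Our load balancing scheme may be an effective way to solve this issue. The definitions associated with our load balancing strategy are as follows:

Definition 1: The set of objects is $OB = \{ob_1, \ldots, ob_i, \ldots, ob_m\}$, where $m$ is the total number of objects, $ob_i \in OB$, and the objects are ordered from $ob_1$ to $ob_m$. Let the set of storage nodes be $D = \{d_1, \ldots, d_j, \ldots, d_n\}$, where $n$ represents the total number of storage nodes.

Definition 2: Let the properties of an object be $ob_i = (oid_i, osz_i)$, where $oid_i$ and $osz_i$ represent the object identity and the object size, respectively.

Definition 3: The disk model is $d_j = (dc_j, dt_j, dl_j, db_j)$, where $dc_j$, $dt_j$, $dl_j$ and $db_j$ represent the disk capacity, transfer rate, current load and disk bandwidth, respectively.

Definition 4: The set of requests is $REQ = \{req_1, \ldots, req_k, \ldots, req_u\}$ with $req_k = (oid_k, rat_k, rst_k, rft_k)$, where $rat_k$, $rst_k$ and $rft_k$ represent the starting time of the request, the starting time of service and the completion time of service, respectively.

Definition 5: The maximum load of each OSD is

$$loadm_j = \frac{db_j}{\sum_{j'=1}^{n} db_{j'}} \sum_{i=1}^{m} h_i \qquad (1)$$

where $db_j$, $\sum_{j'=1}^{n} db_{j'}$ and $h_i$ represent the bandwidth of $d_j$, the total bandwidth of the disks and the heat of object $ob_i$, respectively.

Definition 6: The average response time of requests is

$$MRT = mrt(REQ) = \frac{1}{u} \sum_{k=1}^{u} rp_k \qquad (2)$$

where $u = |REQ|$ and $rp_k$ is the response time of request $req_k$.

The derivation of Definitions 5 and 6 is described as follows. Let object request arrivals obey a Poisson distribution with arrival rate $\lambda$, while object sizes obey a Zipfian distribution. Let the request probability of $ob_i$ be $p_i$, where the $p_i$ follow Zipf's law and $\sum_{i=1}^{m} p_i = 1$ [16, 17]. The request arrival rate of $ob_i$ is $\lambda_i = \lambda p_i$. The derivation also uses the expected service time $E[\cdot]$ over the objects $ob_i \in OB$, the bandwidth $db_j$ of $d_j$ expressed through the request bandwidth $obw_i$ of each $ob_i \in OB$, the number $n$ of service requests at the same intensity on an OSD, and the average waiting time $ewt$ of the queue service. Since the distribution of accesses to objects is assumed to be Poisson, the heat of object $ob_i$ is $h_i = \lambda_i \cdot osrv_i$, where $osrv_i$ is the fixed service time of $ob_i$, $osrv_i = osz_i / dt_j$.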
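To make the quantities in Definitions 5 and 6 concrete, the following is a small sketch under stated assumptions. It reads formula (1) as sharing the total object heat among the OSDs in proportion to their bandwidth and formula (2) as the arithmetic mean of the response times. The class and function names (`StoredObject`, `heat`, `max_loads`, `mean_response_time`) and all numbers are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    oid: int          # object identity
    size: float       # object size (hypothetical unit: MB)
    rate: float       # request arrival rate of this object (requests/s)


def heat(ob, transfer_rate):
    """Heat of an object: arrival rate times fixed service time,
    where service time = object size / disk transfer rate."""
    service_time = ob.size / transfer_rate
    return ob.rate * service_time


def max_loads(bandwidths, heats):
    """Maximum load per OSD as read from formula (1): each OSD gets a share
    of the total heat proportional to its share of the total bandwidth."""
    total_bandwidth = sum(bandwidths)
    total_heat = sum(heats)
    return [bw / total_bandwidth * total_heat for bw in bandwidths]


def mean_response_time(response_times):
    """Formula (2): mean of the per-request response times."""
    return sum(response_times) / len(response_times)


# Hypothetical data: two objects, one transfer rate, two OSD bandwidths.
objects = [StoredObject(1, 100.0, 0.2), StoredObject(2, 50.0, 0.4)]
heats = [heat(ob, transfer_rate=100.0) for ob in objects]
loads = max_loads([100.0, 300.0], heats)
mrt = mean_response_time([2.0, 4.0, 6.0])
```

In this example both objects end up with the same heat even though their sizes differ, because the smaller object is requested twice as often; this is the reason the scheme considers heat and size together rather than size alone.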
Lee [18] sorts the requested objects by service time in order to minimize the difference between the disk load and the service time for each disk. The service-time formula shows that, given the differing transfer rates of the storage devices, service time is proportional to the size of the object, so we combine object
heat with object size and arrange the objects in order of size, in order to achieve the minimum service time [19~22]. This strategy makes it possible to grade the different OSDs in terms of their capability.

The hierarchical model is shown in Figure 4. In this scenario, the MDS sends objects to a local queue of the OSD under the FCFS rule. The logic OSD is logically divided into two levels, the distribution layer and the polling layer. In the distribution layer, the OSD layer is considered as a logical layer which has n logical storage devices; the higher the level, the stronger the service capacity. The OSD hierarchy is obtained by logical classification of bandwidth. For example, if the bandwidth of OSD A is n times the bandwidth of OSD B, then OSD A can be divided into n logic bandwidths, and the processing capacity of OSD A is n times that of OSD B [23, 24]. This division into hierarchical levels achieves collaborative service among heterogeneous OSDs and improves overall service performance.

The polling layer assigns the objects to disks reasonably, according to the characteristics of the objects, by way of semi-polling. A polling mode is adopted in which an OSD with a higher level has polling priority; ordinary polling begins once the remaining unpolled level of OSD A equals the level of OSD B. Therefore, this approach assigns objects of high heat to different OSDs according to their service capabilities, and objects of similar size are assigned to OSDs with the same bandwidth in a balanced manner that takes different request levels and different service levels into account. Since objects waiting for allocation are served in chronological order, the hottest documents are assigned to different OSDs in order to ensure load balancing.

Definition 7: The set of logic layers is $L = \{L_1, \ldots, L_k, \ldots, L_v\}$, where $v$ is the number of divided logic layers.

Definition 8: The maximum load of layer $L_k$ of $OSD_j$ is $loadl_k$. The total load of each $OSD_j$ is the sum of its loads on its own layers, $load_j = \sum_{k=1}^{v} load_{jk}$, where $load_{jk}$ is the load on logic layer $L_k$ of $OSD_j$.

The allocation algorithm is described as follows.

Algorithm for the Strategy

program Assignobject
  Call algorithm LogicLevel to divide the logic OSDs;
  Order all objects by service time in ascending order, and perform the following steps for the objects;
  Calculate $loadm_j$ of $OSD_j$ with formula (1);
  If the objects are not fully allocated, then call the function AssigneachLOSD in order to assign objects to appropriate OSDs;
end.

program AssigneachLOSD
  For $k = 1$ to $v$, poll $L_k$
    While $1 \le j \le n$, do
      If $load_{jk}$ is within $loadl_k$ and $load_j$ is within $loadm_j$
        Assign $ob_i$ to $OSD_j$; update $load_j$; advance $i$ and $j$;
end.

program LogicLevel
  Order the OSDs in ascending order of bandwidth; initialize $k = 1$, $v = 0$, $loadl = 0$;
  For $j = 1$ to $n$, scan $OSD_j$ sequentially
    While $db_j$ and $db_{j-1}$ differ, do
      set $loadl_k$ from $db_j$ and $db_{j-1}$, and divide a logical layer of the logical OSD; update $v$ and $k$;
end.

The OSDs are divided into logic OSDs with $v$ layers, from $L_1$ to $L_v$, according to their different bandwidths. The bandwidth on the same layer is the same, which means that the maximum load $loadl_k$ of each layer on the logic OSDs is equal. Logic OSDs at different levels have different functions: the higher the level, the stronger the processing capability, which ensures that objects of different heat are distributed to different OSDs. Meanwhile, OSDs with strong service capacity are polled more often to balance the distribution of heat and size of the objects across the OSDs, thereby minimizing the hot-object access problem. Since the service capacity of the OSDs differs, a lower OSD level has weaker service capacity and is suitable for processing requests with little heat and long service time.

3. Performance Evaluation

In our experiments, we employ simulation to evaluate the proposed architecture. We first introduce the experimental settings. We used NS-2 as the tool to establish the overall architecture of DOSS and disksim4.0 as the tool to realize the OSD nodes. Our experiments follow several premises. Firstly, for accesses to large files, seek and rotational latency delays can be ignored relative to transmission time. Secondly, requests arrive according to a Poisson distribution, and object access obeys a Zipf distribution. Thirdly, the queuing model is M/G/1 and the service rule is FCFS. As for the heterogeneous case, different OSD properties are tested. To make the test results clearer, we use non-identical OSDs to simulate heterogeneous OSDs. We compare our object allocation strategy (OASL for short) with a simple pure-polling object allocation strategy [16, 18] (SOO for short) in order to evaluate the impact of the two distribution strategies on system performance over time. Our tests do not consider different batches of the request distribution. As the number of requests at the OSDs increases, the performance difference between OSDs changes accordingly.
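As a reference point for the evaluation, the layering and polling procedure of Section 2.4 can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes one logic layer per distinct bandwidth value, treats an OSD as a member of every layer up to its own, caps each OSD at its bandwidth-proportional share of the total heat, and adds a fallback (not in the paper) so that every object is placed. All function names are hypothetical.

```python
def logic_levels(bandwidths):
    """Layer count per OSD. Assumption: one logic layer per distinct
    bandwidth value, so the slowest OSDs span 1 layer, the next 2, etc."""
    distinct = sorted(set(bandwidths))
    return [distinct.index(bw) + 1 for bw in bandwidths]


def max_loads(bandwidths, heats):
    """Bandwidth-proportional share of the total heat for each OSD."""
    total_bw, total_heat = sum(bandwidths), sum(heats)
    return [bw / total_bw * total_heat for bw in bandwidths]


def assign_objects(heats, bandwidths):
    """Poll layer by layer; higher-level OSDs are visited in more layers.

    heats must already be in the desired object order (for example,
    ascending service time). Returns (placement, loads), where
    placement[i] is the OSD index chosen for object i.
    """
    levels = logic_levels(bandwidths)
    top = max(levels)
    caps = max_loads(bandwidths, heats)
    loads = [0.0] * len(bandwidths)
    placement = [None] * len(heats)
    i = 0
    while i < len(heats):
        progressed = False
        for k in range(1, top + 1):              # poll layer L_k
            for j, level in enumerate(levels):   # scan the OSDs in order
                if i == len(heats):
                    break
                fits = loads[j] + heats[i] <= caps[j] + 1e-12
                if level >= k and fits:
                    placement[i] = j
                    loads[j] += heats[i]
                    i += 1
                    progressed = True
        if not progressed:
            # Fallback (assumption): nobody has room left, so use the OSD
            # with the most remaining headroom to guarantee termination.
            j = max(range(len(loads)), key=lambda x: caps[x] - loads[x])
            placement[i] = j
            loads[j] += heats[i]
            i += 1
    return placement, loads


# Hypothetical workload: eight equally hot objects, two OSDs (1x and 2x).
placement, loads = assign_objects([1.0] * 8, [100.0, 200.0])
```

In this toy run the OSD with twice the bandwidth is polled in two layers per round and therefore receives more objects than the slower one, which is the behaviour the layered polling is meant to produce.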
To study the effect of the number of OSDs, we tested how MRT changes when the number of OSDs is 4, 6, 8, 10 and 12. The results in Figure 5 show that the two strategies differ little when the number of OSDs is 4. However, as the number of OSDs increases, MRT is affected less under OASL than under SOO, and this gap gradually widens with the number of OSDs. In our analysis, because OASL takes both object size and disk service capacity into account, it continues to perform better than SOO as the number of OSDs increases. The trend curves in the figure show that MRT grows more slowly under OASL than under SOO, so increasing the number of OSDs has relatively little influence on MRT under OASL. It can be seen that MRT under OASL remains relatively stable as the number of OSDs increases.
Figure 5. MRT of Different Number of OSD

Figure 6. MRT of Different Request Access Rate

MRT is tested under a changing request access rate in Figure 6, with request access rates of 100, 200, 300, 400 and 500 (1/m). The figure shows how MRT under the OASL and SOO strategies is affected as the request arrival rate increases; at the same request arrival rate, the OASL strategy obtains a lower MRT than SOO. We speculate that this is closely related to the distribution of object requests on the one hand and to the layered allocation of OSD bandwidth on the other. It can also be seen from the figure that, as the request arrival rate increases, MRT grows more slowly under OASL than under SOO.

Figure 7. MRT of Maximum Object Size

When objects are distributed by way of polling, relatively large objects significantly affect the performance of the load balancing strategy, so large objects are used to test the performance of the SOO and OASL strategies. The experimental results in Figure 7 show that MRT increases for both SOO and OASL as the maximum object size increases. However, for the same maximum object size, MRT under OASL is always lower than under SOO, and as the maximum object size increases, MRT grows more slowly under OASL than under SOO. When the largest object is relatively large, OASL shows excellent performance.

4. Conclusion

In this paper, we design the architecture of DOSS. Our design takes into consideration several aspects affecting the overall performance of DOSS, including an improved model of interaction, a metadata management scheme and a load balancing scheme. Our experimental results show that the architecture is an effective way to improve performance and ensure load balancing.
In our immediate future work, we will consider a dynamic load balancing scheme, which is known to be a significant aspect of object storage systems for ensuring communication reliability and rational distribution.
Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities No.DL13BBX03 and the Fundamental Research Funds for the Central Universities: HEUCFT1202, HEUCF100609.

References

[1] M. Mesnier, G. R. Ganger and E. Riedel, Object-Based Storage, Communications Magazine, vol. 41, no. 8, (2003), pp. 84-90.
[2] A. Azagury, V. Dreizin, M. Factor, E. Henis, D. Naor, N. Rinetzky, O. Rodeh, J. Satran, A. Tavory and L. Yerushalmi, Towards an object store, Proc. 20th IEEE/11th NASA Goddard Conf. Mass Storage Systems and Technologies, (2003).
[3] M. M. Factor, K. Meth and D. Naor, Object Storage: The Future Building Block for Storage Systems, Proceedings of the 2nd International IEEE Symposium on Mass Storage Systems and Technologies, (2005).
[4] M. Mesnier, G. R. Ganger and E. Riedel, Object-Based Storage pushing more functionality into storage, Communications Magazine, vol. 34, no. 17, (2008), pp. 88-90.
[5] D. H. C. Du, Advancement and Future Challenges of Storage Systems, Proceedings of the IEEE, (2008).
[6] S. Ying, Y. Nanmin and Z. Jianming, Research on Architecture of Object-based Storage System, Journal of Computer Research and Development, vol. 46, (2009), pp. 198-202.
[7] S. Ying and Y. Nanmin, Research on MDS Selection Scheme in Distributed Objected Storage System, Journal of Wuhan University of Technology, vol. 33, no. 7, (2011), pp. 143-146.
[8] D. Feng and H. Lu, Scheduling in Huge Object-based Storage System, Proceedings of Japan-China Joint Workshop on Frontier of Computer Science and Technology (FCST), (2006).
[9] Yu Hua, Y. Zhu, H. Jiang, D. Feng and L. Tian, Scalable and Adaptive Metadata Management in Ultra Large-scale File Systems, Proceedings of the 28th International Conference on Distributed Computing Systems, (2008).
[10] S. A. Brandt, E. L. Miller, D. D. E. Long and L. Xue, Efficient Metadata Management in Large Distributed Storage Systems, Proceedings of the 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies, (2003).
[11] W. Lin, Q. Wei and B.
Veeravalli, A Weight-based Metadata Management Strategy for Petabyte-scale Object Storage Systems, Proceedings of the fourth international workshop on Storage Network Architecture and Parallel, (2007).
[12] Q. Lu, D. Feng and F. Wang, Research on MetaData Server of High Reliability, Computer Engineering, vol. 34, no. 17, (2008), pp. 88-90.
[13] Y. Hua, D. Feng and B. Xiao, TBF: An Efficient Data Architecture for Metadata Server in the Object-based Storage Network, Proceedings of International Conference on Networks (ICON), (2006).
[14] J. Li, Xu Lu and J. Zhu, r-dominating Set Problem and k-Center Problem in Weighted Trees, OR Transactions, vol. 13, no. 2, (2009), pp. 111-118.
[15] S. Khuller and Y. J. Sussmann, The capacitated k-center problem, SIAM Journal on Discrete Mathematics, vol. 13, (2000), pp. 403-418.
[16] T. Xie and Y. Sun, A File Assignment Strategy Independent of Workload Characteristic Assumptions, ACM Transactions on Storage, vol. 5, no. 3, (2009).
[17] Z. Zeng and B. Veeravalli, On the Design of Distributed Object Placement and Load Balancing Strategies in Large-Scale Networked Multimedia Storage Systems, IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 3, (2008), pp. 369-382.
[18] L. W. Lee and P. Scheuermann, File Assignment in Parallel I/O Systems with Minimal Variance of Service Time, Transactions on Computers, vol. 49, no. 2, (2000), pp. 127-140.
[19] P. Scheuermann, G. Weikum and P. Zabback, Data partitioning and load balancing in parallel disk systems, The VLDB Journal, vol. 7, (1998), pp. 48-66.
[20] Z. Gongye, L. Wei and C. Jincai, Hotspot data balancing in OBS, Journal of Huazhong University of Science and Technology (Nature Science Edition), vol. 35, no. 12, (2007), pp. 28-31.
[21] W. Fang, Z. Shunda, F. Dan and Z. Lingfang, Hybrid object allocation policy for object storage system, Journal of Huazhong University of Science and Technology (Nature Science Edition), vol. 35, no. 3, (2007), pp. 46-48.
[22] L. J. Qin, D. Feng, L. F. Zeng and Q. Lu, Dynamic Load Balancing Algorithm in Object-Based Storage System, Computer Science, vol. 33, no. 5, (2006), pp. 88-91.
[23] Z.
Shengli, Method of Data Assignment on Heterogeneous Disk System, Mini-Micro Systems, vol. 25, no. 11, (2004), pp. 1970-1974.
[24] S. A. Weil, K. T. Pollack and S. A. Brandt, Dynamic Metadata Management for Petabyte scale File Systems, Proceedings of the 11th International Conference on High Performance Computing (HPC), (2004); Bangalore, India.
Authors

Shan Ying. Lecturer at the College of Computer Science and Technology, Harbin Engineering University. Born in 1981, she received her Ph.D. degree from the College of Computer Science and Technology, Harbin Engineering University. Her research focuses on network storage and cloud storage.

YAO Nan-min. Professor and Ph.D. supervisor at the College of Computer Science and Technology, Harbin Engineering University. Born in 1974, he is a member of the information storage technology specialty committee of the China Computer Federation. His research focuses on wireless sensor networks and cloud storage systems.