DBA-VM: Dynamic Bandwidth Allocator for Virtual Machines
Ahmed Amamou, Manel Bourguiba, Kamel Haddadou and Guy Pujolle
LIP6, Pierre & Marie Curie University, 4 Place Jussieu, 75005 Paris, France
Gandi SAS, 65 Boulevard Massena, 75013 Paris
Email: {ahmed.amamou, manel.bourguiba, guy.pujolle}@lip6.fr, kamel@gandi.net

Abstract: Cloud computing is an emergent paradigm that allows customers to rent infrastructure, platforms and software as a service. With resource sharing and reuse through virtualization technology, cloud environments become even more cost effective and flexible. Nevertheless, networking within the virtualized cloud still presents challenges in performance and resource allocation. In this paper, we propose DBA-VM, a Dynamic Bandwidth Allocator for Virtual Machines that operates with regard to the established SLAs. The proposed scheme enforces isolation between the virtual machines through transmission bandwidth adjustment at the network I/O channel. The experimental performance evaluation shows that DBA-VM allows the virtualized system to respect each virtual machine's SLA while reducing the global physical resource (CPU and memory) consumption.

I. INTRODUCTION

Cloud computing is a new technology trend that is expected to reshape the information technology landscape. It is a way to deliver software, infrastructure and platforms as a service to remote customers over the Internet. Cloud computing reduces hardware management and software resource costs by shifting the location of the infrastructure to the network. It offers high availability, scalability and cost-effectiveness since it is particularly associated with the provision of computing resources on demand and according to a pay-as-you-use model. These resources are kept on the provider's servers, which are located in various parts of the Internet. Their management is thus shifted from the user to the provider. Cloud computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the data centers that provide those services [1].
The cloud is defined in [2] as a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically reconfigured to adjust to a variable load (scale), also allowing for an optimal resource utilization. This pool of resources is typically exploited through a pay-as-you-use model in which guarantees are offered by the infrastructure provider by means of customized Service Level Agreements (SLAs). This definition introduces virtualization as a key enabling technology for cloud computing. In fact, virtualization basically allows partitioning one physical machine into multiple virtual instances running concurrently and sharing the same physical resources. Advances in system virtualization make infrastructure-as-a-service a compelling paradigm, since it offers cost effectiveness through resource sharing. It also offers flexibility through the ability to migrate virtual machines from one physical machine to another, which helps reduce energy consumption. Furthermore, it enhances the cloud platform's scalability and availability through the instantiation of new isolated virtual instances on demand. Virtualization has been widely studied and deployed in recent years [3]. The Virtual Machine Monitor (VMM), also called hypervisor, is a software layer that presents abstractions of the underlying physical resources to the guest machines. It allows the different virtual machines to share the physical resources, including the network device. Network I/O virtualization is essential to provide connectivity to the virtual machines. However, the current implementations of VMMs do not provide high enough throughputs, especially when the applications running on different virtual machines within the same physical machine are I/O intensive (web services, video servers, etc.) [5][6][7]. Network-intensive applications are among the applications dominating the cloud-based data centers today [9].
Although there are compelling advantages behind virtualizing the cloud computing infrastructure, there are still performance issues that need to be addressed before virtualizing the data centers can be fully advantageous. Indeed, concurrent applications share the available bandwidth equally, and current VMMs only offer a static allocation of the bandwidth. In this paper, we propose an SLA-aware dynamic bandwidth allocator that dynamically manages bandwidth allocation among virtual machines according to the established SLAs. The proposed mechanism allocates the required bandwidth in terms of both bits per second and packets per second while minimizing the global physical resource consumption. The remainder of this paper is organized as follows: Section 2 introduces some related works. We state the problem through the native system evaluation in Section 3. In Section 4 we detail the proposed solution, and its experimental evaluation follows in Section 5. Finally, Section 6 concludes the paper and introduces our future work.

978-1-4673-2713-8/12/$31.00 ©2012 IEEE

II. RELATED WORK

Over the last few years, a fair number of research efforts have been dedicated to the enhancement of the I/O virtualization
technology. In both [5] and [6], the authors conducted extensive measurements to evaluate the performance interference among virtual machines running network I/O workloads that are either CPU or network bound. They show how different resource scheduling and allocation strategies and workloads may impact the performance of a virtualized system. In [10] the author shows that cache and memory architecture, network architecture and virtualization overheads can be scalability bottlenecks in a virtualized cloud platform, depending on whether the application is compute, memory or network I/O intensive respectively. Network performance evaluation of virtual machines was the objective of multiple other works [11] [12]. The transmission and reception throughputs of virtual machines are shown to be very low compared to the performance of Dom0 (the privileged domain). The multiple context switches and the costly I/O communication between the driver domain and the virtual machines through the event channel are behind this drastic performance degradation. A deep analysis of the network I/O operations within Xen in [8] shows that the grant mechanism incurs significant overhead when performing network I/O operations. This overhead is mostly due to grant hypercalls and to the high cost of page mapping/unmapping. For this reason, the authors proposed several optimizations to the memory sharing mechanism implemented in Xen. They improved cache locality by moving the grant copy operation from the driver domain to the guest. Besides, they proposed to relax the memory isolation property to reduce the number of grant operations performed. In this case, performance would come at the cost of isolation, one of the most attractive benefits of the Xen architecture. In [13], the authors proposed a new design for the memory sharing mechanism within Xen which completes the mechanism presented in [8]. The basic idea of the new mechanism is to enable the guest domains to unilaterally issue and revoke a grant.
This allows the guest domains to protect their memory from incorrect Direct Memory Access (DMA) operations. Beyond the memory sharing mechanism, the authors of [14] proposed to optimize the interrupt delivery route and shorten the network I/O path. In [10], the author shows that the out-of-the-box network bandwidth to another host is only 71% and 45% of non-virtualized performance for transmit and receive workloads, respectively. These bottlenecks are present even on a test system massively over-provisioned in both memory and computation resources. Similar restrictions are also evident in commercial clouds provided by Amazon [19], showing that even after much research effort, I/O virtualization bottlenecks still challenge the designers of modern systems [20].

III. BACKGROUND AND PROBLEM STATEMENT

A. Virtualized cloud environment

A cloud platform basically consists of multiple data centers connected through a WAN and a web portal. A data center is composed of multiple physical nodes connected through a LAN. Inside the data center, the infrastructure can be virtualized, in which case each physical machine supports multiple isolated virtual machines. Different applications (game servers, media servers, etc.) run over these virtual machines, and users have direct access to those applications through the web portal. These virtual machines share the same hardware and storage, and can be migrated from one physical machine to another in the same data center or even in a remote data center. The VMM ensures physical resource sharing (CPU, memory, etc.) and provides isolated shared access to the devices through a special virtual machine called the driver domain (Figure 1).

Fig. 1. Driver domain based I/O virtualization model.

The driver domain hosts the devices' physical drivers and is responsible for protecting the I/O access as well as transferring the traffic to the appropriate virtual machine. With the driver domain I/O model, all the virtual machines share the same network interface, and the driver domain demultiplexes incoming and multiplexes outgoing traffic.
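The demultiplexing role of the driver domain can be pictured with a minimal sketch; the mapping structure and names below are ours, for illustration only (real implementations dispatch frames by destination MAC address at a software bridge):

```python
# Illustrative sketch: the driver domain dispatches each incoming Ethernet
# frame to the virtual interface whose MAC matches the frame's destination.
# Structure and names are hypothetical, not taken from the paper.

def demultiplex(frame: bytes, vif_by_mac: dict) -> str:
    """Return the vif that should receive this frame, or 'drop'."""
    dst_mac = frame[0:6].hex(":")  # destination MAC = first 6 bytes
    return vif_by_mac.get(dst_mac, "drop")

vifs = {"00:16:3e:00:00:01": "vif1.0", "00:16:3e:00:00:02": "vif2.0"}
frame = bytes.fromhex("00163e000002") + b"\x00" * 58  # 64-byte frame to VM2
print(demultiplex(frame, vifs))  # vif2.0
```

Outgoing traffic follows the reverse logic: frames from all vifs are multiplexed onto the single physical interface.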
A great level of transparency is hence reached, since the guest machines do not have to implement the potentially buggy device drivers. Besides, since all the traffic goes through the driver domain, the latter enjoys more traffic monitoring abilities, like admission control or priority establishment between the flows with regard to their types. However, this model's performance experiences limitations due to the overhead incurred by the communication between the driver domain and the guests. We further analyze this limitation in the next section.

B. Xen network I/O architecture

Xen [4] is a popular open source VMM for the x86 architecture. Xen relies on the driver domain to host device drivers and to ensure shared access to the network device among the guest machines [15]. In a Xen environment, the driver domain hosts the physical device drivers. Each guest machine is associated with one or more virtual interfaces (vif) that are connected via a bridge to the physical interface. A vif is split into the netback (in the driver domain) and the netfront (in each guest machine). Shared memory pages are used to transfer the packets between the driver domain and the guests. Network transmissions and receptions are achieved as illustrated by Figure 1. As soon as a packet is sent by the upper layers, it is relayed to the netfront; the latter copies the packet to the shared memory pages and notifies the netback of the arrival of the packet. When the driver domain is scheduled, the netback sees the notification, fetches the packet from the shared memory page and relays it to the bridge. The bridge relays the packet to the device driver, which
transmits it to the network device. Incoming packets follow the opposite path.

C. Problem statement

In a virtualized cloud, multiple virtual machines are dedicated to different types of applications while sharing the same physical machine and network device. The sum of the rates at which the virtual machines transmit thus cannot exceed the physical Network Interface Card (NIC) bandwidth. Some applications, like video streaming servers, are required to sustain an acceptable throughput so that the contract with the customer can be respected. The video server thus requires a bandwidth that may not be guaranteed in the presence of concurrent flows. In a native virtualized system, the virtual machines share the available bandwidth equally. Instantiating a new virtual machine may then compromise the QoS required by already running applications. Native Xen only offers a tool to statically set a cap on the bandwidth that a virtual machine can enjoy, and the whole system needs to be restarted after each reconfiguration. First, we show through experimental evaluation how the native Xen system is unable to respect the bandwidth allocation specified in the SLAs.

1) Experimental Setup: The system that we are using is a Dell PowerEdge 2950 server, with two 2.94 GHz Intel quad-core CPUs. Pairs of cores share the same L2 cache, and all 8 cores share the same main DDR2 667 MHz memory. Networking is handled by one quad-gigabit card using a PCIe x4 channel. As a hypervisor, we use Xen 3.4.0 in paravirtualization mode. We instantiate a driver domain and three guest machines, VM1, VM2 and VM3, for traffic transmission. The driver domain is allocated four cores and each guest virtual machine is allocated only one core. As the traffic sink, we used one NEC machine with a 2.4 GHz Core 2 Duo processor, 1 GB of DDR2 667 MHz memory and a 1 Gb/s NIC. We used Iperf for the traffic transmission. VM1, VM2 and VM3 are characterized by the SLAs SLA1, SLA2 and SLA3 respectively, as follows: VM1 and VM3 require a bandwidth of only 150 Mb/s while VM2 requires 700 Mb/s. Each virtual machine is connected to one virtual interface.
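For illustration, the per-VM SLA bandwidth guarantees of this setup can be written down as a small table; the encoding and field names are hypothetical, but the values come from the setup above and sum exactly to the 1 Gb/s link:

```python
# Hypothetical encoding of the three experimental SLAs (values from the
# setup: VM1/VM3 at 150 Mb/s, VM2 at 700 Mb/s, over a 1 Gb/s NIC).
slas = {
    "VM1": {"bw_mbps": 150},
    "VM2": {"bw_mbps": 700},
    "VM3": {"bw_mbps": 150},
}
link_mbps = 1000  # gigabit physical interface

total = sum(s["bw_mbps"] for s in slas.values())
print(total, link_mbps - total)  # 1000 0: the guarantees fill the link
```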
The three virtual machines send traffic at a rate of 1 Gb/s. We consider the following two scenarios: in the first scenario, the three virtual machines send packets of 1500 bytes; in the second scenario, VM1 and VM3 send packets of 64 bytes while VM2 sends packets of 1500 bytes.

2) Experimental Results: Figure 2 shows how the native system is unable to guarantee the required bandwidth to each virtual machine. Indeed, the three virtual machines share the link bandwidth equally and transmit at 330 Mb/s each. One might then imagine that using a traffic shaping method such as traffic control (TC), deployed in the driver domain, would resolve the problem. This is indeed true in the first scenario, when the virtual machines transmit large packets of 1500 bytes. However, in the second scenario, we notice that neither the native system nor the native system with TC respects the established SLAs.

Fig. 2. Transmission throughput: (a) scenario 1; (b) scenario 2.

In fact, no virtual machine is able to transmit at the required throughput: VM1 and VM3 require 150 Mb/s each, but they are only able to achieve 50 Mb/s. VM2 sees its throughput limited to 520 Mb/s while it requires 700 Mb/s. This is due to the fact that the transmission is limited to 190 Kp/s with 64-byte-sized packets, as shown by Figure 3. A virtual machine is then unable to transmit 64-byte-sized packets at more than 50 Mb/s. The transmission capacity is even worse with traffic control (TC), since the latter rejects packets in the driver domain after they have been transferred from the virtual machine through the shared memory. This leads us to evaluate the system's consumption in terms of CPU cycles and memory transactions in order to determine the system bottleneck, which is behind the transmission throughput limitation of 190 Kp/s. We thus consider the two main physical components of the system: the CPU and the memory.
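The packets per second ceiling directly bounds the bit rate achievable with small packets; a quick back-of-the-envelope check (illustrative arithmetic, not the authors' code):

```python
def throughput_mbps(pps: float, packet_bytes: int) -> float:
    """Bit rate (Mb/s) implied by a packet rate at a given packet size."""
    return pps * packet_bytes * 8 / 1e6

# At the observed ~190 Kp/s system ceiling, 64-byte packets carry under
# 100 Mb/s in aggregate, far below the SLA guarantees:
print(throughput_mbps(190e3, 64))    # 97.28
# With 1500-byte packets the same packet rate would exceed the 1 Gb/s
# link, so large-packet traffic is bandwidth-bound, not pps-bound:
print(throughput_mbps(190e3, 1500))  # 2280.0
```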
We aim to determine the component which has reached its maximum capacity when the virtual machine transmits at the maximum throughput of 190 Kp/s. For each component, we profile its usage using Xenoprof [21] in order to determine the effectively used capacity of the component. Then, we compare the upper-bound capacity per transmitted packet with the effectively consumed capacity per packet. Figure 4 shows that at a throughput of 190 Kp/s, the system has consumed all the available memory transactions while there still are available CPU cycles. We conclude that the memory is the physical bottleneck of the system. In current VMM implementations, when one virtual machine transmits at a rate exceeding the available bandwidth, the driver domain drops the packets (in the netback). Packets are then dropped after they have been transferred through the memory from the netfront to the netback. All of these operations are shown to require multiple memory transactions. To counter this problem, we propose to integrate an SLA-based Dynamic Bandwidth Allocator for the virtual machines, called DBA-VM, that runs in the driver domain to dynamically adjust the transmission bandwidth of each virtual machine according to the established SLA and the available bandwidth, in terms of bits per second as well as packets per second. Furthermore, in order to minimize the memory consumption, we propose that the DBA-VM drops packets in the netfront (rather than in the netback) whenever the packet is
destined to be dropped because the allowed bandwidth is exceeded. We thus eliminate unnecessary and costly packet copies and notifications between the netfront and the netback.

Fig. 3. Transmission throughput in packets per second.

Fig. 4. Physical resources consumption: CPU cycles and memory transactions per packet (upper-bound vs. effective).

IV. DBA-VM: DYNAMIC BANDWIDTH ALLOCATOR FOR VIRTUAL MACHINES

We consider a virtualized system with a driver domain and several virtual machines with different QoS requirements, depending on the applications that each one hosts. We use an SLA that, in addition to system requirements (CPU, memory), specifies as network requirements a bandwidth usage in terms of bits per second and a maximum packets per second rate for each virtual machine. Such an SLA definition also takes into consideration the physical machine's packets per second rate limit. The proposed DBA-VM is built with regard to such an SLA definition. In order to guarantee an acceptable bandwidth to virtual machines hosting applications requiring QoS, DBA-VM proposes a differentiation mechanism operating at the driver domain level that dynamically readjusts transmission bandwidth according to the SLAs. This mechanism classifies the different virtual interfaces into classes, each characterized by a priority, by a maximum and a minimum bandwidth, and by a maximum allowed packets per second rate. Since the memory bottleneck is due to multiple useless packet copies from the netfront to the netback, DBA-VM is deployed between these two components to avoid such useless memory usage. Indeed, with the DBA-VM, the packets destined to be dropped whenever the maximum allowed bandwidth is exceeded are dropped at the netfront, before their transfer to the netback. In our algorithm we use the following notations:
- N: the number of virtual machines.
- VM_j: virtual machine j, j = 1..N.
- vf_i: virtual interface i, i = 1..M.
- B_P: the maximum bandwidth of the physical interface P.
- B_i: the bandwidth at which vf_i is transmitting, i = 1..M.
- B_i^max: the maximum bandwidth at which vf_i is allowed to emit, set in the SLA.
- B_i^min: the minimum guaranteed bandwidth of vf_i, set in the SLA.
- B_P^ex: the available (excess) bandwidth of the physical interface.
- C_i: the class of vf_i.
- pps_VMj: the maximum rate in packets per second that VM_j can send, set in the SLA.
- pps_vfi^max: the maximum rate in packets per second that vf_i can send.
- BT_VMj: the total bandwidth emitted by VM_j.

The DBA-VM runs in two steps: first it computes the maximum bandwidth in bits per second, and second it computes the maximum packets per second rate.

a) Step 1: Maximum bandwidth computation in bits per second. For each physical interface P, the DBA-VM browses each vf_i attached to P, starting with the ones belonging to the highest priority class. The DBA-VM measures B_i for each vf_i. In the case where multiple virtual interfaces belong to the same class, the DBA-VM starts with the first created one. For each vf_i, if B_i is between B_i^min and B_i^max (B_i^min < B_i < B_i^max), then no change is made. In the case where B_i exceeds B_i^max (B_i > B_i^max), B_i is readjusted to B_i^max and the available bandwidth B_P^ex is augmented by the resulting difference B_i - B_i^max: B_P^ex ← B_P^ex + (B_i - B_i^max). Finally, in the case where B_i went below B_i^min, the DBA-VM checks whether there still is available bandwidth (B_P^ex) on the physical interface and whether (B_i^min - B_i) < B_P^ex or not. If so, B_i is readjusted to B_i^min and B_P^ex is diminished by the difference B_i^min - B_i. If not, and if the current virtual interface belongs to the least important class, the DBA-VM readjusts the bandwidth of all the other virtual interfaces vf_j, j = 1..k, belonging to the same class down to B_j^min so that B_i can reach B_i^min.
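A simplified sketch of Step 1's per-interface pass, rendered in our own code (the reclamation from same-class peers and lower-priority classes, and the final redistribution of leftover bandwidth, are omitted for brevity):

```python
# Illustrative sketch of DBA-VM Step 1 (not the authors' implementation).
# vifs: dicts with measured rate 'b' and SLA bounds 'b_min'/'b_max' (Mb/s),
# pre-sorted by decreasing class priority, then by creation order.

def step1(vifs, b_ex):
    """Readjust each vif's allowed bit rate; b_ex is the spare bandwidth
    of the physical interface. Returns the updated vifs and spare."""
    for vif in vifs:
        if vif["b"] > vif["b_max"]:
            b_ex += vif["b"] - vif["b_max"]  # cap and reclaim the excess
            vif["b"] = vif["b_max"]
        elif vif["b"] < vif["b_min"]:
            deficit = vif["b_min"] - vif["b"]
            if deficit < b_ex:
                b_ex -= deficit              # spare available: restore
                vif["b"] = vif["b_min"]      # the vif to its guaranteed min
            # else: reclaim from same-class peers / lower classes (omitted)
    return vifs, b_ex

vifs = [{"b": 800, "b_min": 150, "b_max": 700},   # over its cap
        {"b": 100, "b_min": 150, "b_max": 150}]   # under its guarantee
vifs, spare = step1(vifs, 0)
print(vifs[0]["b"], vifs[1]["b"], spare)  # 700 150 50
```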
In the case where there are other, less prioritized classes C_x, x = 1..n, the bandwidth of each virtual interface belonging to class C_x is also readjusted to the minimum bandwidth of that class, B_x^min, starting with the least prioritized class. If there is any remaining available bandwidth (B_P^ex > 0), it is reallocated to the different virtual interfaces based on their priorities.

b) Step 2: Maximum bandwidth computation in packets per second. For each virtual machine VM_j, the DBA-VM browses each vf_i attached to VM_j. The DBA-VM measures B_i for
each vf_i and then sums all the B_i; this sum is BT_VMj, the total bandwidth consumed by VM_j. For each virtual interface, the allowed bandwidth in packets per second is the virtual machine's total packets per second rate multiplied by the interface usage coefficient, which is the interface bandwidth divided by the virtual machine's total bandwidth: pps_vfi^max = pps_VMj × (B_i / BT_VMj). For example, a virtual machine allowed 60 Kp/s whose vif carries half of its total bandwidth gets a 30 Kp/s cap on that vif.

Fig. 5. (a) Transmission throughput with DBA-VM in bits per second; (b) transmission throughput in packets per second with different configurations.

Fig. 6. Packets delay and jitter: (a) delay; (b) jitter.

V. PERFORMANCE EVALUATION

We have developed the proposed DBA-VM as a module that we integrated into the driver domain kernel. It consists of a daemon that periodically executes the described algorithm, checks the rate at which each virtual machine is transmitting, and reconfigures all the virtual machines' rates according to these results and the SLA definition. In order to evaluate our algorithm's performance, we compare it to the native Xen system and also to a traffic shaping mechanism using TC deployed in the driver domain. We use the same experimental setup and scenarios as in Section III-C. We also modify the three SLAs by introducing the SLA packets per second parameters as follows: SLA1 fixes the maximum rate to 20 kilopackets per second (Kp/s), while SLA2 and SLA3 fix it to 60 Kp/s. We present the DBA-VM evaluation of bandwidth, QoS parameters and system resource consumption.

A. System throughput

For homogeneous traffic of large packets (1500 bytes), Figure 5(a) shows that DBA-VM allows the system to respect the SLAs.
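The per-vif cap computed in Step 2, pps_vfi^max = pps_VMj × (B_i / BT_VMj), is the quantity this daemon re-evaluates at every readjustment period; a minimal sketch of the proportional split (names are ours):

```python
def split_pps(pps_vm: float, vif_rates: list) -> list:
    """Step 2 sketch: split a VM's packets-per-second budget across its
    vifs in proportion to each vif's share of the VM's total bandwidth."""
    total = sum(vif_rates)  # BT_VMj, the VM's total emitted bandwidth
    if total == 0:
        return [0.0] * len(vif_rates)
    return [pps_vm * b / total for b in vif_rates]

# A VM allowed 60 Kp/s whose two vifs carry 300 and 100 Mb/s:
print(split_pps(60_000, [300, 100]))  # [45000.0, 15000.0]
```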
In fact, as the total packet rate per second is well below the VMs' maximum achievable packets per second rate, all the SLA throughputs in terms of packets per second and bits per second are respected. In the second scenario, the virtual machines transmit a mixture of large and small packets. We notice a decrease in the transmission throughput of the three virtual machines, especially for VM1 and VM3, whose transmission throughput dropped from 150 Mb/s to 9.1 Mb/s and 22 Mb/s respectively. In this case, VM2's throughput decreases only slightly, while the decrease in VM1's and VM3's throughput is more important. However, VM2's throughput with DBA-VM in scenario 2 is clearly better than with the native system or the native system with TC. VM1 is more affected than VM3 by this decrease since its SLA allows a lower packets per second rate. We also notice that even if the VM1 and VM3 throughput is decreased, the packets per second rate specified in the SLA is respected, as shown by Figure 5(b). Unlike the native system and the native system with TC, DBA-VM imposes a strict allowed packets per second rate. This constraint allows a better network bandwidth sharing between the different virtual machines.

Fig. 7. System resource consumption: CPU.

B. Performance analysis for QoS parameters

We first notice that the proposed mechanism considerably reduces the packet delay for the flows transmitted by VM2 and VM3, from respectively 3 ms and 4 ms to less than 1 ms and 1.6 ms. However, we notice an increase in the delay of the packets transmitted by VM1, from 4 ms up to 6 ms. As VM1 has a relatively low packets per second rate compared to VM2 and VM3, it is scheduled for shorter periods, so it is expected that the packets transmitted by VM1 experience a higher delay. The jitter value with DBA-VM is around 0.5 ms for high priority flows. This represents a relatively good QoS result compared with a TC-based system.
It is interesting to point out that with DBA-VM, when we are under the SLA limits, the loss rate is lower than with the native system, since we control the packet transmission rate of each virtual machine and can thus limit packet drops. However, as soon as we reach the SLA limits, the loss rate grows rapidly. This is an intended behavior, to avoid affecting the other machines' performance.

C. System resources consumption

We also evaluated the system's CPU and memory resource consumption for different readjustment periods. The readjustment period is the period after which the bandwidth is recomputed in terms of bits per second and packets per
second for each virtual machine.

Fig. 8. System resource consumption: memory.

The DBA-VM avoids transferring packets emitted by virtual machines beyond their SLA. It also avoids packet loss in the driver domain. This leads to lower system resource consumption. However, the algorithm itself introduces some CPU and memory overhead. Using a long readjustment period leads to less computation in the daemon, and hence to less memory and CPU usage for non-I/O operations, but in return it leads to a longer adaptation time. A compromise should therefore be found between adaptation time and system resource consumption. In order to evaluate the DBA-VM's impact on system resource consumption, we profiled the system resource usage (memory transactions and CPU cycles) using Xenoprof [21] with both the native and the DBA-VM systems. Figures 7 and 8 present the CPU and memory usage for different readjustment periods. First, we observe that for a period lasting more than 0.3 second, the DBA-VM consumes fewer memory transactions than the native system. This is due to the fact that we avoid useless memory copies between the netfront and the netback. With regard to the CPU consumption, the DBA-VM consumes as many CPU cycles as the native system from a readjustment period of 0.9 second. This value represents a good check period since, with such a value, the DBA-VM also reduces the achieved memory transactions compared to the native system.

VI. CONCLUSION

In this paper we proposed DBA-VM, a new mechanism for the dynamic bandwidth allocation to virtual machines in a virtualized cloud environment. We first showed the shortcomings of the native system and of the TC-based system in guaranteeing the required transmission bandwidth to the virtual machines. We also evaluated the system's capacity in terms of transmitted packets per second and showed that the memory severely limits this transmission rate.
These findings led us to propose a novel scheme that enables the system to adjust the transmission rate of the virtual interfaces in the I/O channel according to the virtual machine's SLA, the transmission bandwidth being defined in the SLA in terms of both bits per second and packets per second. The experimental evaluation first shows that the proposed mechanism allows the virtual machines' SLAs to be respected by enforcing the isolation between the different flows, as well as an improvement in the QoS parameters. Furthermore, this gain is achieved while reducing the total cost in terms of physical resource (CPU and memory) usage. We next intend to extend our algorithm to establish the SLAs based on flow classes rather than virtual machine classes. Furthermore, our proposal could be extended to define classes according to multiple QoS parameters, like packet delay and jitter, in order to enable the virtualized cloud to fully meet customers' expectations.

REFERENCES

[1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, A. Zaharia, Above the Clouds: A Berkeley View of Cloud Computing, Technical Report No. UCB/EECS-2009-28, February 10, 2009.
[2] L. M. Vaquero, L. Rodero-Merino, J. Caceres, M. Lindner, A Break in the Clouds: Towards a Cloud Definition, ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, Jan. 2009, pp. 50-55.
[3] N. Feamster, L. Gao, and J. Rexford, How to lease the Internet in your spare time, Editorial Zone of ACM SIGCOMM Computer Communication Review, pp. 61-64, January 2007.
[4] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, Xen and the art of virtualization, 19th ACM Symposium on Operating Systems Principles, October 2003.
[5] P. Xing, L. Ling, M. Yiduo, A. Menon, S. Rixner, A. L. Cox, W. Zwaenepoel, Performance Measurements and Analysis of Network I/O Applications in Virtualized Cloud, International Conference on Cloud Computing, 2010.
[6] P. Xing, L. Ling, M. Yiduo, S. Sivathanu, K. Younggyun, P.
Calton, Understanding Performance Interference of I/O Workload in Virtualized Cloud Environments, International Conference on Cloud Computing, 2010.
[7] P. Apparao, S. Makineni, and D. Newell, Characterization of network processing overheads in Xen, 2nd International Workshop on Virtualization Technology in Distributed Computing (VTDC 2006), Washington, DC, USA, 2006.
[8] J. R. Santos, Y. Turner, G. Janakiraman, I. Pratt, Bridging the gap between software and hardware techniques for I/O virtualization, USENIX Annual Technical Conference, 2008.
[9] A. Li, X. Yang, S. Kandula, M. Zhang, CloudCmp: comparing public cloud providers, 10th Annual Conference on Internet Measurement (IMC '10), 2010.
[10] J. Shafer, I/O Virtualization Bottlenecks in Cloud Computing Today, Workshop on I/O Virtualization (WIOV 2010), Pittsburgh, 2010.
[11] F. Anhalt and P. Vicat-Blanc Primet, Analysis and experimental evaluation of data plane virtualization with Xen, Fifth International Conference on Networking and Services, 2009.
[12] X. Xu, F. Zhou, J. Wan, and Y. Jiang, Quantifying performance properties of virtual machine, International Symposium on Information Science and Engineering, 2008.
[13] K. K. Ram, Y. Turner and J. R. Santos, Redesigning Xen's memory sharing mechanism for safe and efficient I/O virtualization, Second Workshop on I/O Virtualization, Pittsburgh, PA, USA, 2010.
[14] J. Zhang, X. Li, and H. Guan, The optimization of Xen network virtualization, International Conference on Computer Science and Software Engineering, 2008.
[15] K. Fraser, S. Hand, R. Neugebauer, I. Pratt, A. Warfield, and M. Williamson, Safe hardware access with the Xen virtual machine monitor, First Workshop on Operating System and Architectural Support for the On-Demand IT Infrastructure (OASIS), 2004.
[16] X. Zhang and Y. Dong, Optimizing Xen VMM based on Intel Virtualization Technology, International Conference on Computer Science and Software Engineering, 2008.
[17] D. Guo, G.
Liao, and L. N. Bhuyan, Performance characterization and cache-aware core scheduling in a virtualized multi-core server under 10GbE, IEEE International Symposium on Workload Characterization (IISWC), 2009.
[18] G. Liao, D. Guo, L. Bhuyan, and S. R. King, Software techniques to improve virtualized I/O performance on multi-core systems, 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2008.
[19] http://aws.amazon.com/ec2
[20] S. K. Barker, P. Shenoy, Empirical Evaluation of Latency-sensitive Application Performance in the Cloud, First Annual ACM SIGMM Conference on Multimedia Systems (MMSys '10), 2010.
[21] A. Menon, G. Janakiraman, J. R. Santos, and W. Zwaenepoel, Diagnosing performance overheads in the Xen virtual machine environment, VEE 2005.