Dynamic Resource Allocation and Power Management in Virtualized Data Centers

Rahul Urgaonkar, Ulas C. Kozat, Ken Igarashi, Michael J. Neely
urgaonkar@usc.edu, {kozat, igarashi}@docomolabs-usa.com, mjneely@usc.edu

Abstract
We investigate optimal resource allocation and power management in virtualized data centers with time-varying workloads and heterogeneous applications. Prior work in this area uses prediction based approaches for resource provisioning. In this work, we take an alternate approach that makes use of the queueing information available in the system to make online control decisions. Specifically, we use the recently developed technique of Lyapunov Optimization to design an online admission control, routing, and resource allocation algorithm for a virtualized data center. This algorithm maximizes a joint utility of the average application throughput and energy costs of the data center. Our approach is adaptive to unpredictable changes in the workload and does not require estimation and prediction of its statistics.

Index Terms: Data Center Automation, Cloud Computing, Virtualization, Resource Allocation, Lyapunov Optimization

I. INTRODUCTION

There is growing interest in improving the energy efficiency of large-scale enterprise data centers and cloud computing environments. Recent studies [1] [2] indicate that the costs associated with the power consumption, cooling requirements, etc., of servers over their lifetime are significant. As a result, there have been numerous works in the area of power management for such data centers (see [3] and references therein). At the data center level, application consolidation has been studied for reducing the total power consumption. Virtualization is a promising technique that enables consolidation of heterogeneous applications onto a smaller number of servers while ensuring secure co-location between competing applications. This results in higher resource utilization and a reduction in energy costs (by turning off extra servers). However, since multiple applications now contend for the same resource pool, it is important to develop scheduling algorithms that allocate resources in a fair and efficient manner. At the individual server level, techniques such as Dynamic Voltage and Frequency Scaling, low power P-states, etc. are available that allow a tradeoff between performance and power consumption. Several recent works (e.g., [4] [5]) have studied the problem of dynamically scaling the CPU speed for energy savings. In this work, we consider the problem of maximizing a joint utility of the long-term throughput of the hosted applications and the average total power expenditure in a virtualized data center. Our formulation unifies these two techniques for power control under a common analytical framework.

This work was performed when Rahul Urgaonkar worked as a summer intern at DOCOMO USA Labs.

[Fig. 1. Illustration of the Virtualized Data Center Architecture. Legend: A_i: Request arrivals for Application i; A/C_i: Admission Controller for Application i; W_i: Router buffer for Application i; R: Router; U_ij: Buffer for Application i on Server j; R/C_j: Resource Controller for Server j.]

II. RELATED WORK

Dynamic resource allocation in virtualized data centers has been studied extensively in recent years. The work in [6]–[9] formulates this as a feedback control problem and uses tools from adaptive control theory to design online control algorithms. Such techniques use a closed-loop control model where the objective is to converge to a target performance level by taking control actions that try to minimize the error between the measured output and the reference input.
While this technique is useful for tracking problems, it cannot be used for utility maximization problems where the target optimal value is unknown. Work in [10] considers the problem of maximizing a joint utility of the profit generated by satisfying a given SLA and the power consumption costs. This is formulated as a sequential optimization problem and solved using limited lookahead control. This approach requires building estimates of the future workloads. Much prior work on resource allocation is based on prediction-based provisioning and steady state queueing models [11]–[14]. Here, statistical models
for the workloads are first developed using historical traces offline or via online learning. Resource allocation decisions are then made to satisfy such predicted demand. This approach is limited by its ability to accurately predict future arrivals. In this work, we do not take this approach. Instead, we make use of the recently developed technique of Lyapunov Optimization [18] to design an online admission control, routing, and resource allocation algorithm for a virtualized data center. This algorithm makes use of the queueing information available in the system to implicitly learn and adapt to unpredictable changes in the workload, and it does not require estimation and prediction of the workload statistics. The technique of Lyapunov Optimization has been used to develop throughput and energy optimal cross-layer control algorithms in time-varying wireless networks (see [18] and references therein). This technique has certain similarities with the feedback control based approach, as it also uses a Lyapunov function based analysis to design online control algorithms. In addition, this technique allows stability and utility optimization to be treated in the same framework. Unlike works that use steady state queueing models, this approach takes into account the full effects of the queueing dynamics by making use of the queue backlog information to make online control decisions.

III. BASIC VIRTUALIZED DATA CENTER MODEL

We consider a virtualized data center with M servers that host a set of N applications. The set of servers is denoted by S and the set of applications is denoted by A. Each server j ∈ S hosts a subset of the applications. It does so by providing a virtual machine (VM) for every application hosted on it. An application may have multiple instances running across different VMs in the data center. We define the following indicator variables for i ∈ {1, 2, ..., N}, j ∈ {1, 2, ..., M}:

$$a_{ij} = \begin{cases} 1 & \text{if application } i \text{ is hosted on server } j \\ 0 & \text{else} \end{cases}$$

For simplicity, in the basic model, we assume that a_ij = 1 ∀i, j, i.e., each server can host all applications. In general, applications may be multi-tiered, and the different tiers corresponding to an instance of an application may be located on different servers and VMs. For simplicity, in the basic model we assume that each application consists of a single tier. These assumptions are relaxed in Sec. VI, where we discuss extensions to the multi-tier as well as inhomogeneous hosting scenarios. We assume a time-slotted system. Every slot, new requests arrive for each application i according to a random arrival process A_i(t) that has a time average rate λ_i requests/slot. This process is assumed to be independent of the current amount of unfinished work in the system and has finite second moment. However, we do not assume any knowledge of the statistics of A_i(t). For example, A_i(t) could be a Markov-modulated process with time-varying instantaneous rates where the transition probabilities between different states are not known. This models a scenario with unpredictable and time-varying workloads.

[Fig. 2. Power (W) vs CPU frequency (GHz) for a Dell PowerEdge R610 server at 100%, 50%, and 38% utilization, each with a quadratic model fit. Also note the significant idle power.]

A. Control Architecture

Our control architecture for the virtualized data center consists of three components, as shown in Fig. 1. Every slot, for each application i ∈ A, an Admission Controller determines whether to admit or decline the new requests. The requests that are admitted are stored in the Router buffer before being routed by the Router to one of the servers hosting that application.
Each server j ∈ S has a set of resources W_j (such as CPU, disk, memory, network resources, etc.) that are allocated to the VMs hosted on it by its Resource Controller. In practice, this Resource Controller resides on the host Operating System (Dom0) of each virtualized server. The control options available to the Resource Controller are discussed in detail in Sec. III-C. For simplicity, in the basic model, we assume that the sets W_j contain only one resource. Specifically, we focus on the case where the CPU is the bottleneck resource. This can happen, for example, when all applications running on the servers are computationally intensive. Our formulation can be extended to treat the multiple-resource case by representing them as a vector of resources and appropriately redefining the control options and expected service rates. All servers in the data center are assumed to be resource constrained. Specifically, in the basic model, we focus on CPU frequency and power constraints. This is discussed in detail in the following.

B. CPU Power-Frequency Relationship

Modern CPUs can be operated at different speeds at runtime by using techniques such as Dynamic Frequency Scaling (DFS), Dynamic Voltage Scaling (DVS), or a combination, Dynamic Voltage and Frequency Scaling (DVFS). These techniques result in a non-linear power-frequency relationship. For example, Fig. 2 shows the power consumed by a Dell PowerEdge R610 server for different operating frequencies and utilization levels. This curve was obtained by running a CPU intensive application at different CPU frequencies and utilization levels and measuring the power consumption. We observe that at each utilization level, the power-frequency relationship is well-approximated by a quadratic model, i.e., $P(f) = P_{\min} + \alpha (f - f_{\min})^2$.
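As an illustration, the following minimal sketch (in Python) evaluates this quadratic model over a discrete set of DVFS states. The numerical values (f_min = 1.6 GHz, f_max = 2.6 GHz, P_min = 120 W, α = 120 W/GHz²) are the ones adopted in the simulations of Sec. VII; in a real deployment they would be fitted per server from measurements like those in Fig. 2.

```python
# Quadratic CPU power-frequency model: P(f) = P_min + alpha * (f - f_min)^2.
# Parameter values follow the simulation setup of Sec. VII.
F_MIN_GHZ = 1.6   # lowest DVFS frequency
P_MIN_W = 120.0   # power at f_min (note the significant idle power)
ALPHA = 120.0     # quadratic coefficient, in W/GHz^2

# Discrete operating points available to the Resource Controller.
DVFS_STATES_GHZ = [1.6, 1.8, 2.0, 2.2, 2.4, 2.6]

def cpu_power_watts(f_ghz: float) -> float:
    """Power drawn by an active server whose CPU runs at f_ghz."""
    return P_MIN_W + ALPHA * (f_ghz - F_MIN_GHZ) ** 2

if __name__ == "__main__":
    for f in DVFS_STATES_GHZ:
        print(f"{f:.1f} GHz -> {cpu_power_watts(f):.0f} W")
    # At the highest frequency this gives 240 W, so P_min here is 50% of
    # P_max, consistent with the substantial idle power noted above.
```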
Similar results have been observed in recent works [4]. In our model, we assume that the CPUs follow a similar non-linear power-frequency relationship that is known to the Resource Controllers. The CPUs can run at a finite number of operating frequencies in an interval [f_min, f_max] with an associated power consumption in [P_min, P_max]. This allows a tradeoff between performance and power costs. All servers in our model are assumed to have identical CPU resources. Additionally, the servers may be operated in an inactive mode (such as P-states, CPU hibernation, or turning OFF) in order to further save on energy costs. This can be advantageous if the workload is low. Indeed, we note from Fig. 2 that the minimum power P_min required to maintain the server in the active state is typically substantial. It can be as high as 65% of P_max, as reported in other works [1]. Therefore, turning idle servers to OFF mode, or to some low power hibernation state, can yield significant savings in power consumption. While an inactive server does not consume any power, it also cannot provide any service to the applications hosted on it. We thus assume that, in any slot, new requests can only be routed to active servers. Inactive servers can be turned active to handle increases in workload. Since turning servers ON/OFF frequently may be undesirable (for example, due to hardware reliability issues), we will focus on frame-based control policies in which time is divided into frames of length T slots. The set of active servers is chosen at the beginning of each frame and is held fixed for the duration of that frame. This set can potentially change in the next frame as workloads change. We note that while this control decision is taken at a slower time-scale, other resource allocation decisions (such as admission control, routing, CPU frequency scaling and resource allocation at each active server) are made every slot. The choice of an appropriate value for T is an implementation issue, and we do not focus on optimizing this parameter in this work. The choice of T affects a complexity-utility tradeoff as discussed in Sec. V-B.

C. Queueing Dynamics and Control Decisions

Let A_i(t) denote the number of new request arrivals for application i in slot t. Let R_i(t) be the number of requests out of A_i(t) that are admitted into the Router buffer for application i by the Admission Controller. We denote this buffer by W_i. We assume that any new request that is not admitted by the Admission Controller is declined. This can easily be generalized to the case where arrivals that are not immediately accepted are stored in a buffer for a future admission decision. Thus, for all i, t, we have:

$$0 \le R_i(t) \le A_i(t) \quad (1)$$

Let R_ij(t) be the number of requests for application i that are routed from its Router buffer to server j in slot t. Then the queueing dynamics for W_i(t) are given by:

$$W_i(t+1) = W_i(t) - \sum_j R_{ij}(t) + R_i(t) \quad (2)$$

Let S(t) denote the set of active servers in slot t. For each application i, the admitted requests can only be routed to those servers that host application i and are active in slot t. Thus, the routing decisions R_ij(t) must satisfy the following constraints every slot:

$$R_{ij}(t) = 0 \quad \text{if } j \notin S(t) \text{ or } a_{ij} = 0 \quad (3)$$

$$0 \le \sum_{j \in S(t)} a_{ij} R_{ij}(t) \le W_i(t) \quad (4)$$

Every slot, the Resource Controller allocates the resources of each server among the VMs that host the applications running on that server. This allocation is subject to the available control options. For example, the Resource Controller may allocate different fractions of the CPU (or different numbers of cores in the case of multi-core processors) to the VMs in that slot.¹ The Resource Controller may also use available techniques such as DFS, DVS, DVFS, etc. to modulate the current CPU speed, which affects the CPU power consumption.
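To make these per-server control options concrete, the sketch below enumerates one hypothetical option set: a DVFS frequency choice combined with a split of the CPU among the hosted VMs. The 20% share granularity is illustrative and not from the paper; a real Resource Controller would use whatever share granularity the hypervisor exposes.

```python
from itertools import product

# Hypothetical discretization of a server's control options: pick one DVFS
# frequency and one way to split the CPU among the hosted VMs.
DVFS_STATES_GHZ = [1.6, 1.8, 2.0, 2.2, 2.4, 2.6]  # Sec. VII frequency values
SHARE_STEP = 0.2                                   # illustrative 20% granularity

def cpu_share_splits(num_vms: int):
    """Yield all ways to split 100% of the CPU among num_vms VMs
    in SHARE_STEP-sized increments."""
    ticks = int(round(1.0 / SHARE_STEP))
    for combo in product(range(ticks + 1), repeat=num_vms):
        if sum(combo) == ticks:
            yield tuple(c * SHARE_STEP for c in combo)

def control_options(num_vms: int):
    """Enumerate the (frequency, share vector) pairs available at one server."""
    return [(f, split) for f in DVFS_STATES_GHZ
            for split in cpu_share_splits(num_vms)]

# A server hosting 2 VMs has 6 frequencies x 6 splits = 36 options here.
print(len(control_options(2)))
```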
We use I_j to denote the set of all such control options available at server j. This includes the option of making server j inactive (so that no power is consumed) if the current slot is the beginning of a new frame. Let I_j(t) ∈ I_j denote the particular control decision taken at server j in slot t under any policy, and let P_j(t) be the corresponding power consumption. Then, the queueing dynamics for the requests of application i at server j follow:

$$U_{ij}(t+1) = \max[U_{ij}(t) - \mu_{ij}(I_j(t)), 0] + R_{ij}(t) \quad (5)$$

where µ_ij(I_j(t)) denotes the service rate (in units of requests/slot) provided to application i on server j in slot t by taking control action I_j(t). We assume that, for each application, the expected value of this service rate as a function of the control action I_j(t) is known for all I_j(t) ∈ I_j. This can be obtained by application profiling and application modeling techniques (e.g., [15] [16]). It is important to note that we do not need to implement the dynamics (5). We will only require a measure of the current backlog and knowledge of the expected service rate as a function of control decisions to implement our control algorithm. Thus, in every slot t, a control policy needs to make the following decisions:

1) If t = nT (i.e., the beginning of a new frame), determine the new set of active servers S(t). Else, continue using the active set already computed for the current frame.
2) Admission control decisions R_i(t) for all applications i.
3) Routing decisions R_ij(t) for the admitted requests.
4) Resource allocation decisions I_j(t) at each active server (this includes selecting the CPU frequency, which affects the power consumption P_j(t), as well as the CPU resource distribution among different VMs).

Our goal is to design an online control policy that maximizes a joint utility of the sum throughput of the applications and the energy costs of the servers, subject to the available control options and the structural constraints imposed by this model.

¹Additional constraints, such as allocating a minimum amount of CPU share to all active VMs, can be included in this model.
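For concreteness, here is a minimal sketch of the two queue updates (2) and (5) as they would appear in a slot-level simulation; the variable names and the single (application, server) pair are illustrative only.

```python
# Slot-level updates for a Router buffer W_i (Eq. 2) and a per-server
# application queue U_ij (Eq. 5). All quantities are in requests.

def update_router_buffer(W_i: float, admitted: float, routed_out: float) -> float:
    """Eq. (2): W_i(t+1) = W_i(t) - sum_j R_ij(t) + R_i(t).
    Constraints (3)-(4) guarantee routed_out never exceeds W_i."""
    return W_i - routed_out + admitted

def update_server_queue(U_ij: float, service: float, routed_in: float) -> float:
    """Eq. (5): U_ij(t+1) = max[U_ij(t) - mu_ij(I_j(t)), 0] + R_ij(t)."""
    return max(U_ij - service, 0.0) + routed_in

# One slot for a single (application i, server j) pair:
W_i, U_ij = 50.0, 30.0
R_i, R_ij, mu_ij = 20.0, 50.0, 35.0   # admitted, routed, service offered
W_i = update_router_buffer(W_i, admitted=R_i, routed_out=R_ij)
U_ij = update_server_queue(U_ij, service=mu_ij, routed_in=R_ij)
print(W_i, U_ij)   # 20.0 50.0
```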
It is desirable to develop a flexible and robust resource allocation algorithm that automatically adapts to time-varying workloads. In this work, we will use the technique of Lyapunov Optimization [18] to design such an algorithm.

IV. CONTROL OBJECTIVE

Consider any policy η for this model that takes control decisions S^η(t), R_i^η(t), R_ij^η(t), I_j^η(t) ∈ I_j, P_j^η(t) for all i, j in slot t. Note that under any feasible policy η, these control decisions must satisfy the admission control constraint (1), the routing constraints (3), (4), and the resource allocation constraint I_j^η(t) ∈ I_j every slot for all i, j. Let r_i^η denote the time average expected rate of admitted requests for application i under policy η, i.e.:

$$r_i^\eta = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{R_i^\eta(\tau)\} \quad (6)$$

Let $r = (r_1, \ldots, r_N)$ denote the vector of these time average rates. Similarly, let e_j^η denote the time average expected power consumption of server j under policy η:

$$e_j^\eta = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\{P_j^\eta(\tau)\} \quad (7)$$

The expectations above are with respect to the possibly randomized control actions that policy η might take. Let α_i and β be a collection of non-negative weights. Then our objective is to design a policy η that solves the following stochastic optimization problem:

$$\begin{aligned}
\text{Maximize:} \quad & \sum_{i \in A} \alpha_i r_i^\eta - \beta \sum_{j \in S} e_j^\eta \\
\text{Subject to:} \quad & 0 \le r_i^\eta \le \lambda_i \quad \forall i \in A \\
& I_j^\eta(t) \in I_j \quad \forall j \in S, \forall t \\
& r \in \Lambda
\end{aligned} \quad (8)$$

Here, Λ represents the capacity region of the data center model as described above. It is defined as the set of all possible long-term throughput values that can be achieved under any feasible resource allocation strategy. The objective in problem (8) is a general weighted linear combination of the sum throughput of the applications and the average power usage in the data center. This formulation allows us to consider several scenarios. Specifically, it allows the design of policies that are adaptive to time-varying workloads. For example, if the current workload is inside the instantaneous capacity region, then this objective encourages scaling down the instantaneous capacity (by running CPUs at slower speeds and/or turning OFF some active servers) to achieve energy savings. Similarly, if the current workload is outside the instantaneous capacity region, then this objective encourages scaling up the instantaneous capacity (by running CPUs at faster speeds and/or turning ON some inactive servers). Finally, if the workload is so high that it cannot be supported using all available resources, this objective allows prioritization among different applications. Furthermore, it allows us to assign priorities between throughput and energy by choosing appropriate values of α_i and β.

A. Optimal Stationary, Randomized Policy

Problem (8) is similar to the general stochastic network utility maximization problem presented in [18] in the context of wireless networks with time-varying channels. Suppose (8) is feasible, and let r_i^* ∀i and e_j^* ∀j denote an optimal solution, potentially achieved by some arbitrary policy. Using the techniques developed in [17], [18], it can be shown that to solve (8) and achieve the optimal value of the objective function, it is sufficient to consider only the class of stationary, randomized policies that take control decisions independent of the current queue backlog every slot. Specifically, at the beginning of each frame, such a policy chooses an active set of servers according to a stationary distribution in an i.i.d. fashion. Once chosen, other control decisions are likewise taken in an i.i.d. fashion according to stationary distributions. For the basic model of Sec. III with homogeneous application hosting and identical CPU resources, in choosing an active server set, we do not need to consider all possible subsets of S.
Specifically, we define the following collection O of subsets of S:

$$O = \Big\{ \emptyset, \{1\}, \{1, 2\}, \{1, 2, 3\}, \ldots, \{1, 2, 3, \ldots, M\} \Big\} \quad (9)$$

Then we have the following fact, which for brevity we state here without proof:

Fact 1: (Optimal Stationary, Randomized Policy) For any arrival rate vector (λ_1, ..., λ_N) (inside or outside of the data center capacity region Λ), there exists a frame-based stationary randomized control policy that chooses active sets from O every frame, makes admission control, routing, and resource allocation decisions every slot independent of the queue backlog, and yields the following steady state values:

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \left[ \sum_{i \in A} \alpha_i \mathbb{E}\{R_i(\tau)\} - \beta \sum_{j \in S} \mathbb{E}\{P_j(\tau)\} \right] = \sum_{i \in A} \alpha_i r_i^* - \beta \sum_{j \in S} e_j^* \quad (10)$$

However, computing the optimal stationary, randomized policy explicitly can be challenging, and its implementation is impractical, as it requires advance knowledge of all system parameters (like the workload statistics) as well as the capacity region. Even if this policy could be computed for a given workload, it would not be adaptive to unpredictable changes in the workload and would have to be recomputed. In the next section, we present an online control algorithm that overcomes these challenges.

V. OPTIMAL CONTROL ALGORITHM

In this section, we use the framework of Lyapunov Optimization to develop an optimal control algorithm for our model. Specifically, we present a dynamic control algorithm that can be shown to achieve the optimal solution r_i^*, e_j^* ∀i, j to the stochastic optimization problem (8).
This algorithm is similar in spirit to the backpressure algorithms proposed in [17], [18] for problems of throughput and energy optimal networking in time-varying wireless networks.

A. Data Center Control Algorithm (DCA)

Let V ≥ 0 be a control parameter that is an input to the algorithm. This parameter is chosen by the system administrator and allows a utility-delay tradeoff, as discussed later in Sec. V-B. An appropriate choice of this parameter depends on the particular system as well as the desired tradeoff between performance and power cost. This parameter may also be varied over time to adjust this tradeoff. Let W_i(t) ∀i and U_ij(t) ∀i, j be the queue backlog values in slot t. These are initialized to 0. Every slot, the DCA algorithm uses the backlog values in that slot to make joint Admission Control, Routing, and Resource Allocation decisions. As the backlog values evolve over time according to the dynamics (2) and (5), the control decisions made by DCA adapt to these changes. However, we note that this is implemented using knowledge of the current backlog values only and does not rely on knowledge of the statistics governing future arrivals. Thus, DCA solves for the objective in (8) by implementing a sequence of optimization problems over time. The queue backlogs themselves can be viewed as dynamic Lagrange multipliers that enable stochastic optimization [18]. The DCA algorithm operates as follows.

Admission Control: For each application i, choose the number of new requests to admit, R_i(t), as the solution to the following problem:

$$\begin{aligned}
\text{Maximize:} \quad & R_i(t) \left[ V \alpha_i - W_i(t) \right] \\
\text{Subject to:} \quad & 0 \le R_i(t) \le A_i(t)
\end{aligned} \quad (11)$$

This problem has a simple threshold-based solution. In particular, if the current Router buffer backlog for application i satisfies W_i(t) > Vα_i, then R_i(t) = 0 and no new requests are admitted. Else, if W_i(t) ≤ Vα_i, then R_i(t) = A_i(t) and all new requests are admitted. Note that this admission control decision can be performed separately for each application.

Routing and Resource Allocation: Let S(t) be the active server set for the current frame. If t ≠ nT, then we continue to use the same active set. The Routing and Resource Allocation decisions are given as follows:

Routing: Given an active server set, routing follows a simple Join the Shortest Queue policy. Specifically, for any application i, let j* ∈ S(t) be the active server with the smallest queue backlog U_ij*(t). If W_i(t) > U_ij*(t), then R_ij*(t) = W_i(t), i.e., all requests in the Router buffer for application i are routed to server j*. Else, R_ij(t) = 0 ∀j and no requests are routed to any server for application i. In order to make these decisions, the Router requires the queue backlog information U_ij(t) ∀i, j. Given this information, we note that this routing decision can be performed separately for each application.

Resource Allocation: At each active server j ∈ S(t), choose a resource allocation I_j(t) that solves the following problem:

$$\begin{aligned}
\text{Maximize:} \quad & \sum_i U_{ij}(t)\, \mathbb{E}\{\mu_{ij}(I_j(t))\} - V \beta P_j(t) \\
\text{Subject to:} \quad & I_j(t) \in I_j, \quad P_j(t) \ge P_{\min}
\end{aligned} \quad (12)$$

The above problem is a generalized max-weight problem where the service rate provided to any application is weighted by its current queue backlog. Thus, the optimal solution allocates resources so as to maximize the service rate of the most backlogged application. The complexity of this problem depends on the size of the control option set I_j available at server j. In practice, the number of control options such as available DVFS states, CPU shares, etc. is small, and thus the above optimization can be implemented in real time. It is important to note that each server solves its own resource allocation problem independently, using the queue backlog values of the applications hosted on it, and this can be implemented in a fully distributed fashion.
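The following minimal sketch pulls these three per-slot decisions together, assuming the expected service rate of each control option is available as data (e.g., from the profiling techniques of [15] [16]); the container layout and option encoding are hypothetical. Note that every step reads only current backlogs, which is what frees DCA from needing arrival statistics.

```python
# One slot of DCA for a fixed active set (t != nT): threshold admission
# control (11), Join-the-Shortest-Queue routing, and per-server max-weight
# resource allocation (12). W[i]: router buffers; U[i][j]: server queues.

def admit(W_i, A_i, V, alpha_i):
    """Eq. (11): admit all new arrivals if W_i <= V*alpha_i, else decline all."""
    return A_i if W_i <= V * alpha_i else 0

def route(i, W, U, active_set):
    """JSQ: send app i's whole router buffer to its shortest active queue,
    but only if that queue is strictly smaller than the buffer."""
    j_star = min(active_set, key=lambda j: U[i][j])
    return (j_star, W[i]) if W[i] > U[i][j_star] else (None, 0)

def allocate(j, U, options_j, V, beta):
    """Eq. (12): pick the option maximizing backlog-weighted expected
    service minus weighted power. options_j: list of
    (power_watts, {app i: expected mu_ij}) pairs."""
    def weight(option):
        power, mu = option
        return sum(U[i][j] * mu.get(i, 0.0) for i in U) - V * beta * power
    return max(options_j, key=weight)
```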
If t = nt, then a new actve set S for the current frame s determned by solvng the followng: S =argmax S O [ U j E {µ j (I j )} V β j j ] R j (W U j ) + j P j subject to: j S, I j I j, P j P mn constrants (1), (3), (4) (13) The above optmzaton can be understood as follows. To determne the optmal actve set S, the algorthm computes the optmal cost for the expresson wthn the brackets for every possble actve server set n the collecton O. Gven an actve set, the above maxmzaton s separable nto Routng decsons for each applcaton and Resource Allocaton decsons at each actve server. Ths computaton s easly performed usng the procedure descrbed earler for Routng and Resource Allocaton when t nt. Snce O has sze M, the worst-case complexty of ths step s polynomal n M. However, the computaton can be sgnfcantly smplfed as follows. It can be shown that f the maxmum queue backlog over all applcatons on any server j exceeds a fnte constant U thresh, then that server must be part of the actve set. Thus, only those subsets of O that contan these servers need to be consdered when searchng for the optmal actve set. We note that t s possble for ths algorthm to nactvate certan servers even f they have non-zero queue backlog (and process t later when the server s actvated agan). Ths can happen, for example, f the backlog on the server s small and the optmzaton (13) determnes that the energy cost of keepng the server ON (the second term) exceeds the weghted servce rate acheved (the frst term). Whle we can show optmalty of ths algorthm n terms of solvng the objectve (8), we also consder a more practcal (and potentally suboptmal) strategy DCA-M that mgrates or reroutes such unfnshed
We note that it is possible for this algorithm to inactivate certain servers even if they have a non-zero queue backlog (and process it later when the server is activated again). This can happen, for example, if the backlog on the server is small and the optimization (13) determines that the energy cost of keeping the server ON (the second term) exceeds the weighted service rate achieved (the first term). While we can show optimality of this algorithm in terms of solving the objective (8), we also consider a more practical (and potentially suboptimal) strategy, DCA-M, that migrates or reroutes such unfinished requests from inactive servers to other active servers in the next frame. Our simulation results in Sec. VII suggest that the performance of this strategy is very close to that under DCA. Finally, the computation in (13) requires knowledge of the queue backlog values at all servers as well as at the router buffers. This can be implemented by a centralized controller (which also implements Routing) that periodically gathers the backlog information and determines the active set for each frame.

B. Performance Analysis

Theorem 1: (Algorithm Performance) Assume that all queues are initialized to 0. Suppose all arrivals A_i(t) in a slot are i.i.d. and upper bounded by finite constants, so that A_i(t) ≤ A_max for all i, t. Also let µ_max be the maximum service rate (in requests/slot) over all applications in any slot. Then, implementing the DCA algorithm every slot for any fixed control parameter V ≥ 0 and frame size T yields the following performance bounds:

1) The worst case queue backlog for each application's Router buffer W_i(t) is upper bounded by a finite constant W_i^max for all t:

$$W_i(t) \le W_i^{\max} = V \alpha_i + A_{\max} \quad (14)$$

Similarly, the worst case queue backlog for application i on any server j is upper bounded by 2W_i^max for all i, t:

$$U_{ij}(t) \le 2 W_i^{\max} = 2 (V \alpha_i + A_{\max}) \quad (15)$$

2) The time average utility achieved by the DCA algorithm is within BT/V of the optimal value:

$$\liminf_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \left[ \sum_{i \in A} \alpha_i \mathbb{E}\{R_i(\tau)\} - \beta \sum_{j \in S} \mathbb{E}\{P_j(\tau)\} \right] \ge \sum_{i \in A} \alpha_i r_i^* - \beta \sum_{j \in S} e_j^* - \frac{BT}{V} \quad (16)$$

where B is a finite constant (defined precisely in (18)) that depends on the second moments of the arrival and service processes.

We note that the performance bounds above are quite strong. In particular, part (1) establishes deterministic worst case bounds on the maximum backlogs in the system at all times. Further, by part (2) of the theorem, the achieved average utility is within O(1/V) of the optimal value and can be pushed arbitrarily close to it by increasing the control parameter V. However, this increases the maximum queue backlog bounds (14), (15) linearly in V, leading to a utility-delay tradeoff. We next prove the first part of Theorem 1. The proof of part (2) uses the technique of Lyapunov Optimization [18] and is provided in the Appendix.

Proof of part (1): Suppose W_i(t) ≤ W_i^max for all i at some time t. This is true for t = 0, as all queues are initialized to 0. We show that the same holds for time t + 1. We have 2 cases. If W_i(t) ≤ W_i^max − A_max, then from (2) we have W_i(t+1) ≤ W_i^max (because R_i(t) ≤ A_max for all i, t). Else, if W_i(t) > W_i^max − A_max, then W_i(t) > Vα_i + A_max − A_max = Vα_i. Then the flow control part of the algorithm chooses R_i(t) = 0, so that by (2): W_i(t+1) ≤ W_i(t) ≤ W_i^max. This proves (14). To prove (15), note that new requests are routed from a Router buffer W_i to an application queue U_ij only when W_i(t) > U_ij(t). Since W_i(t) ≤ W_i^max, and since the maximum number of arrivals in a slot to U_ij is W_i^max, U_ij(t) cannot exceed 2W_i^max.

VI. EXTENSIONS

In this section, we briefly discuss two extensions to the basic model of Sec. III.

A. Inhomogeneous Application Placement and CPU Resources

In this case, the a_ij variables need not all be equal to 1, so that requests for an application can only be routed to those servers that host this application. The routing constraints in (3), (4) are already general enough to capture this. In the case of inhomogeneous CPU resources, the DCA algorithm needs the following modification. In the active server determination step, instead of only searching over the collection O of subsets in (9), it may now have to search over all possible subsets of S. This can be computationally intensive when S is large.
It is possible to trade off complexity for utility optimality by resorting to sub-optimal heuristic approaches, an investigation of which is left out of this paper for brevity.

B. Multi-tier Applications

Modern enterprise applications typically have multiple components working in a tiered structure [16] [11]. An incoming request for such a multi-tier application is serviced by each tier in a certain order before departing the system. Our framework can be extended to treat this scenario by modeling the multi-tier application as a network of queues. Specifically, we define U_ij^(k)(t) as the queue backlog for the k-th tier of application i on server j. Then the queueing dynamics for U_ij^(k)(t) are given by:

$$U_{ij}^{(k)}(t+1) = \max\left[ U_{ij}^{(k)}(t) - \mu_{ij}^{(k)}(I_j(t)), 0 \right] + \sum_{l \in S} R_{i,lj}^{(k-1)}(t)$$

where R_{i,lj}^{(k−1)}(t) denotes the arrivals to U_ij^(k) from the (k−1)-th tier of application i on server l. Using the framework presented in the previous sections, DCA can be extended to treat such multi-tier scenarios.
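A minimal sketch of this tier-chained update follows, under the assumption that the per-slot inter-tier transfer amounts R_{i,lj}^{(k−1)}(t) are produced by the same backlog-driven routing used between the Router and tier 1; the names and numbers are illustrative.

```python
# Tier-chained queue update for one (application i, server j, tier k):
# requests finishing tier k-1 anywhere in the data center arrive at tier k.

def update_tier_queue(U_k: float, mu_k: float, tier_arrivals: list) -> float:
    """U_ij^(k)(t+1) = max[U_ij^(k)(t) - mu_ij^(k)(I_j(t)), 0]
                       + sum over servers l of R_{i,lj}^{(k-1)}(t)."""
    return max(U_k - mu_k, 0.0) + sum(tier_arrivals)

# Example: tier 2 on server j receives tier-1 output routed from three
# servers in this slot (illustrative numbers).
print(update_tier_queue(U_k=40.0, mu_k=25.0, tier_arrivals=[5.0, 0.0, 12.0]))
# -> 32.0
```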
VII. SIMULATIONS

We simulate the DCA and DCA-M algorithms in an example virtualized data center consisting of 100 servers and hosting 10 applications. Each application is CPU intensive and receives requests exogenously according to a random arrival process of rate λ_i. In the simulation setup, each CPU is assumed to follow a quadratic power-frequency relationship similar to the experimentally obtained curve in Fig. 2. Specifically, each CPU is assumed to have a discrete set of frequency options in the interval [1.6 GHz, 2.6 GHz] at increments of 0.2 GHz, and the corresponding power consumption (in Watts) at frequency f is given by P_min + θ(f − 1.6 GHz)², where P_min = 120 W and θ = 120 W/(GHz)². Thus, the CPU power consumption at the highest frequency is 240 W. We assume that half of the servers in the data center are always ON and that decisions to dynamically turn servers ON/OFF are applied to the remaining servers. Note that the dynamic operating frequency decisions are still applied to all servers. The frame size is T = 1000 slots, and the simulations were run for one million slots. The number of new requests generated for an application i in a slot is assumed to be uniformly and randomly distributed in [0, 2λ_i]. On average, a server running at the minimum (maximum) speed can process 200 (400) requests/slot. In the simulations, the throughput utility weights are chosen to be equal for all applications, so that α_i = α ∀i.

[Fig. 3. Average total utility vs V under DCA and DCA-M for γ ∈ {0.75, 1.0, 2.0}.]
[Fig. 4. Average delay of admitted requests vs V under DCA and DCA-M for γ ∈ {0.75, 1.0, 2.0}.]

In the first experiment, we fix the input rate λ_i = 2000 requests/slot for all applications and simulate the DCA and DCA-M algorithms for different choices of the ratio γ = α/β. Fig. 3 shows the total average utility for different values of the input parameter V under the two control algorithms. We observe that the performance of DCA-M is very close to that of DCA. Further, the total average utility achieved increases with V and converges to a maximum value for larger values of V, as predicted by (16). Fig. 4 plots the average delay of the admitted requests vs V. It can be seen that the average delay increases linearly with V, as predicted by the bounds in (14), (15). Fig. 5 shows the fraction of declined requests vs V under both algorithms. This, along with Figs. 3 and 4, shows the [O(1/V), O(V)] utility-delay tradeoff offered by the DCA algorithm, where the average utility achieved can be pushed closer to the optimal value at the cost of a linear increase in average delay.

[Fig. 5. Fraction of declined requests vs V under DCA and DCA-M for γ ∈ {0.75, 1.0, 2.0}.]
[Fig. 6. Number of active servers over time under DCA.]

In the second experiment, we fix the parameters V = 5000, γ = 1.0 and consider a scenario where the input rate changes in an unpredictable manner. Specifically, for the first 1/3 of the simulation interval, the input rate is λ_i = 1000 requests/slot for all applications. The input rate then abruptly increases to 3000 requests/slot before dropping to 2000 requests/slot in the last 1/3 of the simulation interval. In Fig. 6, we plot the number of active servers vs. frame number under the DCA algorithm. It can be seen that the algorithm quickly adapts to the new workload by increasing or decreasing the number of active servers (and hence the instantaneous capacity) even when the workload changes in an unpredictable manner.

VIII. CONCLUSIONS AND FUTURE WORK

In this paper, we considered the problem of dynamic resource allocation and power management in virtualized data centers. Prior work in this area uses prediction based approaches for resource provisioning. In this work, we have used an alternate approach that makes use of the queueing information available in the system to make online control decisions.
This approach is adaptive to unpredictable changes in the workload and does not require estimation and prediction of its statistics. Our approach uses the recently developed technique of Lyapunov Optimization, which allows us to derive analytical performance guarantees for the algorithm. The main focus of this work was on building an analytical framework. As part of future work, we plan to implement our algorithm on a real system and use standard benchmark workloads and applications to evaluate its performance.

APPENDIX A
PROOF OF THEOREM 1 PART (2)

Here, we prove part (2) of Theorem 1 using the technique of Lyapunov Optimization [18]. This technique involves constructing an appropriate Lyapunov function of the queue backlogs in the system, defining the conditional Lyapunov drift of this function, and then developing a dynamic algorithm that minimizes this drift over all control policies. The performance bounds for this algorithm are obtained by comparing its Lyapunov drift with that of the backlog-independent optimal stationary, randomized policy described in Sec. IV-A. Let Q(t) = (U_11(t), ..., U_NM(t), W_1(t), ..., W_N(t)) represent the collection of all queue backlogs in the system. We define the following Lyapunov function:

$$L(Q(t)) = \frac{1}{2} \left[ \sum_{i \in A, j \in S} U_{ij}^2(t) + \sum_{i \in A} W_i^2(t) \right]$$
Define the conditional Lyapunov drift Δ(Q(t)) as follows:

$$\Delta(Q(t)) = \mathbb{E}\{ L(Q(t+1)) - L(Q(t)) \mid Q(t) \}$$

Using the queueing dynamics (2) and (5), the conditional Lyapunov drift under any control policy (including DCA) can be bounded as follows:

$$\Delta(Q(t)) \le B - \sum_{i,j} U_{ij}(t)\, \mathbb{E}\{ \mu_{ij}(I_j(t)) - R_{ij}(t) \mid Q(t) \} - \sum_i W_i(t)\, \mathbb{E}\Big\{ \sum_j R_{ij}(t) - R_i(t) \,\Big|\, Q(t) \Big\} \quad (17)$$

where

$$B = \frac{(A_{\max})^2 + N M \mu_{\max}^2}{2} \quad (18)$$

For a given control parameter V ≥ 0, we subtract the reward metric $V \mathbb{E}\{ \sum_i \alpha_i R_i(t) - \beta \sum_j P_j(t) \mid Q(t) \}$ from both sides of the drift inequality (17) and rearrange terms to get the following:

$$\begin{aligned}
\Delta(Q(t)) - V \mathbb{E}\Big\{ \sum_i \alpha_i R_i(t) - \beta \sum_j P_j(t) \,\Big|\, Q(t) \Big\} \le B
&- \sum_{i,j} U_{ij}(t)\, \mathbb{E}\{ \mu_{ij}(I_j(t)) \mid Q(t) \} + V \beta \sum_j \mathbb{E}\{ P_j(t) \mid Q(t) \} \\
&- \sum_{i,j} \mathbb{E}\{ R_{ij}(t) \big( W_i(t) - U_{ij}(t) \big) \mid Q(t) \} - \sum_i \mathbb{E}\{ R_i(t) \big( V \alpha_i - W_i(t) \big) \mid Q(t) \}
\end{aligned} \quad (19)$$

From the above, it can be seen that the dynamic control algorithm DCA described in Sec. V-A is designed to take Admission Control, Routing, and Resource Allocation decisions that minimize the right hand side of (19) over all possible options, including the stationary policy of Sec. IV-A. Theorem 1 part (2) can now be shown using a direct application of the Lyapunov Optimization Theorem (see Theorem 5.4 in [18]) along with a T-slot delayed Lyapunov analysis.²

REFERENCES

[1] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. The cost of a cloud: research problems in data center networks. ACM SIGCOMM Computer Communication Review, vol. 39, no. 1, Jan. 2009.
[2] X. Fan, W. Weber, and L. Barroso. Power provisioning for a warehouse-sized computer. In Proceedings of ISCA, June 2007.
[3] R. Raghavendra, P. Ranganathan, V. Talwar, Z. Wang, and X. Zhu. No power struggles: coordinated multi-level power management for the data center. In Proceedings of ASPLOS, March 2008.
[4] A. Gandhi, M. Harchol-Balter, R. Das, and C. Lefurgy. Optimal power allocation in server farms. In Proceedings of SIGMETRICS, June 2009.
[5] A. Wierman, L. L. H. Andrew, and A. Tang. Power-aware speed scaling in processor sharing systems. In Proceedings of INFOCOM, April 2009.
[6] P. Padala, K.-Y. Hou, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, and A. Merchant. Automated control of multiple virtualized resources. In Proceedings of EuroSys, April 2009.
[7] P. Padala, K. G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, A. Merchant, and K. Salem. Adaptive control of virtualized resources in utility computing environments. In Proceedings of EuroSys, March 2007.
[8] X. Zhu, M. Uysal, Z. Wang, S. Singhal, A. Merchant, P. Padala, and K. G. Shin. What does control theory bring to systems research? ACM SIGOPS Operating Systems Review, vol. 43, no. 1, Jan. 2009.
[9] X. Liu, X. Zhu, P. Padala, Z. Wang, and S. Singhal. Optimal multivariate control for differentiated services on a shared hosting platform. In Proceedings of CDC, Dec. 2007.
[10] D. Kusic and N. Kandasamy. Power and performance management of virtualized computing environments via lookahead control. In Proceedings of ICAC, June 2009.
[11] X. Wang, D. Lan, X. Fang, M. Ye, and Y. Chen. A resource management framework for multi-tier service delivery in autonomic virtualized environments. In Proceedings of NOMS, April 2008.
[12] S. Govindan, J. Choi, B. Urgaonkar, A. Sivasubramaniam, and A. Baldini. Statistical profiling-based techniques for effective power provisioning in data centers. In Proceedings of EuroSys, April 2009.
[13] Y. Chen, A. Das, W. Qin, A. Sivasubramaniam, Q. Wang, and N. Gautam.
Managing server energy and operational costs in hosting centers. In Proceedings of SIGMETRICS, June 2005.
[14] J. S. Chase, D. C. Anderson, P. N. Thakar, A. M. Vahdat, and R. P. Doyle. Managing energy and server resources in hosting centers. In Proceedings of SOSP, Oct. 2001.
[15] B. Urgaonkar, P. Shenoy, and T. Roscoe. Resource overbooking and application profiling in shared hosting platforms. In Proceedings of OSDI, Dec. 2002.
[16] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, and A. Tantawi. An analytical model for multi-tier internet services and its applications. In Proceedings of SIGMETRICS, June 2005.
[17] M. J. Neely. Energy optimal control for time varying wireless networks. IEEE Transactions on Information Theory, vol. 52, no. 7, pp. 2915-2934, July 2006.
[18] L. Georgiadis, M. J. Neely, and L. Tassiulas. Resource allocation and cross-layer control in wireless networks. Foundations and Trends in Networking, vol. 1, no. 1, pp. 1-149, 2006.

²Details omitted for brevity.