2013 Proceedings IEEE INFOCOM

The Constrained Ski-Rental Problem and its Application to Online Cloud Cost Optimization

Ali Khanafer, Murali Kodialam, and Krishna P. N. Puttaswamy
Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, USA, Email: khanaf@illinois.edu
Bell Laboratories, Alcatel-Lucent, Murray Hill, NJ, USA, Email: {murali.kodialam, krishna.puttaswamy naga}@alcatel-lucent.com
*This work was done while the first author was a summer intern at Bell Laboratories, Alcatel-Lucent.

Abstract: Cloud service providers (CSPs) enable tenants to elastically scale their resources to meet their demands. In fact, there are various types of resources offered at various price points. While running applications on the cloud, a tenant aiming to minimize cost is often faced with crucial trade-off considerations. For instance, upon each arrival of a query, a web application can either choose to pay for CPU to compute the response fresh, or pay for cache storage to store the response so as to reduce the compute costs of future requests. The Ski-Rental problem abstracts such scenarios where a tenant is faced with a to-rent-or-to-buy trade-off; in its basic form, a skier should choose between renting or buying a set of skis without knowing the number of days she will be skiing. In this paper, we introduce a variant of the classical Ski-Rental problem in which we assume that the skier knows the first (or second) moment of the distribution of the number of ski days in a season. We demonstrate that utilizing this information leads to achieving the best worst-case expected competitive ratio (CR) performance. Our method yields a new class of randomized algorithms that provide arrivals-distribution-free performance guarantees. Further, we apply our solution to a cloud file system and demonstrate the cost savings obtained in comparison to other competing schemes. Simulations illustrate that our scheme exhibits robust average-cost performance that combines the best of the well-known deterministic and randomized schemes previously proposed to tackle the Ski-Rental problem.

I. INTRODUCTION

Cloud service providers (CSPs) such as Amazon and Microsoft rent out resources, such as CPU, memory, storage, etc., at various price points and offer their tenants the ability to elastically scale the resources (up or down) depending on the demand. Taking advantage of these services, cloud-based applications have been widely deployed in recent years at a rapid pace. Since many services have been virtualized, it is easy for an enterprise to scale the amount of resources needed to satisfy the current demand for a service by scaling the number of virtual machines (VMs) supporting that service.

Interestingly, scaling the number of VMs is not the only way to reduce costs in a cloud-based service. Consider, for instance, a web application running on the cloud. Each time this service receives a query, the application has the following two options:
• Recompute the query from scratch. This involves the CPU and I/O costs, if any, for using the disk.
• Compute the result and store it in the cache. This will incur the storage cost of the cache; however, it would save the CPU and I/O costs the next time the query is executed with the same parameters.
Choosing the more economic option for a given application will depend on the relative costs of CPU, I/O, and cache storage. In addition, this will depend on the frequency at which this application is accessed.

A similar scenario arises when running a file system in the cloud. When a request for a block of data arrives, the application has the following options:
• Read the data from the disk and return it to the user, incurring an I/O cost.
• Store the block in the cache and return it from the cache subsequently, which incurs the I/O and storage costs of the cache. However, it does not incur the disk I/O cost, which is typically more expensive than the cache I/O cost.

Evidently, there are many cost-based decisions that have to be made in the cloud even when the traffic is not varying considerably. This problem becomes quite important when the costs of different options vary widely. For example, consider the pricing in Table I.
Within Amazon, a user could choose to store an object in ElastiCache (acts as a cache), which has a high storage cost but free I/Os, or Amazon S3 (acts as a disk), which has a low storage cost but very high per-I/O costs. If many requests to the same file arrive in a short interval, then it is cheaper to store and serve the file from the cache instead of serving it from the disk. But if the queries are far apart, then it is cheaper to serve the file from the disk directly.

TABLE I
COST OPTIMIZATION OPPORTUNITIES ACROSS PROVIDERS (AS OF JULY 7TH). READ AND WRITE COSTS ARE PER OPERATION, WHILE STORAGE IS PER GB PER MONTH. A IS FOR AMAZON AND M IS FOR MICROSOFT.

Service Name       Storage   Read   Write
(A) ElastiCache    38
(A) S3             .         6      6
(M) Azure          .5        6      6

Another dimension to this problem is the fact that the costs of different options vary across different CSPs. For instance, in Table I, Azure has lower write costs but higher storage costs compared to S3. Hence, there is scope for
splitting a service across multiple CSPs to further optimize the locations of the disk, the read operation, and the write operation. There are performance implications for some of these decisions which we do not deal with in this paper. Instead, we focus on cost optimization. In our ongoing work, we are considering cost optimization along with performance constraints.

Generally, due to the per-unit-time cost of various resources, there are many situations where costs can be optimized in the cloud by trading off compute vs. storage, disk vs. cache, bandwidth vs. cache, etc. Some of these problems have been explored in the recent past [1], [2]. In fact, these problems can be abstracted using the classical Ski-Rental problem, which encapsulates the fundamental trade-off between renting or buying a certain service when the period of usage is not known a priori to the person interested in the service.

Ski-Rental problems were first described in [3] in the context of snoopy caching. In its basic form, a designer is faced with the option of either buying or renting a set of skis. The designer does not know the number of days she will be skiing and is interested in minimizing the overall cost of her trip. Many variants of the Ski-Rental problem have been studied in the literature; see [4], [5] and the references therein. In the compute versus storage example, the act of buying the skis can be mapped to recomputing the query, and the unknown number of ski days is equivalent to the fact that we do not know how many times and how frequently a query will be executed.

There are two classes of competitive algorithms used to tackle the Ski-Rental problem: deterministic and probabilistic. The performance measure of these algorithms is the competitive ratio (CR): the ratio between the cost incurred when an online algorithm is used and that incurred when an offline algorithm (that knows the future) is utilized. The deterministic algorithm has a worst-case CR of $2$, whereas the probabilistic algorithm has a worst-case CR of $e/(e-1)$. It has been shown that these ratios are the best possible using a standard argument called Yao's Minimax Principle; see [6], p. 35.
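The two classical guarantees quoted above can be checked numerically. The following is a minimal sketch, assuming a continuous-time model with a rental price of 1 per unit time and a buying price `B`; the function names are illustrative and not taken from the paper, and the density used for the randomized scheme is the standard form $e^{x/B}/(B(e-1))$ on $[0, B]$.

```python
import math

def det_cr(y, B):
    """CR of the break-even deterministic rule: rent until time B, then buy."""
    cost = y if y < B else 2 * B      # rent only, or rent for B units and then pay B
    return cost / min(y, B)           # the offline optimum pays min(y, B)

def prob_density(x, B):
    """Standard randomized buying-time density on [0, B] (assumed form)."""
    return math.exp(x / B) / (B * (math.e - 1))

def expected_cr(density, y, B, n=20000):
    """Expected CR at an arrival y <= B, by midpoint integration over the buying time x."""
    h = B / n
    cost = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        # buying at x <= y costs x in rent plus B; otherwise only y is paid in rent
        cost += ((x + B) if x <= y else y) * density(x, B) * h
    return cost / y

B = 10.0
worst_det = max(det_cr(y, B) for y in (1.0, 5.0, 9.9, 10.0, 50.0))
prob_values = [expected_cr(prob_density, y, B) for y in (1.0, 5.0, 10.0)]
```

Under these assumptions, `worst_det` evaluates to 2, and every entry of `prob_values` is close to $e/(e-1) \approx 1.58$ regardless of the arrival time.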
From a worst-case CR standpoint, one would prefer the randomized approach to the deterministic one. Nonetheless, one can ask the following interesting question: can we further improve the worst-case performance of the randomized algorithm given extra information about the distribution of the arrivals? In this paper, we are interested in performing worst-case expected CR analysis in search for an improvement upon the deterministic and randomized approaches previously proposed. In other words, we aim to devise randomized algorithms that provide worst-case performance guarantees independent of the distribution of the arrivals. To this end, we formulate a constrained version of the Ski-Rental problem and show that our solution to this problem gives rise to distributions that outperform both the deterministic and randomized approaches.

The main contributions of this paper are as follows:
• We formulate the problem as a continuous-kernel zero-sum game between the algorithm designer who seeks to minimize the expected CR and an adversary (or nature) attempting to maximize it. Also, we derive the optimal mixed strategies for both players in closed form.
• We propose a new variant of the Ski-Rental problem where the algorithm designer can exploit the knowledge of the first and second moments of the adversary's strategy. We call this problem the Constrained Ski-Rental Problem. Our formulation leads to a new class of randomized algorithms that provide arrivals-distribution-free performance guarantees; we show that our algorithms outperform existing approaches in the worst-case expected CR sense.
• Finally, we apply our theoretical findings to cloud file systems and assess the performance of our proposed approach using numerical studies.

The rest of the paper is organized as follows. In Section II, we outline the standard Ski-Rental problem, and we present the Constrained Ski-Rental problem in Section III. In Sections IV and V, we solve the problems of the designer and the adversary. Finally, we evaluate our solution using simulations as well as synthetic and real-world file system datasets in Section VI. We conclude the paper in Section VII.

II. THE SKI-RENTAL PROBLEM

The Ski-Rental problem captures the trade-off between buying and renting a product (or service) when the time period for which the product is going to be used is not known in advance. In the standard Ski-Rental problem, a user (designer) is interested in determining whether to buy skis at a cost of \$$B$ or to rent them at a cost of \$1 per day. Clearly, if the user skis for less than $B$ days, it is better to rent the skis. On the other hand, if she skis for more than $B$ days, then it is better to buy the skis at the outset. The challenge stems from the fact that the user does not know ahead of time how many days she is going to ski.

Consider the equivalent problem in a cloud cost optimization setting. In particular, consider the problem that was outlined in the introduction as to whether to recompute the result of a web query each time it is requested or whether to store the result and serve it out. Let the cost of storing the query be \$1 per unit time and the cost of recomputation be \$$B$. If the next query arrives before $B$ time units, it is better to store the result and serve it out. If the next query arrives after $B$ time units, then it is better to recompute when the query arrives. The system does not know ahead of time the arrival time of the next query. A similar analogy can be made for the disk versus cache storage problem.

To address this uncertainty, it is of interest to devise online algorithms capable of determining the optimal choice for the user at every time instant. In [3], the authors propose a deterministic approach (referred to as DET in the rest of the paper) which dictates that the designer should rent the skis up until the $B$-th day, at which point she should buy the skis. The CR achieved by this scheme is $1$ for arrivals before $B$ and $2$ for arrivals occurring after time $B$. Hence, this scheme is 2-competitive, i.e., it yields a worst-case CR of $2$. In [7], the authors propose an optimal randomized scheme (referred to as PROB in the rest of the paper) which switches from renting to buying
the skis at a carefully chosen random time. This algorithm achieves an expected CR of $e/(e-1)$ regardless of the arrival time. It has also been shown that the best achievable performance of deterministic and probabilistic algorithms are $2$ and $e/(e-1)$, respectively.

Here, we propose a new approach which makes use of extra information about the adversary's strategy. We construct our scheme in two steps:
• We start with the assumption that the designer possesses information about the adversary's strategy. In particular, we consider two types of information: the mean and the second moment. Intuitively, knowing the average number of ski days can help the designer decide on whether she should rent or buy the skis. We first derive the optimal policy for this problem under a constraint on the first moment; we refer to this policy as 1-PROB. Then, we derive the optimal policy for the Ski-Rental problem with a constraint on the second moment; we refer to this policy as 2-PROB.
• We demonstrate that the optimal policies are almost independent of the first and second moments (in a sense made precise later). These policies can be used even when the number of ski days is not known. We show both theoretically as well as experimentally that the performance of 1-PROB and 2-PROB is more robust than the performance of DET and PROB.
We now formulate the Constrained Ski-Rental Problem.

III. THE CONSTRAINED SKI-RENTAL PROBLEM

Consider a Ski-Rental problem where the rental price is \$1 and the buying price is \$$B$ ($B > 1$). Let $x$ be the time at which the designer decides to buy the skis, and let $p(x)$ be the probability distribution over $x$. Also, let $y$ be the arrival time (or the number of snow days) chosen by the adversary, with $q(y)$ being the probability distribution over $y$. The designer (adversary) is interested in selecting $p(x)$ ($q(y)$) in such a way that would minimize (maximize) the expected CR. The cost incurred by the designer is a function of the number of snow days, which is controlled by the adversary and is not known by the designer, and the randomized strategy applied by the designer. The designer's strategy is an online algorithm, as the designer acts without knowledge of the number of snow days. We will denote the expected cost incurred by the designer by $C(p(x), y)$.
Let $OPT(y)$ denote the cost incurred by an optimal offline algorithm. With these definitions, we can now write the CR as:

\[ c = \frac{C(p(x), y)}{OPT(y)}. \]

Let us first determine the possible values for $OPT(y)$. If $y \le B$, then the strategy that minimizes the overall cost is renting for the period $[0, y]$. On the other hand, if $y > B$, then it is optimal to buy the skis. Formally, we have

\[ OPT(y) = \begin{cases} y, & \text{if } y \le B, \\ B, & \text{otherwise.} \end{cases} \tag{1} \]

The results we present here apply to problems with any renting price; setting it to unity is merely a scaling adopted for simplicity.

The value of $C(p(x), y)$ depends on when the designer decides to buy the skis. If $x \le y$, then the designer will have to pay \$$x$ for the rental period in addition to the buying price of \$$B$. However, if $y < x$, then the designer will not have to buy the skis and will make a payment of \$$y$ as renting fees. Hence, for $y \le B$, the expected cost can be written as

\[ C(p(x), y) = \int_0^{y} (x + B)\, p(x)\, dx + y \int_y^{B} p(x)\, dx. \tag{2} \]

The case when $y > B$ will be discussed in the next section.

Because the objectives of the designer and the adversary are conflicting, it is natural to use game theory to derive the optimal strategies for both players. In Section III-B, we will formulate the problem as a continuous-kernel zero-sum game. However, before we formulate and analyze the continuous-kernel zero-sum game, let us first consider a discretized version of the game, i.e., we will first formulate a matrix zero-sum game. This will aid us in understanding the decision process of the designer and the adversary.

A. Matrix Zero-Sum Game

Assume that the (pure) strategy space for both the designer and the adversary is the countably infinite set $\{1, 2, 3, \dots\}$. Let $A = [A_{ij}]$ be the matrix of the zero-sum game, with the designer being the row player and the adversary being the column player:

\[ A := \begin{pmatrix}
B & \frac{B}{2} & \frac{B}{3} & \cdots & \frac{B}{B} & \frac{B}{B} & \cdots \\
1 & \frac{B+1}{2} & \frac{B+1}{3} & \cdots & \frac{B+1}{B} & \frac{B+1}{B} & \cdots \\
1 & 1 & \frac{B+2}{3} & \cdots & \frac{B+2}{B} & \frac{B+2}{B} & \cdots \\
\vdots & \vdots & \vdots & & \vdots & \vdots &
\end{pmatrix} \]

The $i$-th row corresponds to the case where the designer chooses to rent the skis for $i - 1$ days and buy the skis on the $i$-th day. The $j$-th column corresponds to the adversary choosing the number of snow days to be $j$. Hence, the $(i, j)$-th element of $A$ is the CR corresponding to the designer choosing $i$ and the adversary choosing $j$.
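The matrix game can be made concrete with a short sketch. It assumes the convention stated above (row $i$ rents for $i-1$ days and buys on day $i$, and the offline optimum pays $\min(j, B)$); the helper names `cr_entry`, `cr_matrix` and `column` are illustrative only.

```python
from fractions import Fraction

def cr_entry(i, j, B):
    """CR when the designer buys on day i (renting i-1 days) and there are j snow days."""
    cost = j if j < i else (i - 1) + B   # never buys, or rents i-1 days and then buys
    return Fraction(cost, min(j, B))     # offline optimum: rent if j <= B, else buy

def cr_matrix(B, size):
    """Top-left size x size block of the infinite matrix A (days are 1-indexed)."""
    return [[cr_entry(i, j, B) for j in range(1, size + 1)]
            for i in range(1, size + 1)]

def column(A, j):
    """Return the j-th column (1-indexed) of A as a list."""
    return [row[j - 1] for row in A]

B = 4
A = cr_matrix(B, size=2 * B)
# Column B+1 is never smaller than column B and is strictly larger in some row.
pairs = list(zip(column(A, B + 1), column(A, B)))
dominates = all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)
```

For this small instance, `dominates` is true, which illustrates why the adversary never needs to play column $B$ when column $B+1$ is available.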
By studying the matrix game $A$, we notice that the $(B+1)$-st column dominates the $B$-th column, i.e., $A_{i(B+1)} \ge A_{iB}$, with $A_{i(B+1)} > A_{iB}$ for at least one $i$. In fact, the $(B+j)$-th column dominates the $(B+j-1)$-th column for $j \ge 1$. Hence, we can remove the dominated columns from the matrix, and the resulting matrix will be strategically equivalent to $A$ [8]. After removing the dominated columns, we readily see that the $B$-th row dominates the $(B+i)$-th rows, $i \ge 1$, i.e., $A_{Bj} \le A_{(B+i)j}$, with $A_{Bj} < A_{(B+i)j}$ for at least one $j$, $i \ge 1$. After
removing the dominated rows, the resulting matrix game $\tilde{A}$ can be written as

\[ \tilde{A} = \begin{pmatrix}
B & \frac{B}{2} & \frac{B}{3} & \cdots & \frac{B}{B-1} & 1 \\
1 & \frac{B+1}{2} & \frac{B+1}{3} & \cdots & \frac{B+1}{B-1} & \frac{B+1}{B} \\
1 & 1 & \frac{B+2}{3} & \cdots & \frac{B+2}{B-1} & \frac{B+2}{B} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
1 & 1 & 1 & \cdots & 1 & \frac{2B-1}{B}
\end{pmatrix} \]

Note that the first $B$ rows in the $(B+j)$-th, $j \ge 1$, columns have the same values. Hence, the adversary will exhibit the same performance regardless of which column it chooses, as long as $j > B$. It is important to note that, by strict dominance, we were able to convert an infinite game into a finite one. More importantly, one can obtain an exact equilibrium for the game and does not need to construct an $\epsilon$-equilibrium, as the performance the adversary achieves as $j \to \infty$ is identical to what it achieves when $j = B + k$, $k \ge 1$, $k \in \mathbb{N}$.

B. Continuous-Kernel Zero-Sum Game

We can obtain insights from $\tilde{A}$ when deriving the strategies of the designer and the adversary in the continuous-time case. In the discrete-time case, the strategy space of the designer reduces to $\{1, 2, \dots, B\}$. Hence, in the continuous-time case, the designer needs only to assign probabilities over the interval $[0, B]$. We can then write the designer's strategy space as

\[ P = \left\{ p(x),\ x \in [0, B] : \int_0^B p(x)\, dx = 1 \right\}. \]

However, the situation is different for the adversary, as its strategy space in the discrete-time case becomes $\{1, 2, \dots, B-1, K\}$, where $K \ge B$. Hence, the adversary must construct a two-part randomized strategy: a probability density over the interval $[0, B)$ and a probability mass at $K$. We will denote the probability mass at $K$ by $q_K = q(y = K)$. Also, we will consider two different constraints on the adversary's strategy. In Section IV-A, we will assume that the adversary has a constraint on the first moment, denoted $\mu_1$. Formally, we can write

\[ \int_0^B y\, q(y)\, dy + q_K K = \mu_1. \tag{3} \]

The strategy space of the adversary in this case is

\[ Q = \left\{ q(y),\ y \in [0, B) \cup \{K\},\ K \ge B : \int_0^B q(y)\, dy + q_K = 1 \text{ and (3) is satisfied} \right\}. \]

In Section IV-B, we will replace the constraint on the first moment with one on the second moment, denoted $\mu_2$, as follows:

\[ \int_0^B y^2 q(y)\, dy + K^2 q_K = \mu_2. \tag{4} \]

We now formulate the zero-sum game played by the designer and the adversary. The objective function of the designer is the expected CR, denoted $J(p, q)$; it follows that the objective function of the adversary is $-J(p, q)$. Hence, it is the average, with respect to $q(y)$, of the CR for $y \in [0, B)$ and $y = K$. Because $K \ge B$ and $x \in [0, B]$, we conclude that the cost incurred by the player will be $x + B$ at $K$.
Thus, using (1) and (2), we can write the expected CR as

\[ J(p, q) = E_q[c] = \int_0^B \frac{C(p(x), y)}{y}\, q(y)\, dy + q_K \int_0^B \frac{x + B}{B}\, p(x)\, dx. \]

We will denote the zero-sum game by $G = \{P, Q, J\}$. The solution concept we adopt in studying $G$ is the mixed-strategy saddle-point equilibrium defined below.

Definition 1: The pair $(p^*, q^*)$ constitutes a saddle-point equilibrium in mixed strategies for $G$ if

\[ J(p^*, q) \le J(p^*, q^*) \le J(p, q^*), \]

for any $p \in P$ and $q \in Q$.

By Von Neumann's minimax theorem [8], we know that

\[ \min_{p(x) \in P}\ \max_{q(y) \in Q} J(p, q) = \max_{q(y) \in Q}\ \min_{p(x) \in P} J(p, q). \tag{5} \]

In the following, we will make use of this fact to derive the optimal mixed strategies for both players.

IV. THE DESIGNER'S PROBLEM

The designer's optimal strategy $p^*(x)$ can be obtained by solving the following problem:

\[ \min_{p(x) \in P}\ \max_{q(y) \in Q} J(p, q). \]

To solve this problem, we will take the following steps:
1. We first construct the dual to the maximization problem. The dual turns out to be a linear program (LP) with two equality constraints.
2. By differentiating one of the equality constraints twice, we obtain a first-order ODE which can be used, along with the fact that $p(x)$ is a probability distribution function (PDF), to obtain $p(x)$ as a function of the Lagrange multiplier associated with the moment constraint on $q(y)$.
3. By substituting the obtained PDF into the original equality constraints, we obtain an LP in the Lagrange multipliers which can then be readily solved.
Note that in Step 3, we manage to convert an infinite-dimensional optimization problem (as we were originally solving for $p(x)$) to a finite scalar optimization problem. We will first derive $p^*(x)$ for the case when the first moment is constrained. Then, we will proceed to the case when the second moment is constrained.

A. First-Moment-Constrained Ski-Rental Problem

We will start by constructing the dual problem. The Lagrangian associated with the maximization problem is

\[ L(q(y), \lambda_1, \lambda_2) = \int_0^B \underbrace{\left( \frac{C(p(x), y)}{y} - \lambda_1 - \lambda_2 y \right)}_{:= h_1(y)} q(y)\, dy + q_K \underbrace{\left( \int_0^B \frac{x + B}{B}\, p(x)\, dx - \lambda_1 - \lambda_2 K \right)}_{:= h_2} + \lambda_1 + \lambda_2 \mu_1. \]
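The dual function $g(\lambda_1, \lambda_2) = \sup_{q(y) \in Q} L(q(y), \lambda_1, \lambda_2)$ is therefore given by

\[ g(\lambda_1, \lambda_2) = \begin{cases} \lambda_1 + \lambda_2 \mu_1, & \text{if } h_1(y) = 0,\ h_2 = 0, \\ \infty, & \text{otherwise.} \end{cases} \]

Hence, after adding the constraints on $p(x)$, the dual becomes

\[ \min_{p(x) \in P,\ \lambda_1,\ \lambda_2}\ \lambda_1 + \lambda_2 \mu_1 \tag{6} \]
\[ \text{s.t.}\quad \frac{C(p(x), y)}{y} = \lambda_1 + \lambda_2 y, \tag{7} \]
\[ \int_0^B \frac{x + B}{B}\, p(x)\, dx = \lambda_1 + \lambda_2 K, \tag{8} \]
\[ y \in [0, B],\quad \lambda_1, \lambda_2 \ge 0. \]

Because (7) holds for all $y$, we can differentiate both sides (after multiplying by $y$) twice with respect to $y$ and replace $y$ with $x$ to obtain

\[ \frac{d}{dx} p(x) = \frac{1}{B} \left( p(x) + 2\lambda_2 \right). \]

This is a first-order ODE whose solution is

\[ p(x) = \alpha e^{x/B} - 2\lambda_2. \tag{9} \]

To solve for $\alpha$, we use the fact that $p(x)$ is a PDF to obtain

\[ \alpha = \frac{1 + 2B\lambda_2}{B(e - 1)}. \]

By substituting (9) into (7) and (8), we obtain the following equivalent conditions:

\[ \frac{2B(e - 2)}{e - 1}\, \lambda_2 + \lambda_1 = \frac{e}{e - 1}, \tag{10} \]
\[ \left( K - \frac{B(3 - e)}{e - 1} \right) \lambda_2 + \lambda_1 = \frac{e}{e - 1}. \tag{11} \]

Further, we must have $p(x) \ge 0$ for $x \in [0, B]$, or equivalently $(1 + 2B\lambda_2)\, e^{x/B} \ge 2B(e - 1)\lambda_2$. Hence, requiring the PDF to be non-negative imposes the following constraints on $\lambda_2$:

\[ \lambda_2 \le \frac{e^{x/B}}{2B\left(e - 1 - e^{x/B}\right)}, \quad \text{if } x < B \log(e - 1), \tag{12} \]
\[ \lambda_2 \ge \frac{e^{x/B}}{2B\left(e - 1 - e^{x/B}\right)}, \quad \text{if } B \log(e - 1) < x \le B. \tag{13} \]

Note that the right-hand side (RHS) of (12) is positive and strictly increasing for $x \in [0, B \log(e - 1))$, whereas the RHS of (13) is negative for $x \in (B \log(e - 1), B]$. Therefore, we must have $0 \le \lambda_2 \le \frac{1}{2B(e - 2)}$. The designer's problem is thus equivalent to the following LP:

\[ \min_{\lambda_1, \lambda_2}\ \lambda_1 + \lambda_2 \mu_1 \tag{14} \]
\[ \text{s.t.}\quad \frac{2B(e - 2)}{e - 1}\, \lambda_2 + \lambda_1 = \frac{e}{e - 1}, \]
\[ \left( K - \frac{B(3 - e)}{e - 1} \right) \lambda_2 + \lambda_1 = \frac{e}{e - 1}, \]
\[ \lambda_1 \ge 0,\quad 0 \le \lambda_2 \le \frac{1}{2B(e - 2)}. \]

By the fundamental theorem of LPs, we know that the solutions to this LP form a convex polytope, and that each basic feasible solution $(\lambda_1, \lambda_2)$ is a corner point of the polytope (and vice versa). Note that if $\lambda_2 > 0$, we must have $B = K$ in order to satisfy (10) and (11) simultaneously. Hence, we have two corner points: $b_1 = \left( \frac{e}{e - 1}, 0 \right)$ and $b_2 = \left( 1, \frac{1}{2B(e - 2)} \right)$. The corresponding values at these points are:

\[ (\lambda_1 + \lambda_2 \mu_1)\big|_{b_1} = \frac{e}{e - 1}, \qquad (\lambda_1 + \lambda_2 \mu_1)\big|_{b_2} = 1 + \frac{\mu_1}{2B(e - 2)}. \tag{15} \]

These values are the resulting expected CRs. Note that when $\lambda_2 = 0$, the problem becomes a classical Ski-Rental problem without a constraint on the mean, whose optimal expected CR under a randomized algorithm is known to be $e/(e - 1)$, which is what we obtain under $b_1$. By comparing the obtained values at $b_1$ and $b_2$, we conclude that if $\mu_1 \le \frac{2B(e - 2)}{e - 1}$, the optimal PDF is

\[ p^*(x) = \begin{cases} \dfrac{e^{x/B} - 1}{B(e - 2)}, & 0 \le x \le B, \\ 0, & \text{otherwise.} \end{cases} \tag{16} \]

Otherwise, the optimal solution is

\[ p^*(x) = \begin{cases} \dfrac{e^{x/B}}{B(e - 1)}, & 0 \le x \le B, \\ 0, & \text{otherwise.} \end{cases} \tag{17} \]

This is to be expected; when the first moment is high (more precisely, when $\mu_1 > \frac{2B(e - 2)}{e - 1}$), knowing its value does not provide the designer with extra information. Hence, the optimal randomized strategy becomes PROB.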
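As a sanity check on the derivation above, the sketch below numerically verifies that the first-moment-constrained density in (16) integrates to one and that it satisfies constraint (7) at the corner point $b_2$, i.e., that the expected CR at an arrival $y \le B$ equals $1 + y/(2B(e-2))$. The helper names and the midpoint-rule integrator are illustrative choices, not part of the paper.

```python
import math

def one_prob_density(x, B):
    """Density (16): (e^{x/B} - 1) / (B (e - 2)) on [0, B]."""
    return (math.exp(x / B) - 1.0) / (B * (math.e - 2.0))

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def expected_cr(y, B):
    """Left-hand side of (7): C(p, y) / y for 0 < y <= B, with C as in (2)."""
    buy_part = midpoint(lambda x: (x + B) * one_prob_density(x, B), 0.0, y)
    rent_part = y * midpoint(lambda x: one_prob_density(x, B), y, B)
    return (buy_part + rent_part) / y

B = 10.0
total_mass = midpoint(lambda x: one_prob_density(x, B), 0.0, B)
lam1, lam2 = 1.0, 1.0 / (2.0 * B * (math.e - 2.0))   # corner point b2
gaps = [abs(expected_cr(y, B) - (lam1 + lam2 * y)) for y in (1.0, 4.0, 10.0)]
```

Both `total_mass - 1` and the entries of `gaps` should be numerically negligible, which is consistent with the CR of this policy growing linearly in the arrival time over $[0, B]$.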
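Aside from the fact that the value of $\mu_1$ decides which PDF is optimal, it is interesting to note that (16) is independent of $\mu_1$. Thus, one can use this optimal PDF without needing to compute $\mu_1$.

B. Second-Moment-Constrained Ski-Rental Problem

We can perform a similar analysis by replacing the constraint on the first moment with one on the second moment of the adversary's strategy. Accordingly, we replace (3) with (4) in the definition of $Q$. By following the same steps leading to (14), we can obtain the following LP to be solved by the designer:

\[ \min_{\lambda_1, \lambda_2}\ \lambda_1 + \lambda_2 \mu_2 \]
\[ \text{s.t.}\quad \frac{3B^2(2e - 5)}{e - 1}\, \lambda_2 + \lambda_1 = \frac{e}{e - 1}, \]
\[ \left( K^2 - \frac{B^2(14 - 5e)}{e - 1} \right) \lambda_2 + \lambda_1 = \frac{e}{e - 1}, \]
\[ \lambda_1 \ge 0,\quad 0 \le \lambda_2 \le \frac{1}{3B^2(2e - 5)}. \]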
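It readily follows that if $\lambda_2 > 0$, we must have $B = K$ in order to satisfy the equality constraints simultaneously. Also, we have two corner points: $b_1 = \left( \frac{e}{e - 1}, 0 \right)$ and $b_2 = \left( 1, \frac{1}{3B^2(2e - 5)} \right)$. The corresponding expected CRs are:

\[ (\lambda_1 + \lambda_2 \mu_2)\big|_{b_1} = \frac{e}{e - 1}, \qquad (\lambda_1 + \lambda_2 \mu_2)\big|_{b_2} = 1 + \frac{\mu_2}{3B^2(2e - 5)}. \tag{18} \]

Hence, if $\mu_2 \le \frac{3B^2(2e - 5)}{e - 1}$, the optimal strategy becomes

\[ p^*(x) = \begin{cases} \dfrac{2}{B(2e - 5)} \left( e^{x/B} - \dfrac{x + B}{B} \right), & 0 \le x \le B, \\ 0, & \text{otherwise.} \end{cases} \tag{19} \]

Otherwise, the optimal solution is the PDF corresponding to PROB given in (17). We again notice that the optimal strategy is independent of the extra information the designer possesses, namely the second moment. Fig. 1 shows the distributions of PROB, 1-PROB, and 2-PROB.

[Fig. 1. Depiction of the PDFs of PROB, 1-PROB, and 2-PROB: $p(x)$ versus $x$.]

C. Performance Comparison

We will first compare the four schemes (DET, PROB, 1-PROB, and 2-PROB) based on their CR performance for all arrival values. Fig. 2 compares the CR performance of the four schemes for a fixed buying price $B$. For the proposed methods, the curves were computed using the left-hand side (LHS) of (7) for $y \le B$ and the LHS of (8) for $y > B$. We conclude that our approach strikes a balance between the CR performance of DET and PROB. Also, 2-PROB outperforms 1-PROB over most of the interval $y \le B$. As we impose constraints on higher moments, we expect that the performance will be further improved over this interval. This improvement is accompanied by a slight deterioration in performance over the interval $y > B$. Let us compare the CR achieved by 1-PROB and 2-PROB with that achieved by PROB for $y > B$. Evaluating the LHS of (8) for the two policies (with $K = B$), we first compute

\[ \left(\lambda_1 + \lambda_2 B\right)\big|_{b_2} = \frac{2e - 3}{2(e - 2)}, \qquad \left(\lambda_1 + \lambda_2 B^2\right)\big|_{b_2} = \frac{6e - 14}{3(2e - 5)}, \]

for 1-PROB and 2-PROB, respectively. Hence, we find the differences between the performance of 1-PROB, 2-PROB, and PROB to be

\[ \frac{2e - 3}{2(e - 2)} - \frac{e}{e - 1} = 0.114, \qquad \frac{6e - 14}{3(2e - 5)} - \frac{e}{e - 1} = 0.182. \]

[Fig. 2. The CR of the four algorithms (DET, PROB, 1-PROB, 2-PROB): $c$ versus $y$.]

We readily see that our scheme exhibits comparable performance to PROB for $y > B$.

Now, we average over the arrival values and compare the schemes based on their worst-case expected CR. For PROB, we already know that the expected CR is $e/(e - 1)$. In order to compare our methods with DET, we need the following lemmas.

Lemma 1: In the First-Moment-Constrained Ski-Rental problem, the worst-case expected CR for DET is at least $1 + \frac{\mu_1}{B}$ when $\mu_1 \le B$, and it is at least $2$ when $\mu_1 > B$.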
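Proof: Because we are looking for a lower bound on the worst-case CR, it suffices to find a distribution $\hat{q}(y)$ that yields the proclaimed values. Formally, we have

\[ \max_{q(y) \in Q}\ \min_{p(x) \in P} J(p, q) \ge \min_{p(x) \in P} J(p, \hat{q}). \tag{20} \]

Consider the following candidate distribution for $\mu_1 \le B$:

\[ \hat{q}(y) = \begin{cases} 1 - \dfrac{\mu_1}{B}, & y = 0, \\ \dfrac{\mu_1}{B}, & y = B^+, \\ 0, & \text{otherwise,} \end{cases} \]

where $B^+ = B + \epsilon$ and $\epsilon > 0$ is small. This distribution clearly satisfies the first-moment constraint as $\epsilon \to 0$. Assuming DET is used, we can now compute

\[ E_{\hat{q}}[c] = \left( 1 - \frac{\mu_1}{B} \right) + \frac{2\mu_1}{B} = 1 + \frac{\mu_1}{B}. \]

Similarly, to obtain $E_{\hat{q}}[c] = 2$ when $\mu_1 > B$, we can choose:

\[ \hat{q}(y) = \begin{cases} 1, & y = \mu_1, \\ 0, & \text{otherwise.} \end{cases} \]

Lemma 2: In the Second-Moment-Constrained Ski-Rental problem, the worst-case expected CR for DET is at least $1 + \frac{\mu_2}{B^2}$ when $\mu_2 \le B^2$, and it is at least $2$ when $\mu_2 > B^2$.

Proof: The proof is similar to that of Lemma 1. The distribution guaranteeing a worst-case CR of $1 + \frac{\mu_2}{B^2}$ when $\mu_2 \le B^2$ is

\[ \hat{q}(y) = \begin{cases} 1 - \dfrac{\mu_2}{B^2}, & y = 0, \\ \dfrac{\mu_2}{B^2}, & y = B^+, \\ 0, & \text{otherwise,} \end{cases} \]

and that yielding an expected CR of $2$ when $\mu_2 > B^2$ is

\[ \hat{q}(y) = \begin{cases} 1, & y = \sqrt{\mu_2}, \\ 0, & \text{otherwise.} \end{cases} \]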
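The two-point construction used in the proof of Lemma 1 can be reproduced with exact arithmetic. The sketch below assumes that the CR at $y = 0$ is taken to be 1 (both costs are zero there) and uses illustrative values for $B$, the first moment, and $\epsilon$; none of these numbers come from the paper.

```python
from fractions import Fraction

def det_cr(y, B):
    """CR of DET (rent until B, then buy); the CR at y = 0 is taken to be 1."""
    if y == 0:
        return Fraction(1)
    cost = y if y < B else 2 * B          # rent only, or rent B units and then buy
    return Fraction(cost) / min(y, B)

def two_point(mu1, B, eps):
    """Candidate distribution from the proof: mass on y = 0 and on y = B + eps."""
    return {Fraction(0): 1 - Fraction(mu1) / B, B + eps: Fraction(mu1) / B}

B, mu1, eps = Fraction(10), Fraction(3), Fraction(1, 1000)
q_hat = two_point(mu1, B, eps)
mean = sum(y * w for y, w in q_hat.items())                  # tends to mu1 as eps -> 0
expected = sum(det_cr(y, B) * w for y, w in q_hat.items())   # equals 1 + mu1 / B
```

With these values, the mean exceeds the target first moment only by a term proportional to `eps`, while the expected CR of DET is exactly $1 + \mu_1/B = 13/10$.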
From Lemmas 1 and 2, (15) and (18), and by comparing the obtained expected CRs for the schemes at hand, we can draw the following conclusions:
• When $\mu_1 \le \frac{2B(e - 2)}{e - 1}$ (or $\mu_2 \le \frac{3B^2(2e - 5)}{e - 1}$), 1-PROB (or 2-PROB) always outperforms DET and PROB;
• When $\mu_1 > \frac{2B(e - 2)}{e - 1}$ (or $\mu_2 > \frac{3B^2(2e - 5)}{e - 1}$), PROB outperforms DET, because $\frac{e}{e - 1} < 1 + \frac{\mu_1}{B}$ (or $\frac{e}{e - 1} < 1 + \frac{\mu_2}{B^2}$) for these values of $\mu_1$ (or $\mu_2$), and $\frac{e}{e - 1} = 1.58 < 2$. However, note that the optimal solution obtained using our methodology defaults to PROB over this range, as can be seen from the conditions leading to (16) and (17) (or (19)).
Hence, we conclude that for any value of $\mu_1$ (or $\mu_2$) our approach produces the policy yielding the best worst-case expected CR. Fig. 3 demonstrates this fact for a fixed $B$.

[Fig. 3. The worst-case expected CR for the four algorithms, in two panels: DET, PROB, and 1-PROB versus the first moment; DET, PROB, and 2-PROB versus the second moment.]

In order to rigorously show that this is the best possible CR, we generate an arrival time (number of ski days) distribution such that the bound given by the algorithm 1-PROB is tight. In order to do this, we solve the adversary's problem.

V. THE ADVERSARY'S PROBLEM

Let us restrict our attention to the case when only the first moment of $q(y)$ is constrained. (The adversary's problem under a second-moment constraint is similar to the one with a constraint on the mean and is omitted due to space limitation.) The adversary attempts to solve the following problem:

\[ \max_{q(y) \in Q}\ \min_{p(x) \in P} J(p, q). \]

We will take the following steps to solve this problem:
1. We first construct the dual to the minimization problem. The dual is shown to be an LP with an equality constraint.
2. By differentiating the equality constraint twice, we obtain a first-order ODE which can be used to solve for $q(y)$.
3. By evaluating the equality constraint at specific values and by using Von Neumann's theorem, we can solve for $q_K$. To obtain a range of possible values for $K$, we use the constraint on the first moment.

We proceed by constructing the dual problem. By following similar steps to the above, we obtain:

\[ \max_{q(y) \in Q,\ \lambda}\ \lambda \]
\[ \text{s.t.}\quad g(x) + q_K\, \frac{x + B}{B} = \lambda, \tag{21} \]
\[ g(x) = \int_x^B \frac{x + B}{y}\, q(y)\, dy + \int_0^x q(y)\, dy, \]
\[ x \in [0, B],\quad \lambda \ge 0. \]

By differentiating (21) twice, we obtain the following ODE:

\[ \frac{d}{dy} q(y) = \left( \frac{1}{y} - \frac{1}{B} \right) q(y), \]

whose solution is given by

\[ q(y) = \beta\, y\, e^{-y/B}. \tag{22} \]

To fully characterize $q^*(y)$, we need to solve for $\beta$, $K$, and $q_K$.
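Using the fact that $q(y)$ must integrate to $1 - q_K$, we get

\[ \beta = \frac{e\,(1 - q_K)}{B^2 (e - 2)}. \]

To obtain $q_K$, we invoke Von Neumann's theorem and make use of the fact that the values of (6) and (21) must be equal, because they are dual to each other. When $\mu_1 > \frac{2B(e - 2)}{e - 1}$, we have $\lambda = \frac{e}{e - 1}$. By evaluating (21) at $x = B$, we obtain $q_K = \frac{1}{e - 1}$. Because in this case the mean is large, and the designer does not use her knowledge of the mean, the adversary can relax the constraint on the mean and $K$ can be chosen freely.

In the case when $\mu_1 \le \frac{2B(e - 2)}{e - 1}$, we know that $\lambda = 1 + \frac{\mu_1}{2B(e - 2)}$, and hence $q_K = \frac{\mu_1}{2B(e - 2)}$. Here, the adversary needs to satisfy the mean constraint by selecting $K$ properly. Using (3), we get

\[ K = 2B(e - 2) + \frac{B(2e - 5)}{e - 2} - \frac{2B^2(2e - 5)}{\mu_1}. \]

If we select $K > B$, then we must have $\mu_1 > \frac{2B(e - 2)}{e - 1}$. But this again becomes the case where the information about the mean does not benefit the designer. Hence, the interesting case to consider is when $0 \le K \le B$. This translates to requiring:

\[ \mu_1 \ge \frac{2B(2e - 5)(e - 2)}{2(e - 2)^2 + (2e - 5)}. \]

Hence, the adversary's optimal strategy when $\mu_1 \le \frac{2B(e - 2)}{e - 1}$ is

\[ q^*(y) = \begin{cases} \dfrac{e \left( 2B(e - 2) - \mu_1 \right)}{2B^3 (e - 2)^2}\; y\, e^{-y/B}, & 0 \le y < B, \\[2mm] \dfrac{\mu_1}{2B(e - 2)}, & y = K, \\[1mm] 0, & \text{otherwise.} \end{cases} \tag{23} \]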
Otherwise, the optimal solution is

\[ q^*(y) = \begin{cases} \dfrac{e}{B^2(e - 1)}\; y\, e^{-y/B}, & 0 \le y < B, \\[2mm] \dfrac{1}{e - 1}, & y = K, \\[1mm] 0, & \text{otherwise.} \end{cases} \]

The following theorem amalgamates what we have shown in Sections IV and V.

Theorem 1: When imposing a constraint on the first (or second) moment of the adversary's strategy, the best possible worst-case expected CR that can be achieved is $1 + \frac{\mu_1}{2B(e - 2)}$ (or $1 + \frac{\mu_2}{3B^2(2e - 5)}$), for $\mu_1 \le \frac{2B(e - 2)}{e - 1}$ (or $\mu_2 \le \frac{3B^2(2e - 5)}{e - 1}$). In both cases, the achieved CR outperforms that of DET and PROB.

Proof: The proof follows from Von Neumann's minimax theorem. In Section IV, we have derived the optimal strategy of the designer under the two different constraints. Their corresponding performance was shown in (15) and (18). Further, we have derived the optimal adversarial strategy in (23) under the first-moment constraint, which achieves (15). By (5), we conclude that the derived optimal strategies will yield the best worst-case expected CR. Also, Lemmas 1 and 2 and the subsequent arguments show that the obtained expected CR outperforms that of DET and PROB.

VI. NUMERICAL RESULTS

We now present a detailed evaluation of our scheme using simulations as well as synthetic and real-world file system workloads.

A. Simulation Results

The simulations are based directly on our analysis in the previous sections. Theorem 1 states that our approach guarantees the best possible worst-case expected CR. However, by (20), we conclude that when the arrivals distribution is not selected optimally, we obtain a lower bound on the worst-case expected CR. Here, we simulate this phenomenon using three distributions: uniform, exponential, and log-normal. From Fig. 3, we notice that 1-PROB and 2-PROB exhibit similar performance. Hence, we will only show simulations for 1-PROB in the rest of this section.

Fig. 4 plots $E_q[c]$ for different values of $\mu_1$ when the arrivals are uniformly distributed. We see that DET exhibits optimal performance for small $\mu_1$; this is because the arrivals distribution in this case falls entirely in the interval where the CR of DET is $1$. However, as the value of $\mu_1$ increases, the CR of DET becomes $2$, and our approach eventually outperforms DET. Further, our approach outperforms PROB for small values of $\mu_1$. Fig. 5 depicts the same simulation but for an exponential distribution and $B = 5$.
We can again see that 1-PROB outperforms PROB for small values of $\mu_1$. We find that DET outperforms our approach for small values of $\mu_1$. This is to be expected since the exponential distribution places most of its weight on the interval $[0, B]$, over which DET has a CR of $1$. Note, however, that as $\mu_1$ increases, the exponential distribution places more weight outside $[0, B]$ and the performance of DET worsens.

[Fig. 4. Expected CR $E_q[c]$ of DET, PROB, and 1-PROB for uniformly distributed arrivals.]

[Fig. 5. Expected CR $E_q[c]$ of DET, PROB, and 1-PROB for exponentially distributed arrivals. $B = 5$.]

Finally, Fig. 6 simulates a log-normal arrival distribution with a standard deviation of .5. Similar to the above two cases, 1-PROB dominates DET as the value of $\mu_1$ increases. From the above three experiments, we conclude that our scheme exhibits an intermediate performance between DET and PROB: it outperforms PROB for small values of $\mu_1$ and outperforms DET for large values.

[Fig. 6. Expected CR $E_q[c]$ of DET, PROB, and 1-PROB for log-normally distributed arrivals. $B = 5$.]
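The kind of averaging behind these plots can be sketched as follows. This is not the authors' simulation code: it assumes arrivals uniform on $[0, 2\mu]$ (so that the mean is $\mu$), uses the closed-form CR curves implied by (7) and (8) for the first-moment-constrained policy with $K = B$, and picks $B = 10$ purely for illustration.

```python
import math

E = math.e

def cr_det(y, B):
    """CR of the deterministic rule: 1 before time B, 2 afterwards."""
    return 1.0 if y < B else 2.0

def cr_prob(y, B):
    """CR of the classical randomized rule, constant in y."""
    return E / (E - 1.0)

def cr_first_moment(y, B):
    """LHS of (7) for y <= B and LHS of (8) with K = B for y > B, under density (16)."""
    if y <= B:
        return 1.0 + y / (2.0 * B * (E - 2.0))
    return (2.0 * E - 3.0) / (2.0 * (E - 2.0))

def expected_cr_uniform(cr, mu, B, n=100000):
    """Average of cr over arrivals uniform on [0, 2*mu], using a midpoint grid."""
    h = 2.0 * mu / n
    return sum(cr((k + 0.5) * h, B) for k in range(n)) / n

B = 10.0
table = {mu: tuple(expected_cr_uniform(f, mu, B)
                   for f in (cr_det, cr_prob, cr_first_moment))
         for mu in (2.0, 9.0)}
```

Each entry of `table` holds the expected CR of the deterministic, randomized, and first-moment-constrained policies for one value of the mean. For a small mean the deterministic rule attains an expected CR of 1, and the constrained policy stays below the classical randomized one.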
B. Cloud File System Based Evaluation

We evaluate a scenario in which a file system is running on the cloud, and it is using two types of storage resources: a disk (akin to Amazon S3) and a cache (Amazon EC2 VM instance memory). The disk I/O is costlier than the cache I/O, while the cache storage cost is more expensive than that of the disk. We set $B = 10$ to indicate that the disk I/O per block is ten times costlier than storing the block for one unit of time (sec). In this file system, the requests to the file arrive at various points in time, and the server chooses between storing the response in the cache and fetching it fresh from the disk based on DET, PROB, or 1-PROB. Finally, note that while our analysis focused on the CR, our evaluation using file system traces focuses on the total cost.

Synthetic Workloads. We generated two types of synthetic workloads based on the inter-arrival times of the requests. In Fig. 7, we adopt fixed inter-arrival times and measure the cost of running the file system on the cloud for different arrival values. In Fig. 8, we repeat the same experiment with exponentially distributed inter-arrival times and measure the average cost for different mean values. The figures show that DET is the best algorithm for arrivals occurring before time $B$, while PROB is the best for arrivals after time $B$. They also highlight that 1-PROB is quite robust; it attempts to approximate the best of the two schemes for different arrival values.

[Fig. 7. Cost (in millions) of DET, PROB, and 1-PROB for arrivals at fixed intervals, versus the inter-arrival time (secs).]

[Fig. 8. Average cost (in millions) of DET, PROB, and 1-PROB for exponentially distributed inter-arrival times, versus the mean inter-arrival time (secs).]

Real Workload. We used the publicly released workloads of a cloud file system from a prior study by Narayanan et al. [9], [10]. There are traces from 36 file systems hosted in an enterprise data center at Microsoft. Each trace contains the arrival requests for disk blocks for a week during February 2008. Using these traces, we ran a cloud file system using ElastiCache and S3 with the pricing values as shown in Table I.
For 4 KB as the file system block size, the cost values in Table I will set $B$ to 47 hours. We ran all the file system traces with $B = 47$ and computed the total cost of all the file systems. Due to space limitation, we only present a summary of the results without showing any graphs. The total costs of running these file system traces are: \$59 for DET, \$345 for PROB, and \$4 for 1-PROB. Again, 1-PROB provides a robust performance as it reduces the difference between the cheapest and the costliest schemes by nearly 6%.

VII. CONCLUSION

Many cloud cost optimization problems can be abstracted as a Ski-Rental problem. Existing research has developed deterministic and probabilistic algorithms to tackle the Ski-Rental problem. The Ski-Rental problem is usually studied without exploiting common information about the application workload. In this paper, we introduced a new variant of the problem called the Constrained Ski-Rental problem, which assumes that the first (or second) moment of the arrivals distribution is known to the algorithm designer. We demonstrated that using this limited information can lead to a class of randomized algorithms that provide the best arrivals-distribution-free performance guarantees, because they outperform existing approaches in the worst-case expected CR sense. By applying the proposed scheme to cloud file systems, we have shown that it can lead to significant cost savings. Because of the growing importance of optimizing the costs in the cloud, we believe many other applications can benefit from our findings.

REFERENCES

[1] K. P. Puttaswamy, T. Nandagopal, and M. Kodialam, "Frugal storage for cloud file systems," in Proc. ACM European Conf. Computer Systems, 2012, pp. 71–84.
[2] A. Kathpal, M. Kulkarni, and A. Bakre, "Analyzing compute vs. storage tradeoff for video-aware storage efficiency," in Proc. Workshop on Hot Topics in Storage and File Systems, 2012.
[3] A. R. Karlin, M. S. Manasse, L. Rudolph, and D. D. Sleator, "Competitive snoopy caching," in Proc. Symp. Foundations of Computer Science, October 1986, pp. 244–254.
[4] H. Fujiwara and K. Iwama, "Average-case competitive analyses for Ski-Rental problems," Algorithmica, vol. 42, no. 1, pp. 95–107, 2005.
[5] Z. Lotker, B. Patt-Shamir, and D. Rawitz, "Rent, lease or buy: Randomized algorithms for multislope ski rental," arxiv:9.35, 2008.
[6] R. Motwani and P. Raghavan, Randomized Algorithms. New York, NY, USA: Cambridge University Press, 1995.
[7] A. R. Karlin, M. S. Manasse, L. A. McGeoch, and S. Owicki, "Competitive randomized algorithms for non-uniform problems," in Proc. ACM-SIAM Symp. Discrete Algorithms, 1990, pp. 301–309.
[8] T. Başar and G. J. Olsder, Dynamic Noncooperative Game Theory. SIAM Series in Classics in Applied Mathematics, 1999.
[9] D. Narayanan, A. Donnelly, and A. Rowstron, "Write off-loading: practical power management for enterprise storage," in Proc. USENIX Conf. File and Storage Technologies, 2008, pp. 256–267.
[10] MSR Cambridge Traces, http://iotta.snia.org/tracs/388.