DynamicLoadBalancing ExploitingProcessLifetimeDistributionsfor MorHarchol-Balter and AllenB.Downey tionwhetherpreemptivemigration(migratingactiveprocesses)isnecessary,orwhetherremote WeconsiderpoliciesforCPUloadbalancinginnetworksofworkstations.Weaddresstheques- UniversityofCalifornia,Berkeley execution(migratingprocessesonlyatthetimeofbirth)issucientforloadbalancing.weshow thatresolvingthisisssueisstronglytiedtounderstandingtheprocesslifetimedistribution.our measurementsindicatethatthedistributionoflifetimesforunixprocessispareto(heavy-tailed), withaconsistentfunctionalformoveravarietyofworkloads.weshowhowtoapplythisdistributiontoderiveapreemptivemigrationpolicythatrequiresnohand-tunedparameters.weusea trace-drivensimulationtoshowthatourpreemptivemigrationstrategyisfarmoreeectivethan remoteexecution,evenwhenthememorytransfercostishigh. CategoriesandSubjectDescriptors:unknown[unknown]:unknown unknown bution workloadmodeling,trace-drivensimulation,networkofworkstations,heavy-tailed,paretodistri- GeneralTerms:unknown AdditionalKeyWordsandPhrases:Loadbalancing,loadsharing,migration,remoteexecution, 1.INTRODUCTION Mostsystemsthatperformloadbalancinguseremoteexecution(i.e.non-preemptive migration)basedonaprioriknowledgeofprocessbehavior,oftenintheformofa listofprocessnameseligibleformigration.althoughsomesystemsarecapableof NSFgrantnumberCCR-9201092.AllenDowneypartiallysupportedbyNSF(DARA)grant MorHarchol-BaltersupportedbyNationalPhysicalScienceConsortium(NPSC)Fellowshipand suchaspreservingautonomy.apreviousanalyticstudybyeageretal.discourages migratingactiveprocesses,mostdosoonlyforreasonsotherthanloadbalancing, Address:fharchol,downeyg@cs.berkeley.edu,UniversityofCalifornia,Berkeley,CA94720. DMW-8919074. AnearlierversionofthispaperappearedintheProceedingsoftheACMSigmetricsConference Permissiontomakedigitalorhardcopiesofpartorallofthisworkforpersonalorclassroomuseis onmeasurementandmodelingofcomputersystems(may23-26,1996)pp.13-24. grantedwithoutfeeprovidedthatcopiesarenotmadeordistributedforprotordirectcommercial advantageandthatcopiesshowthisnoticeontherstpageorinitialscreenofadisplayalong withthefullcitation.copyrightsforcomponentsofthisworkownedbyothersthanacmmust servers,toredistributetolists,ortouseanycomponentofthisworkinotherworks,requiresprior behonored.abstractingwithcreditispermitted.tocopyotherwise,torepublish,toposton specicpermissionand/orafee.permissionsmayberequestedfrompublicationsdept,acm Inc.,1515Broadway,NewYork,NY10036USA,fax+1(212)869-0481,orpermissions@acm.org.
2implementingpreemptivemigrationforloadbalancing,showingthattheadditional performancebenetofpreemptivemigrationissmallcomparedwiththebenetof M.Harchol-BalterandA.B.Downey simplenon-preemptivemigrationschemes[eageretal.1988].butsimulationstudies,whichcanusemorerealisticworkloaddescriptions,andimplementedsystems haveshowngreaterbenetsforpreemptivemigration[kruegerandlivny1988] 1.1Loadbalancingtaxonomy andatrace-drivensimulationtoinvestigatetheseconictingresults. Onanetworkofsharedprocessors,CPUloadbalancingistheideaofmigratingprocessesacrossthenetworkfromhostswithhighloadstohostswithlowerloads.The andimprovetheutilizationoftheprocessors.analyticmodelsandsimulationstud- motivationforloadbalancingistoreducetheaveragecompletiontimeofprocesses ieshavedemonstratedtheperformancebenetsofloadbalancing,andtheseresults havebeenconrmedinexistingdistributedsystems(seesection1.4). determineswhenmigrationsoccurandwhichprocessesaremigrated.thisisthe questionweaddressinthispaper.theotherhalfofaloadbalancingstrategyis thelocationpolicy theselectionanewhostforthemigratedprocess.previous workhassuggestedthatsimplychoosingthetargethostwiththeshortestcpurun Animportantpartoftheloadbalancingstrategyisthemigrationpolicy,which [Baraketal.1993].Thispaperusesameasureddistributionofprocesslifetimes execution,alsocallednon-preemptivemigration,inwhichsomenewprocessesare relativeunimportanceoflocationpolicy. queueisbothsimpleandeective[zhou1987][kunz1991].ourworkconrmsthe (possiblyautomatically)executedonremotehosts,andpreemptivemigration,in whichrunningprocessesmaybesuspended,movedtoaremotehost,andrestarted. Innon-preemptivemigrationonlynewbornprocessesaremigrated. Processmigrationforpurposesofloadbalancingcomesintwoforms:remote Implicitmigrationpoliciesmayormaynotuseaprioriinformationaboutthe functionofprocesses,howlongtheywillrun,etc. Loadbalancingmaybedoneexplicitly(bytheuser)orimplicitly(bythesystem). mationaboutjoblifetimes.thisinformationisoftenimplementedasaneligibility lifetimeofprocesses,implicitnon-preemptivepoliciesrequiresomeaprioriinfor- listthatspeciesbyprocessnamewhichprocessesareworthmigrating[svensson 1990][Zhouetal.1993]. Sincethecostofremoteexecutionisusuallysignicantrelativetotheaverage sincethisisoftendiculttomaintainandpreemptivestrategiescanperformwell withoutit.thesesystemsuseonlysystem-visibledatalikethecurrentageofeach processoritsmemorysize. Incontrast,mostpreemptivemigrationpoliciesdonotuseaprioriinformation, balancingstrategiesthatassumenoaprioriinformationaboutprocesses. (1)Ispreemptivemigrationworthwhile,giventheadditionalcost(CPUandlatency)associatedwithmigratinganactiveprocess? Thispaperexaminestheperformancebenetsofpreemptive,implicitload Weanswerthefollowingtwoquestions: (2)Whichactiveprocesses,ifany,areworthmigrating?
1.2Processmodel Inourmodel,processesusetworesources:CPUandmemory(wedonotconsider ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 3 scheduling;insection7wediscusstheeectofotherlocalschedulingpolicies.since completion).weassumethatprocessorsimplementtime-sharingwithround-robin I/O).Thus,weuse\age"tomeanCPUage(theCPUtimeaprocesshasused imposedonaprocessis processesmaybedelayedwhileontherunqueueorwhilemigrating,theslowdown thusfar)and\lifetime"tomeancpulifetime(thetotalcputimefromstartto wherewalltimeisthetotaltimeaprocessspendsrunning,waitinginqueue,or migrating. Slowdownofprocessp=wall-time(p) 1.3Outline CPU-time(p) Theeectivenessofloadbalancing eitherbyremoteexecutionorpreemptive balancingstrategies. trace-drivensimulationtoevaluatetheimpactofthisworkloadonproposedload observationsabouttheworkloadonanetworkofunixworkstations,andusesa migration dependsstronglyonthenatureoftheworkload,includingthedistributionofprocesslifetimesandthearrivalprocess.thispaperpresentsempiricabutionispredictablewithgoodnessoftgreaterthan99%andconsistentacross avarietyofmachinesandworkloads.asaruleofthumb,theprobabilitythata workloadsinanacademicenvironment,includinginstructionalmachines,research machines,andmachinesusedforsystemadministration.wendthatthedistri- Section2presentsastudyofthedistributionofprocesslifetimesforavarietyof processwithcpuageofonesecondusesmorethantsecondsoftotalcputime is1=t(seefigure1). 1986],butthispriorworkhasbeenincorporatedinfewsubsequentanalyticand simulatorstudiesofloadbalancing.thisomissionisunfortunate,sincetheresults ofthesearesensitivetothelifetimemodel(seesection2.2). OurmeasurementsareconsistentwiththoseofLelandandOtt[LelandandOtt loadbalancing: Theysuggestthatitispreferabletomigrateolderprocessesbecausetheseprocesseshaveahigherprobabilityoflivinglongenough(eventuallyusingenough CPU)toamortizetheirmigrationcost. eligibilityofaprocessformigrationasafunctionofitscurrentage,migration Ourobservationsoflifetimedistributionshavethefollowingconsequencesfor Afunctionalmodelofthedistributionprovidesananalytictoolforderivingthe withoutmigration.accordingtothiscriterion,aprocessiseligibleformigration slowdownimposedonamigrantprocessislowerinexpectationthanitwouldbe cost,andtheloadsatitssourceandtargethost. onlyifits InSection3wederiveamigrationeligibilitycriterionthatguaranteesthatthe CPUage>1 n?mmigrationcost
4wheren(respectivelym)isthenumberofprocessesatthesource(target)host. InSection5weuseatrace-drivensimulationtocompareourpreemptivemigrationpolicywithanon-preemptivepolicybasedonname-lists.Thesimulatoruses starttimesanddurationsfromtracesofarealsystem,andmigrationcostschosen fromameasureddistribution. M.Harchol-BalterandA.B.Downey evenwithsurprisinglylargemigrationcosts,despiteseveralconservativeassumptionsthatgivenon-preemptivemigrationanunfairadvantage. migrationstrategiesinmoredetail.wendthatpreemptivemigrationreduces migration.wealsoproposeseveralalternativemetricsintendedtomeasureusers' themeandelay(queueingandmigration)by35{50%,comparedtonon-preemptive Nextwechooseaspecicmodelofpreemptiveandnon-preemptivemigration migrationcostontherelativeperformanceofthetwostrategies.notsurprisingly, Nevertheless,preemptivemigrationperformsbetterthannon-preemptivemigration wendthatasthecostofpreemptivemigrationincreases,itbecomeslesseective. Weusethesimulatortorunthreeexperiments:rstweevaluatetheeectof costsbasedonrealsystems(seesection4),andusethismodeltocomparethetwo perceptionofsystemperformance.bythesemetrics,theadditionalbenetsof preemptivemigrationcomparedtonon-preemptivemigrationappearevenmore moreeectivethanevenawell-tunednon-preemptivemigrationpolicy.insection5.5weusethesimulatortocompareourpreemptivemigrationstrategywith previouslyproposedpreemptivestrategies. InSection5.4wediscussindetailwhyasimplepreemptivemigrationpolicyis signicant. 1.4Relatedwork insection7andconclusionsinsection8. WenishwithacriticismofourmodelinSection6,adiscussionoffuturework jobs,fewhaveimplementedimplicitloadbalancingpolicies.mostsystemsonly allowforexplicitloadbalancing.thatis,thereisnoloadbalancingpolicy;theuser decideswhichprocessestomigrate,andwhen.examplesincludeaccent[zayas 1987],Locus[Thiel1991],Utopia[Zhouetal.1993],DEMOS/MP[Powelland 1.4.1Systems.Althoughseveralsystemshavethemechanismtomigrateactive preemptivepolicies(activeprocessesareonlymigratedforpurposesotherthanload [DePaoliandGoscinski1995],andMIST[Casasetal.1995]. Miller1983],V[Theimeretal.1985],NEST[AgrawalandEzzet1987],RHODOS [Tanenbaumetal.1990],Charlotte[ArtsyandFinkel1989],Sprite[Douglisand balancing,suchaspreservingworkstationautonomy).examplesincludeamoeba Ousterhout1991],Condor[Litzkowetal.1988],andMach[Milojicic1993].In Afewsystemshaveimplicitloadbalancingpolicies,howevertheyarestrictlynon- aboutprocesses;e.g.,explicitknowledgeabouttheruntimesofprocessesoruserprovidedlistsofmigratableprocesses[agrawalandezzet1987][litzkowandlivny 1990][DouglisandOusterhout1991][Zhouetal.1993]. general,non-preemptiveloadbalancingpoliciesdependonaprioriinformation thattheirschemeiseectiveandrobust. loadbalancingismosix[baraketal.1993].ourresultssupportthemosixclaim Oneexistingsystemthathasimplementedimplementedautomatedpreemptive
ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 5 1.4.2Studies.Althoughfewsystemsincorporatemigrationpolicies,therehave beenmanysimulationandanalyticalstudiesofvariousmigrationpolicies.most ofthesestudieshavefocusedonloadbalancingbyremoteexecution[livnyand Melman1982][WangandMorris1985][CasavantandKuhl1987][Zhou1987][Pulidasetal.1988][Kunz1991][BonomiandKumar1990][EvansandButt1993][Lin andraghavendra1993][mirchandaneyetal.1990][zhangetal.1995][zhouand Ferrari1987][HacandJin1990][Eageretal.1986]. Onlyafewstudiesaddresspreemptivemigrationpolicies[LelandandOtt1986] [KruegerandLivny1988].TheLelandandOttmigrationpolicyisalsoagebased, butdoesn'ttakemigrationcostintoaccount. Eageret.al.,[Eageretal.1988],concludethattheadditionalperformancebenet ofpreemptivemigrationistoosmallcomparedwiththebenetofnon-preemptive migrationtomakepreemptivemigrationworthwhile.thisresulthasbeenwidely cited,andinseveralcasesusedtojustifythedecisionnottoimplementpreemptive migration,asintheutopiasystem,[zhouetal.1993].ourworkdiersfrom [Eageretal.1988]inbothsystemmodelandworkloaddescription.[Eageretal. 1988]modelaserverfarminwhichincomingjobshavenoanityforaparticular processor,andthusthecostofinitialplacement(remoteexecution)isfree.thisis dierentfromourmodel,anetworkofworkstations,inwhichincomingjobsarrive ataparticularhostandthecostofmovingthemaway,evenbyremoteexecution, issignicantcomparedtomostprocesslifetimes.also,[eageretal.1988]usea degeneratehyperexponentialdistributionoflifetimesthatincludesfewjobswith non-zerolifetimes.whenthecoecientofvariationofthisdistributionmatches thedistributionsweobserved,fewerthan4%ofthesimulatedprocesseshavenonzerolifetimes.withsofewjobs(andbalancedinitialplacement)thereisseldom anyloadimbalanceinthesystem,andthuslittlebenetforpreemptivemigration. Furthermore,the[Eageretal.1988]processlifetimedistributionisexponentialfor jobswithnon-zerolifetimes,theconsequencesofwhichwediscussinsection2.2. Foramoredetailedexplanationofthisdistributionanditseectonthestudy,see [DowneyandHarchol-Balter1995]. KruegerandLivnyinvestigatethebenetsofsupplementingnon-preemptivemigrationwithpreemptivemigrationandndthatpreemptivemigrationisworthwhile.Theyuseahyperexponentiallifetimedistributionthatapproximatesclosely thedistributionweobserved;asaresult,theirndingsarelargelyinaccordwith ours.onedierencebetweentheirworkandoursisthattheyusedasynthetic workloadwithpoissonarrivals.theworkloadweobserved,andusedinourtracedrivensimulations,exhibitsserialcorrelation;i.e.itismoreburstythanapoisson process.anotherdierenceisthattheirmigrationpolicyrequiresseveralhandtunedparameters.insection3.1weshowhowtousethedistributionoflifetimes toeliminatetheseparameters. Likeus,BryantandFinkeldiscussthedistributionofprocesslifetimesandits eectonpreemptivemigrationpolicy,buttheirhypotheticaldistributionsarenot basedonsystemmeasurements[bryantandfinkel1981].alsolikeus,theychoose migrantprocessesonthebasisofexpectedslowdownonthesourceandtargethosts, buttheirestimationofthoseslowdownsisverydierentfromours.inparticular, theyusethedistributionofprocesslifetimestopredictahost'sfutureloadasa functionofitscurrentloadandtheagesoftheprocessesrunningthere.wehave
6examinedthisissueandfound(1)thatthismodelfailstopredictfutureloads becauseitignoresfuturearrivals,and(2)thatcurrentloadisthebestpredictorof M.Harchol-BalterandA.B.Downey futureload(seesection3.1).thus,inourestimatesofslowdown,weassumethat thefutureloadonahostisequaltothecurrentload. 2.DISTRIBUTIONOFLIFETIMES Thegeneralshapeofthedistributionofprocesslifetimesinanacademicenvironmenthasbeenknownforalongtime[Rosin1965]:therearemanyshortjobsand afewlongjobs,andthevarianceofthedistributionisgreaterthanthatofan theircurrentage[cabrera1986].thatsameyear,lelandandottproposeda functionalformfortheprocesslifetimedistribution,basedonmeasurementsofthe lifetimesof9.5millionunixprocessesbetween1984and1985[lelandandott 1986].TheyconcludethatprocesslifetimeshaveaUBNE(used-better-than-new- In1986,CabrerameasuredUNIXprocessesandfoundthatover40%doubled exponentialdistribution. aprocess,thegreateritsexpectedremainingcpulifetime.specically,theynd in-expectation)typeofdistribution.thatis,thegreaterthecurrentcpuageof \longprocesseshaveexponentialservicetimes."manysubsequentstudiesassume anexponentiallifetimedistribution. thatfort>3seconds,theprobabilityofaprocess'slifetimeexceedingtseconds isrtk,where?1:25<k<?1:05andrnormalizesthedistribution. Incontrast,Rommel[Rommel1991]claimsthathismeasurementsshowthat functionalformproposedbylelandandotttstheobserveddistributionswell,for processeswithlifetimesgreaterthan1second.thisfunctionalformisconsistent policies,weperformedanindependentstudyofthisdistribution,andfoundthatthe acrossavarietyofmachinesandworkloads,andalthoughtheparameter,k,varies Becauseoftheimportanceoftheprocesslifetimedistributionforloadbalancing from-1.3to-.8,itisgenerallynear-1.thus,asaruleofthumb, Theprobabilitythataprocesswithage1secondusesatleastTsecondsoftotal TheprobabilitythataprocesswithageTsecondsusesatleastanadditional TsecondsofCPUtimeisabout1=2.Thus,themedianremaininglifetimeofa processisequaltoitscurrentage. CPUtimeisabout1=T. tionpolicies. served.section2.2discussesothermodelsforthedistributionoflifetimes,and arguesthattheparticularshapeofthisdistributioniscriticalforevaluatingmigra- Section2.1describesourmeasurementsandthedistributionoflifetimesweob- 2.1Lifetimedistributionwhenlifetime>1s TodeterminethedistributionoflifetimesforUNIXprocesses,wemeasuredthe lifetimesofoveronemillionprocesses,generatedfromavarietyofacademicworkloads,includinginstructionalmachines,researchmachines,andmachinesusedfor systemadministration.weobtainedourdatausingtheunixcommandlastcomm, showsonlyprocesseswhoselifetimesexceedonesecond.thedotted(heavy)line whichoutputsthecputimeusedbyeachcompletedprocess. Figure1showsthedistributionoflifetimesfromoneofthemachines.Theplot
ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 7 1.0 Distribution of process lifetimes (fraction of processes with duration > T) 0.8 0.6 0.4 0.2 0.0 1 2 4 8 16 32 64 Duration (T secs.) 1 Distribution of process lifetimes (log plot) (fraction of processes with duration > T) 1/2 1/4 1/8 1/16 onmachinepomid-semester.thedotted(thicker)lineshowsthemeasureddistribution;thesolid Fig.1.(a)Distributionoflifetimesforprocesseswithlifetimesgreaterthan1second,observed 1/32 (thinner)lineshowstheleastsquarescurvet.(b)thesamedistributiononalog-logscale.the straightlineinlog-logspaceindicatesthatthedistributioncanbemodeledbytk,wherekisthe 1/64 slopeoftheline. 1 2 4 8 16 32 64 Duration (T secs.)
8showsthemeasureddistribution;thesolid(thinner)lineshowstheleast-squarest tothedatausingtheproposedfunctionalformprflifetime>tg=tk. M.Harchol-BalterandA.B.Downey well.incontrast,figure2showsthatitisimpossibletondanexponentialcurve thattsthedistributionoflifetimesweobserved. Byvisualinspection,itisclearthattheproposedmodeltstheobserveddata oftheformtk,withkvaryingfrom?1:3to?0:8fordierentmachines.table1 showsthevalueoftheparameterforeachmachinewestudied,estimatedbyan iterativelyweightedleast-squarest(withnointercept,inaccordancewiththe functionalmodel).wecalculatedtheseestimateswiththeblsscommandrobust Forallthemachineswestudied,thedistributionofprocesslifetimestsacurve shownhereindicatethatthettedcurveaccountsforgreaterthan99%ofthe ofcertainty).ther2valueindicatesthegoodnessoftofthemodel thevalues thatparameter(alloftheseparametersarestatisticallysignicantatahighdegree [AbrahamsandRizzardi1988]. variationoftheobservedvalues.thus,thegoodnessoftofthesemodelsishigh Thestandarderrorassociatedwitheachestimategivesacondenceintervalfor andconditionaldistributionfunctionforprocesslifetimes.thesecondcolumn (foranexplanationofr2values,see[larsenandmarx1986]). showsthesefunctionswhenk=?1,whichwewillassumeforouranalysisin Section3. Table2showsthecumulativedistributionfunction,probabilitydensityfunction, thatitsmoments(mean,variance,etc.)areinnite.ofcourse,sincetheobserved distributions,though,becausecalculatedmomentstendtobedominatedbyafew distributionshavenitesamplesize,theyhavenitemean(0.4seconds)andcoecientofvariation(5{7).onemustbecautiouswhensummarizinglong-tailed Thefunctionalformweareproposing(thetteddistribution)hastheproperty likethemedian,ortheestimatedparameterk)tosummarizedistributions,rather outliers.inouranalysesweusemorerobustsummarystatistics(orderstatistics thanmoments. second,wedidnotndaconsistentfunctionalform;however,forthemachines westudiedtheseprocesseshadanevenlowerhazardratethanthoseofage>1 second.thatis,theprobabilitythataprocessofaget<1secondslivesanothert secondsisalwaysgreaterthan1=2.thusforjobswithlifetimeslessthan1second, 2.1.1Processwithlifetime<1second.Forprocesseswithlifetimeslessthan1 Manypriorstudiesofprocessmigrationassumeanexponentialdistributionofprocesslifetimes,bothinanalyticalpapers[LinandRaghavendra1993][Mirchandaney andmelman1982][zhangetal.1995][chowdhury1990].thereasonsforthisas- 1991][Pulidasetal.1988][WangandMorris1985][EvansandButt1993][Livny willnotaecttheresultsofloadbalancingstudies. lifetimedistributionisinfactnotexponential,assuminganexponentialdistribution sumptioninclude:(1)analytictractability,and(2)thebeliefthateveniftheactual 2.2Whythedistributioniscritical themedianremaininglifetimeisgreaterthanthecurrentage. etal.1990][eageretal.1986][ahmadetal.1991]andinsimulationstudies[kunz Regardingtherstpoint,althoughthefunctionalformthatweandLelandand
ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 9 NameNumberNumberEstimated of Host po1 77440 Procs>1sec of 4107 Distrib. T?0:97 Error.0160.997 Std R2 po2 po3 cory pors 154368 111997 182523 11468 14253 7524 T?1:22 T?1:27 T?0:88.0120.999 bugs 141950 83600 10402 4940 T?0:94 T?0:82.0210.997.0300.982.0150.997 Table1.Theestimatedlifetimedistributionforeachmachinemeasured,andtheassociated faith 76507 3328 T?0:78.0070.999 goodnessoftstatistics.descriptionofmachines:poisaheavily-useddecserver5000/240,.0450.964 usedprimarilyforundergraduatecoursework.po1,po2,andpo3refertomeasurementsmade onpomid-semester,late-semester,andend-semester.coryisaheavily-usedmachine,usedfor courseworkandresearch.porscheisalessfrequently-usedmachine,usedprimarilyforresearch onscienticcomputing.bugsisaheavily-usedmachine,usedprimarilyformultimediaresearch. Faithisaninfrequently-usedmachine,usedbothforvideoapplicationsandsystemadministration. 1 Observed distribution and two curve fits (fraction of processes with duration > T) 1/2 1/4 k T fit (one parameter) 1/8 exponential fit (two parameters) 1/16 Fig.2.Inlog-logspace,thisplotshowsthedistributionoflifetimesforthe13000processesfrom ourtraceswithlifetimesgreaterthansecond,andtwoattemptstotacurvetothisdata.one 1/32 ofthetsisbasedonthemodelproposedinthispaper,tk.theothertisanexponential curve,ce?t.althoughtheexponentialcurveisgiventhebenetofanextrafreeparameter, 1/64 iteratively-weightedleastsquares. itfailstomodeltheobserveddata.theproposedmodeltswell.bothtswereperformedby 1 2 4 8 16 32 Duration (T secs.)
10 M.Harchol-BalterandA.B.Downey Distributionoflifetimesforprocesses>1secWhenk=?1 PrfT<L<T+dTsecg=?kTk?1dT PrfL>bsecjage=ag=?bak PrfL>Tsecg=Tk 1=T2dT distributionfunctionfortheprocesslifetimel.thesecondcolumnshowsthefunctionalformof Table2.Thecumulativedistributionfunction,probabilitydensityfunction,andconditional eachforthetypicalvaluek=1. a=b tion,itneverthelesslendsitselftosomeformsofanalysis,asweshowinsection3.1. Ottproposecannotbeusedinqueueingmodelsaseasilyasanexponentialdistribu- migrationpolicydependsonhowtheexpectedremaininglifetimeofajobvaries isimportanttomodelthisdistributionaccurately.specically,thechoiceofa distributionaectstheperformanceofmigrationpolicies,andthereforethatit withage.inourobservationswefoundadistributionwiththeubneproperty Regardingthesecondpoint,wearguethattheparticularshapeofthelifetime theexpectedremaininglifetimeofajobincreaseslinearlywithage.asaresult,we choseamigrationpolicythatmigratesonlyoldjobs. processanditsremaininglifetime.forexample,auniformdistributionhasthe NBUEproperty theexpectedremaininglifetimedecreaseslinearlywithage. choosetomigrateonlyyoungprocesses.inthiscase,weexpectnon-preemptive Thusifthedistributionoflifetimeswereuniform,themigrationpolicyshould Butdierentdistributionsyieldindierentrelationshipsbetweentheageofa migrationtoperformbetterthanpreemptivemigration. withthelowestmigrationcost,regardlessofage. lifetimeofajobisindependentofitsage.inthiscase,sinceallprocesseshavethe sameexpectedlifetimes,themigrationpolicymightchoosetomigratetheprocess Asanotherexample,theexponentialdistributionismemoryless theremaining distribution(auniformdistributioninlog-space)havearemaininglifetimethat increasesuptoapointandthenbeginstodecrease.inthiscase,thebestmigration policymightbetomigratejobsthatareoldenough,butnottooold. Asanalexample,processeswhoselifetimesarechosenfromauniformlog dierentmigrationpolicies.inordertoevaluateaproposedpolicy,itiscritical tochooseadistributionmodelwiththeappropriaterelationshipbetweenexpected remaininglifetimeandage. Thusdierentdistributions,evenwiththesamemeanandvariance,canleadto oflifetimes.thesedistributionsmayormaynothavetherightbehavior,depending onhowaccuratelytheytobserveddistributions.[kruegerandlivny1988]use athree-stagehyperexponentialwithparametersestimatedtotobservedvalues. ThisdistributionhastheappropriateUBNEproperty.Butthetwo-stagehyperexponentialdistribution[Eageretal.1988]useismemoryless;theremaininglifetime ofajobisindependentofitsage(forjobswithnonzerolifetimes).accordingtothis distribution,migrationpolicyisirrelevant;allprocessesareequallygoodcandidates Somestudieshaveusedhyperexponentialdistributionstomodelthedistribution formigration.thisresultisclearlyinconictwithourobservations.
preemptivemigration.theheavytailofourmeasuredlifetimedistributionimplies Assumingthewronglifetimedistributionmayalsounderestimatethebenetsof ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 11 abilitytoidentifythosefewhogs.inalifetimedistributionwithoutsuchaheavy tail,preemptivemigrationmightnotbeaseective. Aswe'lldiscussinSection5.4,partofthepowerofpreemptivemigrationisits thatatinyfractionofthejobsrequiremorecputhanalltheotherjobscombined. 3.MIGRATIONPOLICY Amigrationpolicyisbasedontwodecisions:whentomigrateprocessesandwhich processestomigrate.therstquestionconcernshowoftenoratwhattimesthe Thefocusofthispaperisthesecondquestion,alsoknownastheselectionpolicy: systemchecksforeligiblemigrants.weaddressthisissuebrieyinsection5.1.1. Giventhattheloadatahostistoohigh,howdowechoosewhichprocess tive,migrationtimehasalargeimpactonresponsetime.aprocesswouldchoose lifetimes.themotivationforthisheuristicistwofold.fromtheprocess'sperspec- Ourheuristicistomigrateprocessesthatareexpectedtohavelongremaining tomigrate? tomigrateonlyifthemigrationoverheadcouldbeamortizedoveralongerlifetime. tion),becausetheseprocesseshavenoallocatedmemoryandthustheirmigration Fromtheperspectiveofthesourcehost,ittakesasignicantamountofworkto costislow.theideaofmigratingnewbornprocessesmightalsostemfromthe thatarelikelytobemoreexpensivetorunthantomigrate. packageaprocessformigration.thehostwouldonlychoosetomigrateprocesses processeshaveequalexpectedremaininglifetimesregardlessoftheirage,soone fallacythatprocesslifetimeshaveanexponentialdistribution,implyingthatall Manyexistingmigrationpoliciesonlymigratenewbornprocesses(nopreemp- shouldmigratethecheapestprocesses.theproblemwithonlymigratingnewborn processesisthat,accordingtotheprocesslifetimedistribution,newbornprocesses ourmeasurementsshowthatover70%ofprocesseshavelifetimessmallerthanthe areunlikelytolivelongenoughtojustifythecostofremoteexecution.infact, smallestnon-preemptivemigrationcost(seetable3). Wehavefound,though,thattheabilityofthesystemtopredictprocesslifetimes edgeaboutprocessesandcanselectivelymigrateprocesseslikelytobecpuhogs. bynameislimited(section5.4). Thusanewbornmigrationpolicyisonlyjustiedifthesystemhaspriorknowl- processes. processtorunlongerthanayoungprocess;thus,itispreferabletomigrateold jorityofprocessesareshort,theremightnotbeenougholdprocessestohavea Therearetwopotentialproblemswiththisapproach.First,sincethevastma- Canwedobetter?Thedistributionoflifetimesimpliesthatweexpectandold signicantloadbalancingeect.infact,althoughtherearefewlong-livedprocesses,theyaccountforalargepartofthetotalcpuload.accordingtoour measurements,typicallyfewerthan4%ofprocesseslivelongerthan2seconds,yet thelongtailoftheprocesslifetimedistribution.furthermore,wewillseethatthe abilitytomigrateevenafewlargejobscanhavealargeeectonsystemperfor- theseprocessesmakeupmorethan60%ofthetotalcpuload.thisisdueto
12 mance,sinceasinglelongprocessonabusyhostimposesslowdownsonmanyshort processes. M.Harchol-BalterandA.B.Downey migratingprocesseswithlongerexpectedlives. activeprocessismuchgreaterthanthecostofremoteexecution.ifpreemptive migrationisdonecarelessly,thisadditionalcostmightoverwhelmthebenetof Asecondproblemwithmigratingoldprocessesisthatthemigrationcostforan 3.1OurMigrationPolicy Forthisreason,weproposeastrategythatguaranteesthateverymigrationimprovestheexpectedperformanceofthemigrantprocessandtheotherprocesses atthesourcehost.thisstrategymigratesaprocessonlyifitimprovestheexpectedslowdownoftheprocess,whereslowdownisdenedasinsection1.2.of course,processesonthetargethostareslowedbyanarrivingmigrant,butona moderately-loadedsystemtherearealmostalwaysidlehosts;thusthenumberof processesatthetargethostisusuallyzero.inanycase,thenumberofprocesses atthetargetisalwayslessthanthenumberatthesource. migrationisdone.ifmigrationcostsarehigh,fewprocesseswillbeeligiblefor migration;intheextremetherewillbenomigrationatall.butinnocaseisthe performanceofthesystemworse(inexpectation)thantheperformancewithout migration. Ifthereisnoprocessonthehostthatsatisestheabovemigrationcriterion,no imposedonamigrantprocess,andusethisresulttoderiveaminimumagefor migrationbasedonthecostofmigration.denotingtheageofthemigrantprocess thenumberofprocessesatthesourcehostbyn;andthenumberofprocessesat bya;thecostofmigrationbyc;the(eventualtotal)lifetimeofthemigrantbyl, Usingthedistributionofprocesslifetimes,wecalculatetheexpectedslowdown thetargethost(includingthemigrant)bym,wehave: Efslowdownofmigrantg t=aprlifetimeof migrantistslowdowngiven =Z1 t=aprftl<t+dtjlagna+c+m(t?a) lifetimeistdt =12ca+m+n t=aat2na+c+m(t?a) t t formigrationonlyifitsexpectedslowdownaftermigrationislessthann(whichis theslowdownitexpectsintheabsenceofmigration). Iftherearenprocessesataheavilyloadedhost,thenaprocessshouldbeeligible Thus,werequire12(ca+m+n)<n,whichimplies Wecanextendthisanalysistothecaseofheterogeneousprocessorspeedsby Minimummigrationage=Migrationcost n?m
applyingascalefactortonorm. Thisanalysisassumesthatcurrentloadpredictsfutureload;thatis,thattheload ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 13 atthesourceandtargethostswillbeconstantduringthemigration.inanattempt calculatingapredictionofsurvivorsandfuturearrivalsbasedonthedistribution toevaluatethisassumption,andpossiblyimproveit,weconsideredanumberof oftime),(2)summingtheagesoftheprocessesrunningonthehost,anda(3) predictor,andthatusingseveralpredictivevariablesincombinationdidnotgreatly alternativeloadpredictors,including(1)takingaloadaverage(overaninterval modelproposedhere.wefoundthatcurrent(instantaneous)loadisthebestsingle improvetheaccuracyofprediction.theseresultsareinaccordwithzhou[zhou 3.2PriorPreemptivePolicies Onlyafewpreemptivestrategieshavebeenimplementedinrealsystemsorproposed 1987]andKunz[Kunz1991]. inpriorstudies.thethreethatwehavefoundare,likeours,basedontheprinciple thataprocessshouldbemigratedifitisoldenough. freeparameterwhosevalueischosenwithoutexplanation,andthatwouldneedto bere-tunedforadierentsystemoranotherworkload. UnderLelandandOtt'spolicy,aprocesspiseligibleformigrationif Inmanycases,thedenitionofoldenoughdependsona\voodoo"constant1:a andlivny'spolicy,likeours,takesthejob'smigrationcostintoaccount.aprocess piseligibleformigrationif wherekisafreeparametercalledmincrit[lelandandott1986].krueger age(p)>agesofkyoungerjobsathost buttheydonotexplainhowtheychosethevalue0:1[kruegerandlivny1988]. TheMOSIXpolicyissimilar[Baraketal.1993];aprocessiseligibleformigration if age(p)>0:1migrationcost(p) ofamigrantprocessisnevermorethan2,sinceintheworstcasethemigrant completesimmediatelyuponarrivalatthetarget. Thechoiceoftheconstant(1:0)intheMOSIXpolicyensuresthattheslowdown age(p)>1:0migrationcost(p): theslowdownthatwouldbeimposedatthesourcehostintheabsenceofmigration(presumablythereismorethanoneprocessthere,orthesystemwouldnot beattemptingtomigrateprocessesaway).second,itisbasedontheworst-case Despitethisjustication,thechoiceofthemaximumslowdown(2)isarbitrary. WeexpecttheMOSIXpolicytobetoorestrictive,fortworeasons.First,itignores 1ThistermwascoinedbyProfessorJohnOusterhoutatU.C.Berkeley. onload. bestchoiceforthisparameter,forourworkload,isusuallynear0.4,butitdepends slowdownratherthantheexpectedslowdown.insection5.5,weshowthatthe
14 Migrationcosthassuchalargeeectontheperformanceofpreemptiveloadbalancing;thissectionpresentsthemodelofmigrationcostsweuseinoursimulation Wemodelthecostofmigratinganactiveprocessasthesumofaxedmigration 4.MODELOFMIGRATIONCOSTS M.Harchol-BalterandA.B.Downey proportionaltotheamountoftheprocess'smemorythatmustbetransferred. studies. costformigratingtheprocess'ssystemstateandamemorytransfercostthatis remotehost,logginginorotherwiseauthenticatingtheprocess,andcreatinganew Thecostofremoteexecutionincludessendingthecommandandargumentstothe shellandenvironmentontheremotehost. Wemodelremoteexecutioncostasaxedcost;itisthesameforallprocesses. r:thecostofremoteexecution,inseconds f:thexedcostofpreemptivemigration,inseconds b:thememorytransferbandwidth,inmbpersecond Throughoutthispaper,weusethefollowingnotation: andthus: m:thememorysizeofmigrantprocesses,inmb wherethequotientm=bisthememorytransfercost. costofpreemptivemigration=f+m=b costofremoteexecution=r [DouglisandOusterhout1991]haveanexcellentdiscussionofthisissue,andwe migrationdependsonpropertiesofthedistributedsystem.douglisandousterhout 4.1Memorytransfercosts borrowfromthemhere. Theamountofaprocess'smemorythatmustbetransferredduringpreemptive systemlikesprite,whichintegratesvirtualmemorywithadistributedlesystem, itisonlynecessarytowritedirtypagestothelesystembeforemigration.when thecostofmigrationisproportionaltothesizeoftheresidentsetratherthanthe theprocessisrestartedatthetargethost,itwillretrievethesepages.inthiscase Atthemost,itmightbenecessarytotransferaprocess'sentirememory.Ona sizeofmemory. ferredwhiletheprogramcontinuestorunatthesourcehost.whenthejobstops creased,thedelayimposedonthemigrantprocessisgreatlydecreased.additional dirtyduringtheprecopy.althoughthenumberofpagestransferredmightbein- executionatthesource,itwillhavetotransferagainanypagesthathavebecome Insystemsthatuseprecopying,suchasV[Theimeretal.1985],pagesaretrans- techniquescanreducethecostoftransferringmemoryevenmore[zayas1987]. 4.2Migrationcostsinrealsystems Thespecicparametersofmigrationcostdependnotonlyonthenatureofthe system(asdiscussedabove)butalsoonthespeedofthenetwork.tables3and4 showreportedcostsfromavarietyofrealsystems.laterwewilluseatrace-driven simulatortoevaluatetheeectoftheseparametersonsystemperformance.we
System Sprite ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing Hardware SPARCstation1 0.33sec Costofrexec,r 15 [DouglisandOusterhout1991] GLUNIX [Vahdatetal.1994][Vahdat1995]ATMnetwork MIST [Prouty1996] 10Mb/secEthernet Utopia HPworkstations HP9000/720 10Mb/secEthernet 0.25to0.5sec [Zhouetal.1993] DEC3100andSPARCIPC0.1sec 0.33sec Table3.Costofnon-preemptivemigrationinvarioussystems.Someofthesenumberswere obtainedfrompersonalcommunicationwiththeauthors. System Sprite [DouglisandOusterhout1991] MOSIX 10Mb/secEthernet Hardware SPARCstation1 Fixed 0.33sec Cost,f 2.00sec/MB Inverse [Baraketal.1993][Braverman1995]and48666MHz IntelPentium90MHz0.006sec0.44sec/MB Bandwidth,1=b Table4.Valuesforpreemptivemigrationcostsfromvarioussystems.Manyofthesenumbers wereobtainedfrompersonalcommunicationwiththeauthors.thememorytransfercostisthe MIST [Prouty1996] HP9000/720s 10Mb/secEthernet 0.24sec 0.99sec/MB willmakethepessimisticsimplicationthatamigrant'sentirememorymustbe transferred,although,aspointedoutabove,thisisnotnecessarilythecase. productoftheinversebandwidth,1=b,andtheamountofmemorythatmustbetransferred,m. gration.wecomparetwomigrationstrategies:ourproposedage-basedpreemptive Inthissectionwepresenttheresultsofatrace-drivensimulationofprocessmi- 5.TRACE-DRIVENSIMULATION migrationstrategy(section3.1)andanon-preemptivestrategythatmigratesnewbornprocessesaccordingtotheprocessname(similartostrategiesproposedby [Wangetal.1993]and[Svensson1990]).Withtheintentionofndingaconservativeestimateofthebenetofpreemptivemigration,wegivethename-based usethesimulatortorunthreeexperiments.first,insection5.2,weevaluate thesensitivityofeachstrategytothemigrationcostsr,f,b,andmdiscussed strategythebenetofseveralunrealisticadvantages;forexample,thename-lists arederivedfromthesametracedatausedbythesimulator. insection4.next,insection5.3,wechoosevaluesfortheseparametersthatare representativeofcurrentsystemsandcomparetheperformanceofthetwostrategies Section5.1describesthesimulatorandthetwostrategiesinmoredetail.We indetail.insection5.4wediscusswhythepreemptivepolicyoutperformsthe non-preemptivepolicy.lastly,insection5.5,weevaluatetheanalyticcriterion formigrationageproposedinsection3.1,comparedtocriteriausedinprevious studies.
16 5.1Thesimulator Wehaveimplementedatrace-drivensimulationofanetworkofsixidenticalworkstations.2 M.Harchol-BalterandA.B.Downey eachfrom9:00a.m.to5:00p.m.fromthesixtracesweextractedthestarttimes Weselectedsixdaytimeintervalsfromthetracesonmachinepo(seeSection2.1), andcpudurationsoftheprocesses.wethensimulatedanetworkwhereeachof thedaytimetraces. sixhostsexecutes(concurrentlywiththeothers)theprocessarrivalsfromoneof fractionofprocesses(0:1%)arrivetondallhostsbusy.inordertoevaluatethe activityduringtheeight-hourtrace.formostofthetraces,everyarrivingprocess mixanddistributionoflifetimes,thereisconsiderablevariationinthelevelof ndsatleastoneidlehostinthesystem,butinthetwobusiesttraces,asmall Althoughtheworkloadsonthesixhostsarehomogeneousintermsofthejob lowesttohighestload.run0hasatotalof15000processessubmittedtothesix intervals.werefertotheseasruns0through7,wheretherunsaresortedfrom simulatedhosts;run7has30000processes.theaveragedurationofprocesses(for eectofchangesinsystemload,wedividedtheeight-hourtraceintoeightone-hour allruns)is0.4seconds.thusthetotalutilizationofthesystem,,isbetween0.27 and0.54. givenrunandagivenhost,theserialcorrelationininterarrivaltimesistypically between.08and.24,whichissignicantlyhigherthanonewouldexpectfroma Poissonprocess(uncorrelatedinterarrivaltimesyieldaserialcorrelationof0.0; perfectcorrelationis1.0). ThebirthprocessofjobsatourhostsisburstierthanaPoissonprocess.Fora randomlyfromameasureddistribution(seesection5.2).thissimplicationobliteratesanycorrelationsbetweenmemorysizeandotherprocesscharacteristics,but thememorysizeofeachprocess,whichdeterminesitsmigrationcost,ischosen itallowsustocontrolthemeanmemorysizeasaparameterandexamineitseect Althoughthestarttimesanddurationsoftheprocessescomefromtracedata, onsystemperformance. areneverblockedoni/o.duringagiventimeinterval,wedividecputimeequally amongtheprocessesonthehost(processorsharing). Inoursystemmodel,weassumethatprocessesarealwaysreadytorun;i.e.they migrationtothesourcehost.thissimplicationisconservativeinthesensethat thetransferredpages,partintransitinthenetwork,andpartonthetargethost unpackingthedata.thesizeofthesepartsandwhethertheycanbeoverlapped dependondetailsofthesystem.inoursimulationwechargetheentirecostof Inrealsystems,partofthemigrationtimeisspentonthesourcehostpackaging itmakespreemptivemigrationlesseective. thepoliciesassimpleandassimilaraspossible.forbothtypesofmigration,we considerperformingamigrationonlywhenanewprocessisborn,eventhougha tion3.1withanon-preemptivemigrationstrategy,wherethenon-preemptivestrat- egyisgivenunfairadvantages.forpurposesofcomparison,wehavetriedtomake 5.1.1Strategies.WecomparethepreemptivemigrationstrategyproposedinSec- http://http.cs.berkeley.edu/harchol/loadbalancing.html. 2The trace-driven simulator and the trace data are available at
preemptivestrategymightbenetbyinitiatingmigrationsatothertimes.also, forbothstrategies,ahostisconsideredheavily-loadedanytimeitcontainsmore ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 17 gration.finally,weusethesamelocationpolicyinbothcases:thehostwiththe lowestinstantaneousloadischosenasthetargethost(tiesarebrokenbyrandom thanoneprocess;inotherwords,anytimeitwouldbesensibletoconsidermi- selection). areconsideredeligibleformigration: Thustheonlydierencebetweenthetwomigrationpoliciesiswhichprocesses isbornataheavily-loadedhost,theprocessisexecutedremotelyontheselected itsnameisonalistofprocessesthattendtobelong-lived.ifaneligibleprocess host.processescannotbemigratedoncetheyhavebegunexecution. Name-basednon-preemptivemigration.Aprocessiseligibleformigrationonlyif choseathresholdonmeandurationthatisempiricallyoptimal(forthissetofruns). Addingmorenamestothelistdetractsfromtheperformanceofthesystem,asit durationandselectingthe15commonnameswiththelongestmeandurations.we Wederivedthislistbysortingtheprocessesfromthetracesaccordingtonameand Theperformanceofthisstrategydependsonthelistofeligibleprocessnames. allowsmoreshort-livedprocessestobemigrated.removingnamesfromthelist detractsfromperformanceasitbecomesimpossibletomigrateenoughprocesses agedforsomefractionofitsmigrationcost.basedonthederivationinsection3.1, list,ourresultsmayoverestimatetheperformancebenetsofthisstrategy. thisfractionis1 tobalancetheloadeectively.sinceweusedthetracedataitselftoconstructthe Age-basedpreemptivemigration.Aprocessiseligibleformigrationonlyifithas causeitdoesnotallowthesystemtoinitiatemigrationsexceptwhenanewprocess processesthatsatisfythemigrationcriterionaremigratedaway. source(target)host.whenanewprocessisbornataheavily-loadedhost,all Thisstrategyunderstatestheperformancebenetsofpreemptivemigration,be- n?m,wheren(respectivelym)isthenumberofprocessesatthe arrives. morecomplicatedpredictorsoffutureloads,butnoneofthesepredictorsyielded signicantlybetterperformancethantheinstantaneousloadweusehere. AsdescribedinSection3.1,wealsomodeledotherlocationpoliciesbasedon theperformanceofthemigrantwillimproveinexpectation).oneofthestrategies wheneveraprocessbecomeseligible(sincetheeligibilitycriterionguaranteesthat thanwhenanewprocessarrives.ideally,onewouldliketoinitiateamigration weconsideredperformsperiodicchecksofeachprocessonaheavily-loadedhost Wealsoconsideredtheeectofallowingpreemptivemigrationattimesother followingperformancemetrics: toseeifanysatisfythecriterion.theperformanceofthisstrategyissignicantly betterthanthatofthesimplerpolicy(migratingonlyatprocessarrivaltimes). 5.1.2Metrics.Weevaluatetheeectivenessofeachstrategyaccordingtothe metricofsystemperformance.whenwecomputetheratioofmeanslowdowns(as fromdierentstrategies)wewillusenormalizedslowdown,whichistheratioof (thus,itisalwaysgreaterthanone).theaverageslowdownofalljobsisacommon Meanslowdown.Slowdownistheratioofwall-clockexecutiontimetoCPUtime
18 M.Harchol-BalterandA.B.Downey Fraction of procs 0.5 Distribution of slowdowns 0.4 0.3 0.2 smallslowdowns,buttheprocessesinthetailofthedistributionaremorenoticeableandannoying Fig.3.Distributionofprocessslowdownsforrun0(withnomigration).Mostprocessessuer 0.1 0 CPUtime.Forexample,ifthe(unnormalized)meanslowdowndropsfrom2:0to tousers. inactivetime(theexcessslowdowncausedbyqueueingandmigrationdelays)to 1 2 3 4 5 6 7 8 9 10 1:5,theratioofnormalizedmeanslowdownsis0:5=1:0=0:5:a50%reductionin Slowdown ofthetwostrategies;itunderstatestheadvantagesofthepreemptivestrategyfor thesetworeasons: delay. Skeweddistributionofslowdowns:Evenintheabsenceofmigration,themajority Meanslowdownaloneisnotasucientmeasureofthedierenceinperformance Userperception:Fromtheuser'spointofview,theimportantprocessesarethe Thevalueofthemeanslowdownisdominatedbythismajority. onesinthetailofthedistribution,becausealthoughtheyaretheminority,they ofprocessessuersmallslowdowns(typically80%arelessthan3.0.seefigure3). haveasmalleectonthemeanslowdown,butalargeeectonauser'sperception causethemostnoticeableandannoyingdelays.eliminatingthesedelaysmight ofperformance. Therefore,wewillalsoconsiderthefollowingthreemetrics: tailofthedistribution;i.e.thenumberofjobsthatexperiencelongdelays.(see maybemoremeaningfultointerpretthismetricasameasureofthelengthofthe tryingtoscheduletasks.inlightofthedistributionofslowdowns,however,it dictabilityofresponsetime[silberschatzetal.1994],whichisanuisanceforusers Varianceofslowdown.Thismetricisoftencitedasameasureoftheunpre- severelyimpactedbyqueueingandmigrationpenalties.(seefigures5cand5d). abledelaysexplicitly,weconsiderthenumber(orpercentage)ofprocessesthatare Figure5b). Numberofseverelyslowedprocesses.Inordertoquantifythenumberofnotice-
than:5seconds)aremoreperceivabletousersthandelaysinshortjobs.(see Meanslowdownoflongjobs.Delaysinlongerjobs(thosewithlifetimesgreater ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 19 Figure6). 5.2Sensitivitytomigrationcosts Inthissectionwecomparetheperformanceofthenon-preemptiveandpreemptive strategiesoverarangeofvaluesofr,f,bandm(themigrationcostparameters process)andb(thebandwidthofthenetwork).wechosethememorytransfercost Weconsideredarangeforthexedmigrationcostof:1<f<10seconds. denedinsection4). fromadistributionwiththesameshapeasthedistributionofprocesslifetimes, Thememorytransfercostisthequotientofm(thememorysizeofthemigrant Forthefollowingexperiments,wechosetheremoteexecutioncostr=:3seconds. featureofthisdistributionisthattherearemanyjobswithsmallmemorydemands seconds.theshapeofthisdistributionisbasedonaninformalstudyofmemory-use patternsonthesamemachinesfromwhichwecollectedtracedata.theimportant settingthemeanmemorytransfercost(mmtc)toarangeofvaluesfrom1to64 distributiondoesnotaecttheperformanceofeithermigrationstrategystrongly, butofcoursethemean(mmtc)doeshaveastrongeect. andafewjobswithverylargememorydemands.empirically,theexactformofthis migrationstrategiesusingnormalizedslowdown.specically,foreachoftheeight one-hourrunswecalculatethemean(respectivelystandarddeviation)oftheslowdownimposedonallprocessesthatcompleteduringthehour.foreachrun,we Figures4aand4barecontourplotsoftheratiooftheperformanceofthetwo thentaketheratioofthemeans(standarddeviations)ofthetwostrategies.lastly wetakethegeometricmeanoftheeightratios(fordiscussionofthegeometric mean,see[hennessyandpatterson1990]). tivemigration,namelythexedcost,f,andthemmtc,m=b.thecostofnon- preemptivemigration,r,isxedat0:3seconds.asexpected,increasingeitherthe xedcostofmigrationorthemmtchurtstheperformanceofpreemptivemigration.thecontourlinemarked1:0indicatesthecrossoverwheretheperformance ThetwoaxesinFigure4representthetwocomponentsofthecostofpreemp- non-preemptivemigration.whenthexedcostofmigrationorthemmtcare ofpreemptiveandnon-preemptivemigrationisequal(theratiois1:0).forsmaller valuesofthecostparameters,preemptivemigrationperformsbetter;forexample, ifthexedmigrationcostis0.3secondsandthemmtcis2seconds,thenormalizedmeanslowdownwithpreemptivemigrationisalmost40%lowerthanwith unaectedbythesecostssothenon-preemptivestrategycanbemoreeective. veryhigh,almostallprocessesareineligibleforpreemptivemigration;thus,the preemptivestrategydoesalmostnomigrations.thenon-preemptivestrategyis aregionwherepreemptivemigrationyieldsahighermeanslowdownthannonpreemptivemigration,butalowerstandarddeviation.thereasonforthisisthadowns.thecrossoverpoint wherenon-preemptivemigrationsurpassespreemp- Figure4bshowstheeectofmigrationcostsonthestandarddeviationofslowtivemigration isconsiderablyhigherherethaninfigure4a.thusthereis turnsouttobeshort-lived.theseprocessessuerlargedelays(relativetotheir non-preemptivemigrationoccasionallychoosesaprocessforremoteexecutionthat
20 M.Harchol-BalterandA.B.Downey (a) 10 Ratio of mean slowdowns Fixed migration cost (sec.) 3 1 0.6 X 0.7 0.8 0.9 NON PREEMPTIVE BETTER HERE 1.0 PREEMPTIVE BETTER HERE 1.1 1.2 (b) 0.1 1 2 4 8 16 32 64 Mean memory transfer cost (sec.) 10 Ratio of std of slowdowns 1.0 Fixed migration cost (sec.) 0.9 0.8 3 0.7 0.6 1 0.5 Fig.4.(a)Theperformanceofpreemptivemigrationrelativetonon-preemptivemigrationdeterioratesasthecostofpreemptivemigrationincreases.Thetwoaxesarethetwocomponentsof PREEMPTIVE BETTER EXCEPT UPPER RIGHT X ofslowdownmaygiveabetterindicationofauser'sperceptionofsystemperformancethanmean theparticularsetofparameterswewillconsiderinthenextsection.(b)thestandarddeviation thepreemptivemigrationcost.thecostofnon-preemptivemigrationisheldxed.thexmarks slowdown.bythismetric,thebenetofpreemptivemigrationisevenmoresignicant. 0.1 1 2 4 8 16 32 64 Mean memory transfer (sec.)
runtimes)andaddtothetailofthedistributionofslowdowns.inthenextsection, weshowcasesinwhichthestandarddeviationofslowdownsisactuallyworsewith ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 21 non-preemptivemigrationthanwithnomigrationatall(threeoftheeightruns). tems(seesection4.2)andusethemtoexaminemorecloselytheperformanceof thetwomigrationstrategies.thevalueswechoseare: Inthissectionwechoosemigrationcostparametersrepresentativeofcurrentsys- 5.3Comparisonofpreemptiveandnon-preemptivestrategies r:thecostofremoteexecution,0.3seconds f:thexedcostofpreemptivemigration,0.3seconds m:themeanmemorysizeofmigrantprocesses,1mb b:thememorytransferbandwidth,0.5mbpersecond point(comparedtothecaseofnomigration). withanx.figure5showstheperformanceofthetwomigrationstrategiesatthis InFigures4aand4b,thepointcorrespondingtotheseparametervaluesismarked bylessthan20%formostruns(and40%forthetworunswiththehighestloads). Preemptivemigrationreducesthenormalizedmeanslowdownby50%formost runs(andmorethan60%fortwooftheruns).theperformanceimprovementof preemptivemigrationovernon-preemptivemigrationistypicallybetween35%and Non-preemptivemigrationreducesthenormalizedmeanslowdown(Figure5a) 50%. metricstotrytoquantifythesebenets.figure5bshowsthestandarddeviation and5dexplicitlymeasurethenumberofseverelyimpactedprocesses,accordingto statestheperformancebenetsofpreemptivemigration.wehaveproposedother ofslowdowns,whichreectsthenumberofseverelyimpactedprocesses.figures5c Asdiscussedabove,wefeelthatthemeanslowdown(normalizedornot)under- twodierentthresholdsofacceptableslowdown.bythesemetrics,thebenetsof non-preemptivemigrationappearsmuchgreater.forexampleinfigure5d,inthe migrationingeneralappeargreater,andthediscrepancybetweenpreemptiveand absenceofmigration,7{18%ofprocessesareslowedbyafactorof5ormore. Non-preemptivemigrationisabletoeliminate42{62%ofthese,whichisasignicantbenet,butpreemptivemigrationconsistentlyeliminatesnearlyall(86{97%) severedelays. nomigrationatall.forthepreemptivemigrationstrategy,thisoutcomeisnearly allprocessesinvolved(inexpectation).intheworstcase,then,thepreemptive migrationactuallymakestheperformanceofthesystemworsethaniftherewere impossible,sincemigrationsareonlyperformediftheyimprovetheslowdownsof AnimportantobservationfromFigure5bisthatforseveralruns,non-preemptive formanceasloadincreases(asshowninfigure5).inthepresenceofpreemptive strategywilldonoworsethanthecaseofnomigration(inexpectation). regardlessoftheoverallloadonthesystem. migration,boththemeanandstandarddeviationofslowdownarenearlyconstant, Anotherbenetofpreemptivemigrationisgracefuldegradationofsystemper-
22 M.Harchol-BalterandA.B.Downey 4.0 Mean slowdown no migration non-preemptive, name-based migration preemptive, age-based migration 3.0 2.0 1.0 4.0 0 1 2 3 4 5 6 7 run number Standard deviation of slowdown no migration non-preemptive, name-based migration preemptive, age-based migration 3.0 2.0 1.0 0.0 30% 0 1 2 3 4 5 6 7 run number Processes slowed by a factor of 3 or more no migration non-preemptive, name-based migration preemptive, age-based migration 20% 10% 0% 20% 0 1 2 3 4 5 6 7 run number Processes slowed by a factor of 5 or more no migration non-preemptive, name-based migration preemptive, age-based migration 15% Fig.5.(a)Meanslowdown.(b)Standarddeviationofslowdown.(c)Percentageofprocesses slowedbyafactorof3ormore.(d)percentageofprocessesslowedbyafactorof5ormore. 10% 5% 0% 0 1 2 3 4 5 6 7 run number
Thealternatemetricsdiscussedaboveshedsomelightonthereasonsfortheperformancedierencebetweenpreemptiveandnon-preemptivemigration.Weconsider onapotentialmigrant,and,moreimportantly,inictsdelaysonshortjobsthat twokindsofmistakesamigrationsystemmightmake: areforcedtoshareaprocessorwithacpuhog.undernon-preemptivemigration, thiserroroccurswheneveralong-livedprocessisnotonthename-list,possibly becauseitisanunknownprogramoranunusuallylongexecutionofatypically Failingtomigratelong-livedjobs.Thistypeoferrorimposesmoderateslowdowns 5.4Whypreemptivemigrationoutperformsnon-preemptivemigration ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 23 short-livedprogram.preemptivemigrationcancorrecttheseerrorsbymigrating ancing.undernon-preemptivemigration,thiserroroccurswhenaprocesswhose longjobslaterintheirlives. strategyallbuteliminatesthistypeoferrorbyguaranteeingthattheperformance nameisontheeligiblelistturnsouttobeshort-lived.ourpreemptivemigration migratedprocess,wastesnetworkresources,andfailstoeectsignicantloadbal- Migratingshort-livedjobs.Thistypeoferrorimposeslargeslowdownsonthe ofamigrantimprovesinexpectation. becauseonelongjobonabusymachinewillimpedemanysmalljobs.thiseectis suggeststhatabusyhostislikelytoreceivemanyfuturearrivals. aggravatedbytheserialcorrelationbetweenarrivaltimes(seesection5.1),which Evenoccasionalmistakesoftherstkindcanhavealargeimpactonperformance, jobsformigration.toevaluatethisability,weconsidertheaveragelifetimeofthe processeschosenformigrationundereachpolicy.undernon-preemptivemigration, processesis0:4seconds),andthemedianlifetimeofmigrantswas0:9seconds.the theaveragelifetimeofmigrantprocesseswas2:0seconds(themeanlifetimeforall Thus,animportantfeatureofamigrationpolicyisitsabilitytoidentifylong-lived non-preemptivepolicymigratedabout1%ofalljobs,whichaccountedfor5:7%of thetotalcpu. medianlifetimeofmigrantswas2:0seconds.thepreemptivepolicymigrated4% agelifetimeofmigrantprocessesunderpreemptivemigrationwas4:9seconds;the ofalljobs,butsincethesemigrantswerelong-lived,theyaccountedfor55%ofthe totalcpu. Thepreemptivemigrationpolicywasbetterabletoidentifylongjobs;theaver- identifylongjobsaccuratelyandtomigratethosejobsawayfrombusyhosts. Theseoutliersarereectedinthestandarddeviationofslowdowns becausethe downforallprocesses,butitdidimposelargeslowdownsonsomesmallprocesses. Thustheprimaryreasonforthesuccessofpreemptivemigrationisitsabilityto darddeviationofslowdownsworsethanwithnomigration(seefigure5b).the non-preemptivepolicysometimesmigratesveryshortjobs,itcanmakethestan- Thesecondtypeoferrordidnothaveasgreatanimpactonthemeanslow- age-basedpreemptivemigrationcriterioneliminatesmosterrorsofthistypeby guaranteeingthattheperformanceofthemigrantwillimproveinexpectation. targethostmayhavealowloadwhenamigrationisinitiated,butitsloadmay emptivemigrationthanfornon-preemptivemigration:staleloadinformation.a haveincreasedbythetimethemigrantarrives.thisismorelikelyforapreemptive Thereis,however,onetypeofmigrationerrorthatismoreproblematicforpre-
24 migrationbecausethemigrationtimeislonger.inoursimulations,wefoundthat theseerrorsdooccur,althoughinfrequentlyenoughthattheydonothaveasevere M.Harchol-BalterandA.B.Downey runswithhighloadsitwas0:7%).somewhatmoreoften,3%ofthetime,amigrant hostandfoundthattheloadwashigherthanithadbeenatthesourcehostwhen impactonperformance. migrationbegan.formostruns,thisoccurredlessthan0:5%ofthetime(fortwo processarrivedatatargethostandfoundthattheloadatthetargetwasgreater Specically,wecountedthenumberofmigrantprocessesthatarrivedatatarget thanthecurrentloadatthesource.theseresultssuggestthattheperformanceof worktracthatresultsfromlargememorytransfers.inoursimulations,wedid MOSIX. apreemptivemigrationstrategymightbeimprovedbyareservationsystemasin grationwouldnotbeexcessive.thisassumptionseemstobereasonable:under notmodelnetworkcongestion,ontheassumptionthatthetracgeneratedbymi- ourpreemptivemigrationstrategyfewerthan4%ofprocessesaremigratedonce Oneotherpotentialproblemwithpreemptivemigrationisthevolumeofnet- jobsandmovethemawayfrombusyhosts overcomesitsdisadvantages(longer andfewerthan:25%ofprocessesaremigratedmorethanonce.furthermore,there migrationtimesandstaleloadinformation). isseldommorethanonemigrationinprogressatatime. Insummary,theadvantageofpreemptivemigration itsabilitytoidentifylong inglongjobsandmigratingthemawayfrombusyhostshelpsnotonlythelongjobs (whichrunonmorelightly-loadedhosts)butalsotheshortjobsthatrunonthe andmeasuredtheperformancebenetforeachgroupduetomigration.thenumberofjobsinshortgroupisroughlytentimesthenumberinthemediumgroup, whichinturnisroughlytentimesthenumberinthelonggroup.figure6shows thatmigrationreducesthemeanslowdownofallthreegroups:fornon-preemptive sourcehost.totestthisclaim,wedividedtheprocessesintothreelifetimegroups 5.4.1Eectofmigrationonshortandlongjobs.Wehaveclaimedthatidentify- migrationtheimprovementisthesameforallgroups;underpreemptivemigration thelongjobsenjoyaslightlygreaterbenet. accordingtotheirlifetimes.thusthemeanresidencetimemetricreects,primarily,theperformancebenetforlongjobs.underthemeanresidencetimemetric, group,shortjobs.anothercommonmetric,residencetime,eectivelyweightsjobs alljobs;asaresult,themeanslowdownmetricisdominatedbythemostpopulous temperformance.themetricweareusinghere,slowdown,givesequalweightto Thisbreakdownbylifetimegroupisusefulforevaluatingvariousmetricsofsys- then,preemptivemigrationappearsevenmoreeective. AsderivedinSection3.1,theminimumageforamigrantprocessaccordingtoour 5.5Evaluationofanalyticmigrationcriterion analyticcriterionis wherenistheloadatthesourcehostandmistheloadatthetargethost(including migrationage=migrationcost Minimum n?m
ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 25 Mean slowdown by age group 3.0 2.5 no migration non preemptive migration preemptive migration 2.0 Fig.6.Meanslowdownbrokendownbylifetimegroup. 1.5 1.0 dur <.5s.5s < dur < 5s dur > 5s. Duration Mean slowdown 2.5 best fixed parametric min. age analytic minimum age 2.0 Fig.7.Themeanslowdownforeightruns,usingthetwocriteriaforminimummigrationage. 1.5 Thevalueofthebestxedparameterisshowninparenthesesforeachrun. 1.0 0 1 2 3 4 5 6 7 (0.3) (0.5) (0.5) (0.3) (0.3) (0.5) (0.8) (0.9)
26 thepotentialmigrant). Wecomparetheanalyticcriterionwiththexedparametercriterion: M.Harchol-BalterandA.B.Downey whereisafreeparameter.thisparameterismeanttomodelpreemptivemigrationstrategiesintheliterature,asdiscussedinsection3.2.forcomparison, migrationage=migrationcost Minimum considerableadvantage. thebestxedparameter.thebestxedparametervariesconsiderablyfromrun thesmallestmeanslowdown.ofcourse,thisgivesthexedparametercriteriona wewillusethebestxedparameter,whichis,foreachrun,thevaluethatyields torun,andappearstoberoughlycorrelatedwiththeaverageloadduringtherun (therunsaresortedinincreasingorderoftotalload). Figure7comparestheperformanceoftheanalyticminimumagecriterionwith somany(usuallyhand-tuned)parameters. Wefeelthattheeliminationofonefreeparameterisausefulresultinanareawith isthatitisparameterless,andthereforemorerobustacrossavarietyofworkloads. performanceofthebestxedvaluecriterion.theadvantageoftheanalyticcriterion Theperformanceoftheanalyticcriterionisalwayswithinafewpercentofthe 6.WEAKNESSESOFTHEMODEL thisworkload(seesection3.2). istoolow,andtheparameterusedinmosix(=1:0)istoohigh,atleastfor ThisresultalsosuggeststhattheparameterusedbyKruegerandLivny(=0:1) Oursimulationignoresanumberoffactorsthatwouldaecttheperformanceof migrationinrealsystems: boundjobs,however,cpucontentionhaslittleeectonresponsetime.thesejobs sponsetimenecessarilyimprovesiftheyrunonahostwithalighterload.fori/o wouldbenetlessfrommigration.toseehowlargearolethisplaysinourresults, wenotedthenamesoftheprocessesthatappearmostfrequentlyinourtraces CPU-boundjobsonly.OurmodelconsidersalljobsCPU-bound;thus,theirre- (withcputimegreaterthan1second,sincethesearetheprocessesmostlikelyto bemigrated).themostcommonnameswerecc1plusandcc1,bothofwhichare CPUbound.Nextmostfrequentwere:trn,cpp,ld,jove(aversionofemacs),and isreasonableformanyofthemostcommonjobs.insection7wediscussfurther implicationsofaworkloadincludinginteractive,i/o-bound,andnon-migratable ps.soalthoughsomejobsinourtracesareinrealityinteractive,oursimplemodel newpropertyofprocesslifetimes.inanenvironmentwithadierentdistribution, thisstrategywillnotbeeective. Environment.Ourmigrationstrategytakesadvantageoftheused-better-than- round-robin.otherpolicies,likefeedbackscheduling,canreducetheimpactof longjobsontheperformanceofshortjobs,andtherebyreducetheneedforload balancing.weexplorethisissueinmoredetailinsection7andndthatpreemptive migrationisstillbenecialunderfeedbackscheduling. Localscheduling.Weassumethatlocalschedulingonthehostsissimilarto
ameasureddistributionandthereforeourmodelignoresanycorrelationbetween Memorysize.Oneweaknessofourmodelisthatwechosememorysizesfrom ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 27 lationbetweenmemorysizeandprocesscpuusage.kruegerandlivnyreport conductedaninformalstudyofprocessesinourdepartment,andfoundnocorre- memorysizeandotherprocesscharacteristics.tojustifythissimplication,we asimilarobservation[kruegerandlivny1988].thus,thismaybeareasonable simplication. tracasaresultofprocessmigration.weobserve,however,thatfortheloadlevels wesimulated,migrationsareoccasional(oneeveryfewseconds),andthatthereis seldommorethanonemigrationinprogressatatime. Networkcontention.Ourmodeldoesnotconsidertheeectofincreasednetwork 7.FUTUREWORK InourworkloadmodelwehaveassumedthatallprocessesareCPU-bound.Ofprimaryinterestinfutureworkisincludinginteractive,I/O-bound,andnon-migratable willbesomejobsthatshouldnotorcannotbemigrated.ani/o-boundjob,for example,willnotnecessarilyrunfasteronamorelightly-loadedhost,andmight migratedinteractivejobmightbenetbyrunningonamorelightly-loadedhost runslowerifitismigratedawayfromthediskorotheri/odeviceituses.a Inaworkloadthatincludesinteractivejobs,I/O-boundjobs,anddaemons,there jobsintoourworkload. interactions.finally,somejobs(e.g.manydaemons)cannotbemigratedaway ifitusessignicantcputime,butwillsuerperformancepenaltiesforallfuture fromtheirhosts. thesecostsmightbebasedontherecentbehaviorofthejob;e.g.thenumberand gration,includingnetworkdelays,accesstonon-localdata,etc.theestimatesof migrationcosttheadditionalcoststhatwillbeimposedonthesejobsaftermipropriatelywithinteractiveandi/oboundjobsbyincludinginthedenitionof Thepolicyweproposedforpreemptivemigrationcanbeextendedtodealap- frequencyofi/orequestsandinteractions.jobsthatareexplicitlyforbiddento migratecouldbeassignedaninnitemigrationcost. mightreducetheabilityofthemigrationpolicytomoveworkaroundthenetwork majorityofcputime.thus,evenifthemigrationpolicywereonlyabletomigrate andbalanceloadseectively.however,weobservethatthemajorityoflong-lived jobsare,infact,cpu-bound,anditistheselong-livedjobsthatconsumethe Thepresenceofasetofjobsthatareeitherexpensiveorimpossibletomigrate asubsetofthejobsinthesystem,itcouldstillhaveasignicantload-balancing eect. signicantincpuloadbalancing. generalenvironmentisthatn(thenumberofjobsatthesource)andm(thenumber typesofjobs,sinceonlycpu-boundjobsaectcpucontention,andthereforeare ofjobsatthetargethost)shoulddistinguishbetweencpu-boundjobsandother Anotherwayinwhichtheproposedmigrationpolicyshouldbealteredinamore atthehosts.mostpriorstudiesofloadbalancinghaveassumed,aswedo,thatthe localschedulingisround-robin(orprocessor-sharing).afewassumerst-come- rst-serve(fcfs)scheduling,butfewerstillhavestudiedtheeectoffeedback Anotherimportantconsiderationinloadbalancingistheeectoflocalscheduling
28 scheduling,whereprocessesthathaveusedtheleastcputimearegivenpriority overolderprocesses.wesimulatedfeedbackschedulingandfoundthatitgreatly M.Harchol-BalterandA.B.Downey onload)evenwithoutmigration.thus,thepotentialbenetofeithertypeof reducedmeanslowdown(fromapproximately2:5tobetween1:2and1:7,depending slowdownin5ofthe8runs,andonlydecreasingitby11%inthebestcase(highest load). backscheduling,andfoundthatitoftenmakesthingsworse,increasingthemean migrationisgreatlyreduced. Toevaluateourpreemptivepolicy,wehadtochangethemigrationcriterion Weevaluatedthenon-preemptivemigrationpolicyfromSection5.1.1underfeed- toreecttheeectoflocalscheduling.underprocessorsharing,weassumethat theslowdownimposedonaprocessisequaltothenumberofprocessesonthe andtargethoststhatareyoungerthanthemigrantprocess.usingthiscriterion, processes,sinceolderprocesseshavelowerpriority.thus,wemodiedthemigration host.underfeedbackscheduling,theslowdownisclosertothenumberofyounger preemptivemigrationreducesmeannormalizedslowdownby12{32%,andreduces criterioninsection3.1sothatnandmarethenumberofprocessesatthesource thenumberofseverelyslowedprocesses(slowdowngreaterthan5)by30{60%. realsystemsasitwasinoursimulations.forexample,decay-usageschedulingas interactionorotheri/oaregivenhigherpriority,whichallowsthemtointerfere usedinunixhassomecharacteristicsofbothround-robinandfeedbackpolicies [Epema1995].Youngjobsdohavesomeprecedence,butoldjobsthatperform Anissuethatremainsunresolvediswhetherfeedbackschedulingisaseectivein withshortjobs.inourexperimentsonasparcworkstationrunningsunos,we foundthatalong-runningjobthatperformsperiodici/ocanobtainmorethan50% ofthecputime,evenifitissharingahostwithmuchyoungerprocesses.the morerecentlotteryschedulingbehavesmorelikeprocessor-sharing[waldspurger andweihl1994].tounderstandtheeectoflocalschedulingonloadbalancing requiresaprocessmodelthatincludesinteractionandi/o. 8.CONCLUSIONS Toevaluatemigrationstrategies,itisimportanttomodelthedistributionof jobsareexpectedtobelong-lived.evenalifetimedistributionthatmatchesthe timatethebenetsofpreemptivemigration,becauseitignoresthefactthatold processlifetimesaccurately.assuminganexponentialdistributioncanunderes- Preemptivemigrationoutperformsnon-preemptivemigrationevenwhenmemorytransfercostsarehigh,forthefollowingreason:non-preemptivename-based andevaluatingloadbalancingpolicies. strategieschooseprocessesformigrationthatareexpectedtohavelonglives.if thispredictioniswrong,andaprocessrunslongerthanexpected,itcannotbe migratedaway,andmanysubsequentsmallprocesseswillbedelayed.apreemp- measureddistributioninbothmeanandvariancemaybemisleadingindesigning Migratingalongjobawayfromabusyhosthelpsnotonlythelongjob,but tivestrategyisabletopredictlifetimesmoreaccurately(basedonage)and,more importantly,ifthepredictioniswrong,thesystemcanrecoverbymigratingthe processlater.
alsothemanyshortjobsthatareexpectedtoarriveatthehostinthefuture.a busyhostisexpectedtoreceivemanyarrivalsbecauseoftheserialcorrelation ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 29 Usingthefunctionalformofthedistributionofprocesslifetimes,wehavederived (burstiness)ofthearrivalprocess. Exclusiveuseofmeanslowdownasametricofsystemperformanceunderstates acriterionfortheminimumtimeaprocessmustagebeforebeingmigrated.this benetsofpreemptiveloadbalancing.oneperformancemetricwhichismore thebenetsofloadbalancingasperceivedbyusers,andespeciallyunderstatesthe criterionisparameterlessandrobustacrossarangeofloads. Althoughpreemptivemigrationisdiculttoimplement,severalsystemshave relatedtouserperceptionisthenumberofseverelyslowedprocesses.while chosentoimplementitforreasonsotherthanloadbalancing.ourresultssuggestthesesystemswouldbenetfromincorporatingapreemptiveloadbalancing migrationreducesthembyafactoroften. non-preemptivemigrationeliminateshalfofthesenoticeabledelays,preemptive ACKNOWLEDGMENTS WewouldliketothankTomAndersonandthemembersoftheNOWgroupfor policy. commentsandsuggestionsonourexperimentalsetup.wearealsogreatlyindebted totheanonymousreviewersfromsigmetricsandsosp,andtojohnzahorjan, REFERENCES whosecommentsgreatlyimprovedthequalityofthispaper. Abrahams,D.M.andRizzardi,F.1988.TheBerkeleyInteractiveStatisticalSystem.W. Agrawal,R.andEzzet,A.1987.LocationindependentremoteexecutioninNEST.IEEE Ahmad,I.,Ghafoor,A.,andMehrotra,K.1991.Performancepredictionofdistributed W.NortonandCo. TransactionsonSoftwareEngineering13,8(August),905{912. Barak,A.,Shai,G.,andWheeler,R.G.1993.TheMOSIXDistributedOperating Artsy,Y.andFinkel,R.1989.Designingaprocessmigrationfacility:TheCharlotte loadbalancingonmulticomputersystems.insupercomputing(1991),pp.830{839. experience.ieeecomputer,47{56. Braverman,A.1995.PersonalCommunication. Bonomi,F.andKumar,A.1990.Adaptiveoptimalloadbalancinginanonhomogeneous (October),1232{1250. multiserversystemwithacentraljobscheduler.ieeetransactionsoncomputers39,10 System:LoadBalancingforUNIX.SpringerVerlag,Berlin. Bryant,R.M.andFinkel,R.A.1981.Astabledistributedschedulingalgorithm.In2nd Cabrera,F.1986.Theinuenceofworkloadonloadbalancingstrategies.InProceedings Casas,J.,Clark,D.L.,Konuru,R.,Otto,S.W.,Prouty,R.M.,andWalpole,J. oftheusenixsummerconference(june1986),pp.446{458. InternationalConferenceonDistributedComputingSystems(1981),pp.314{323. Casavant,T.L.andKuhl,J.G.1987.Analysisofthreedynamicdistributedloadbalancingstrategieswithvaryingglobalinformationrequirements.In7thInternational 1995.Mpvm:Amigrationtransparentversionofpvm.ComputingSystems8,2(Spring), ConferenceonDistributedComputingSystems(September1987),pp.185{192. 171{216.
30 M.Harchol-BalterandA.B.Downey Chowdhury,S.1990.Thegreedyloadsharingalgorithm.JournalofParallelandDistributedComputing9,93{99. DePaoli,D.andGoscinski,A.1995.Therhodosmigrationfacility.JournalofSystems andsoftware.submitted.seealsohttp://www.cm.deakin.edu.au/rhodos/. Douglis,F.andOusterhout,J.1991.Transparentprocessmigration:Designalternatives andthespriteimplementation.software{practiceandexperience21,8(august),757{ 785. Downey,A.B.andHarchol-Balter,M.1995.Anoteon\Thelimitedperformance benetsofmigratingactiveprocessesforloadsharing".technicalreportucb/csd-95-888(november),universityofcalifornia,berkeley. Eager,D.L.,Lazowska,E.D.,andZahorjan,J.1986.Adaptiveloadsharinginhomogeneousdistributedsystems.IEEETransactionsonSoftwareEngineering12,5(May), 662{675. Eager,D.L.,Lazowska,E.D.,andZahorjan,J.1988.Thelimitedperformancebenets ofmigratingactiveprocessesforloadsharing.inacmsigmetricsconferenceonmeasuring andmodelingofcomputersystems(may1988),pp.662{675. Epema,D.1995.Ananalysisofdecay-usageschedulinginmultiprocessors.InACMSigmetricsConferenceonMeasurementandModelingofComputerSystems(1995),pp.74{85. Evans,D.J.andButt,W.U.N.1993.Dynamicloadbalancingusingtask-transferprobabilites.ParallelComputing19,897{916. Hac,A.andJin,X.1990.Dynamicloadbalancinginadistributedsystemusingasenderinitiatedalgorithm.JournalofSystemsSoftware11,79{94. Hennessy,J.L.andPatterson,D.A.1990.ComputerArchitectureAQuantitativeApproach.MorganKaufmannPublishers,SanMateo,CA. Krueger,P.andLivny,M.1988.Acomparisonofpreemptiveandnon-preemptiveload distributing.in8thinternationalconferenceondistributedcomputingsystems(june 1988),pp.123{130. Kunz,T.1991.Theinuenceofdierentworkloaddescriptionsonaheuristicloadbalancing scheme.ieeetransactionsonsoftwareengineering17,7(july),725{730. Larsen,R.J.andMarx,M.L.1986.Anintroductiontomathematicalstatisticsandits applications(2nded.).prenticehall,englewoodclis,n.j. Leland,W.E.andOtt,T.J.1986.Load-balancingheuristicsandprocessbehavior.In ProceedingsofPerformance'86andACMSigmetrics,Volume14(1986),pp.54{69. Lin,H.-C.andRaghavendra,C.1993.Astate-aggregationmethodforanalyzingdynamic load-balancingpolicies.inieee13thinternationalconferenceondistributedcomputing Systems(May1993),pp.482{489. Litzkow,M.andLivny,M.1990.ExperiencewiththeCondordistributedbatchsystem. InIEEEWorkshoponExperimentalDistributedSystems(1990),pp.97{101. Litzkow,M.,Livny,M.,andMutka,M.1988.Condor-ahunterofidleworkstations. In8thInternationalConferenceonDistributedComputingSystems(June1988). Livny,M.andMelman,M.1982.Loadbalancinginhomogeneousbroadcastdistributed systems.inacmcomputernetworkperformancesymposium(april1982),pp.47{55. Milojicic,D.S.1993.LoadDistribution:ImplementationfortheMachMicrokernel.PhD Dissertation,UniversityofKaiserslautern. Mirchandaney,R.,Towsley,D.,andStankovic,J.A.1990.Adaptiveloadsharing inheterogeneousdistributedsystems.journalofparallelanddistributedcomputing9, 331{346. Powell,M.andMiller,B.1983.ProcessmigrationsinDEMOS/MP.In6thACMSymposiumonOperatingSystemsPrinciples(November1983),pp.110{119. Prouty,R.1996.PersonalCommunication. Pulidas,S.,Towsley,D.,andStankovic,J.A.1988.Imbeddinggradientestimators inloadbalancingalgorithms.in8thinternationalconferenceondistributedcomputing Systems(June1988),pp.482{490.
Rommel,C.G.1991.Theprobabilityofloadbalancingsuccessinahomogeneousnetwork. Rosin,R.F.1965.Determiningacomputingcenterenvironment.Communicationsofthe IEEETransactionsonSoftwareEngineering17,922{933. ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 31 Silberschatz,A.,Peterson,J.,andGalvin,P.1994.OperatingSystemConcepts,4th Svensson,A.1990.History,anintelligentloadsharinglter.InIEEE10thInternational ACM8,7. Tanenbaum,A.,vanRenesse,R.,vanStaveren,H.,andSharp,G.1990.Experiences Edition.Addison-Wesley,Reading,MA. Theimer,M.M.,Lantz,K.A.,andCheriton,D.R.1985.Preemptableremoteexecution ConferenceonDistributedComputingSystems(1990),pp.546{553. withtheamoebadistributedoperatingsystem.communicationsoftheacm,336{346. facilitiesforthev-system.in10thacmsymposiumonoperatingsystemsprinciples Vahdat,A.M.,Ghormley,D.P.,andAnderson,T.E.1994.Ecient,portable,and Vahdat,A.1995.PersonalCommunication. Thiel,G.1991.Locusoperatingsystem,atransparentsystem.ComputerCommunications14,6,336{346. (December1985),pp.2{12. Waldspurger,C.A.andWeihl,W.E.1994.Lotteryscheduling:Flexibleproportionalshareresourcemanagement.InProceedingsoftheFirstSymposiumonOperatingSystem robustextensionofoperatingsystemfunctionality.technicalreportucb//csd-94-842, Wang,J.,Zhou,S.,K.Ahmed,andLong,W.1993.LSBATCH:Adistributedloadsharing UniversityofCalifornia,Berkeley. Wang,Y.-T.andMorris,R.J.1985.Loadsharingindistributedsystems.IEEETransactionsonComputersc-94,3(March),204{217. batchsystem.technicalreportcsri-286(april),computersystemsresearchinstitute, DesignandImplementation(November1994),pp.1{11. Zayas,E.R.1987.Attackingtheprocessmigrationbottleneck.In11thACMSymposium UniversityofToronto. Zhang,Y.,Hakozaki,K.,Kameda,H.,andShimizu,K.1995.Aperformancecomparison onoperatingsystemsprinciples(1987),pp.13{24. ofadaptiveandstaticloadbalancinginheterogeneousdistributedsystems.inproceedings Zhou,S.andFerrari,D.1987.Ameasurementstudyofloadbalancingperformance.In Zhou,S.1987.Performancestudiesfordynamicloadbalancingindistributedsystems. Ph.D.Dissertation,UniversityofCalifornia,Berkeley. ofthe28thannualsimulationsymposium(april1995),pp.332{340. Zhou,S.,Wang,J.,Zheng,X.,andDelisle,P.1993.Utopia:aload-sharingfacil- IEEE7thInternationalConferenceonDistributedComputingSystems(October1987), ityforlargeheterogeneousdistributedcomputingsystems.software{practiceandex- peience23,2(december),1305{1336. pp.490{497.