Exploiting Process Lifetime Distributions for Dynamic Load Balancing


Exploiting Process Lifetime Distributions for Dynamic Load Balancing

Mor Harchol-Balter and Allen B. Downey
University of California, Berkeley

We consider policies for CPU load balancing in networks of workstations. We address the question of whether preemptive migration (migrating active processes) is necessary, or whether remote execution (migrating processes only at the time of birth) is sufficient for load balancing. We show that resolving this issue is strongly tied to understanding the process lifetime distribution. Our measurements indicate that the distribution of lifetimes for UNIX processes is Pareto (heavy-tailed), with a consistent functional form over a variety of workloads. We show how to apply this distribution to derive a preemptive migration policy that requires no hand-tuned parameters. We use a trace-driven simulation to show that our preemptive migration strategy is far more effective than remote execution, even when the memory transfer cost is high.

Categories and Subject Descriptors: unknown [unknown]: unknown
General Terms: unknown
Additional Key Words and Phrases: Load balancing, load sharing, migration, remote execution, workload modeling, trace-driven simulation, network of workstations, heavy-tailed, Pareto distribution

Mor Harchol-Balter supported by National Physical Science Consortium (NPSC) Fellowship and NSF grant number CCR-9201092. Allen Downey partially supported by NSF (DARA) grant DMW-8919074.
Address: {harchol,downey}@cs.berkeley.edu, University of California, Berkeley, CA 94720.
An earlier version of this paper appeared in the Proceedings of the ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems (May 23-26, 1996) pp. 13-24.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works, requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept, ACM Inc., 1515 Broadway, New York, NY 10036 USA, fax +1 (212) 869-0481, or permissions@acm.org.

1. INTRODUCTION

Most systems that perform load balancing use remote execution (i.e. non-preemptive migration) based on a priori knowledge of process behavior, often in the form of a list of process names eligible for migration. Although some systems are capable of migrating active processes, most do so only for reasons other than load balancing, such as preserving autonomy. A previous analytic study by Eager et al. discourages implementing preemptive migration for load balancing, showing that the additional performance benefit of preemptive migration is small compared with the benefit of simple non-preemptive migration schemes [Eager et al. 1988]. But simulation studies, which can use more realistic workload descriptions, and implemented systems have shown greater benefits for preemptive migration [Krueger and Livny 1988] [Barak et al. 1993]. This paper uses a measured distribution of process lifetimes and a trace-driven simulation to investigate these conflicting results.

1.1 Load balancing taxonomy

On a network of shared processors, CPU load balancing is the idea of migrating processes across the network from hosts with high loads to hosts with lower loads. The motivation for load balancing is to reduce the average completion time of processes and improve the utilization of the processors. Analytic models and simulation studies have demonstrated the performance benefits of load balancing, and these results have been confirmed in existing distributed systems (see Section 1.4).

An important part of the load balancing strategy is the migration policy, which determines when migrations occur and which processes are migrated. This is the question we address in this paper. The other half of a load balancing strategy is the location policy: the selection of a new host for the migrated process. Previous work has suggested that simply choosing the target host with the shortest CPU run queue is both simple and effective [Zhou 1987] [Kunz 1991]. Our work confirms the relative unimportance of location policy.

Process migration for purposes of load balancing comes in two forms: remote execution, also called non-preemptive migration, in which some new processes are (possibly automatically) executed on remote hosts, and preemptive migration, in which running processes may be suspended, moved to a remote host, and restarted. In non-preemptive migration only newborn processes are migrated.

Load balancing may be done explicitly (by the user) or implicitly (by the system). Implicit migration policies may or may not use a priori information about the function of processes, how long they will run, etc.

Since the cost of remote execution is usually significant relative to the average lifetime of processes, implicit non-preemptive policies require some a priori information about job lifetimes. This information is often implemented as an eligibility list that specifies by process name which processes are worth migrating [Svensson 1990] [Zhou et al. 1993].

In contrast, most preemptive migration policies do not use a priori information, since this is often difficult to maintain and preemptive strategies can perform well without it. These systems use only system-visible data like the current age of each process or its memory size.

This paper examines the performance benefits of preemptive, implicit load balancing strategies that assume no a priori information about processes. We answer the following two questions:

(1) Is preemptive migration worthwhile, given the additional cost (CPU and latency) associated with migrating an active process?
(2) Which active processes, if any, are worth migrating?

1.2 Process model

In our model, processes use two resources: CPU and memory (we do not consider I/O). Thus, we use "age" to mean CPU age (the CPU time a process has used thus far) and "lifetime" to mean CPU lifetime (the total CPU time from start to completion). We assume that processors implement time-sharing with round-robin scheduling; in Section 7 we discuss the effect of other local scheduling policies. Since processes may be delayed while on the run queue or while migrating, the slowdown imposed on a process is

$$\text{Slowdown of process } p = \frac{\text{wall-time}(p)}{\text{CPU-time}(p)}$$

where wall time is the total time a process spends running, waiting in queue, or migrating.

1.3 Outline

The effectiveness of load balancing, whether by remote execution or preemptive migration, depends strongly on the nature of the workload, including the distribution of process lifetimes and the arrival process. This paper presents empirical observations about the workload on a network of UNIX workstations, and uses a trace-driven simulation to evaluate the impact of this workload on proposed load balancing strategies.

Section 2 presents a study of the distribution of process lifetimes for a variety of workloads in an academic environment, including instructional machines, research machines, and machines used for system administration. We find that the distribution is predictable with goodness of fit greater than 99% and consistent across a variety of machines and workloads. As a rule of thumb, the probability that a process with CPU age of one second uses more than $T$ seconds of total CPU time is $1/T$ (see Figure 1).

Our measurements are consistent with those of Leland and Ott [Leland and Ott 1986], but this prior work has been incorporated in few subsequent analytic and simulator studies of load balancing. This omission is unfortunate, since the results of these are sensitive to the lifetime model (see Section 2.2).

Our observations of lifetime distributions have the following consequences for load balancing:

- They suggest that it is preferable to migrate older processes because these processes have a higher probability of living long enough (eventually using enough CPU) to amortize their migration cost.
- A functional model of the distribution provides an analytic tool for deriving the eligibility of a process for migration as a function of its current age, migration cost, and the loads at its source and target host.

In Section 3 we derive a migration eligibility criterion that guarantees that the slowdown imposed on a migrant process is lower in expectation than it would be without migration. According to this criterion, a process is eligible for migration only if its

$$\text{CPU age} > \frac{1}{n - m} \cdot \text{migration cost}$$

where $n$ (respectively $m$) is the number of processes at the source (target) host.

In Section 5 we use a trace-driven simulation to compare our preemptive migration policy with a non-preemptive policy based on name-lists. The simulator uses start times and durations from traces of a real system, and migration costs chosen from a measured distribution.

We use the simulator to run three experiments: first we evaluate the effect of migration cost on the relative performance of the two strategies. Not surprisingly, we find that as the cost of preemptive migration increases, it becomes less effective. Nevertheless, preemptive migration performs better than non-preemptive migration even with surprisingly large migration costs, despite several conservative assumptions that give non-preemptive migration an unfair advantage.

Next we choose a specific model of preemptive and non-preemptive migration costs based on real systems (see Section 4), and use this model to compare the two migration strategies in more detail. We find that preemptive migration reduces the mean delay (queueing and migration) by 35-50%, compared to non-preemptive migration. We also propose several alternative metrics intended to measure users' perception of system performance. By these metrics, the additional benefits of preemptive migration compared to non-preemptive migration appear even more significant.

In Section 5.4 we discuss in detail why a simple preemptive migration policy is more effective than even a well-tuned non-preemptive migration policy. In Section 5.5 we use the simulator to compare our preemptive migration strategy with previously proposed preemptive strategies.

We finish with a criticism of our model in Section 6, a discussion of future work in Section 7 and conclusions in Section 8.

1.4 Related work

1.4.1 Systems. Although several systems have the mechanism to migrate active jobs, few have implemented implicit load balancing policies. Most systems only allow for explicit load balancing. That is, there is no load balancing policy; the user decides which processes to migrate, and when. Examples include Accent [Zayas 1987], Locus [Thiel 1991], Utopia [Zhou et al. 1993], DEMOS/MP [Powell and Miller 1983], V [Theimer et al. 1985], NEST [Agrawal and Ezzet 1987], RHODOS [De Paoli and Goscinski 1995], and MIST [Casas et al. 1995].

A few systems have implicit load balancing policies; however, they are strictly non-preemptive policies (active processes are only migrated for purposes other than load balancing, such as preserving workstation autonomy). Examples include Amoeba [Tanenbaum et al. 1990], Charlotte [Artsy and Finkel 1989], Sprite [Douglis and Ousterhout 1991], Condor [Litzkow et al. 1988], and Mach [Milojicic 1993]. In general, non-preemptive load balancing policies depend on a priori information about processes; e.g., explicit knowledge about the runtimes of processes or user-provided lists of migratable processes [Agrawal and Ezzet 1987] [Litzkow and Livny 1990] [Douglis and Ousterhout 1991] [Zhou et al. 1993].

One existing system that has implemented automated preemptive load balancing is MOSIX [Barak et al. 1993]. Our results support the MOSIX claim that their scheme is effective and robust.

1.4.2 Studies. Although few systems incorporate migration policies, there have been many simulation and analytical studies of various migration policies. Most of these studies have focused on load balancing by remote execution [Livny and Melman 1982] [Wang and Morris 1985] [Casavant and Kuhl 1987] [Zhou 1987] [Pulidas et al. 1988] [Kunz 1991] [Bonomi and Kumar 1990] [Evans and Butt 1993] [Lin and Raghavendra 1993] [Mirchandaney et al. 1990] [Zhang et al. 1995] [Zhou and Ferrari 1987] [Hac and Jin 1990] [Eager et al. 1986].

Only a few studies address preemptive migration policies [Leland and Ott 1986] [Krueger and Livny 1988]. The Leland and Ott migration policy is also age based, but doesn't take migration cost into account.

Eager et al. [Eager et al. 1988] conclude that the additional performance benefit of preemptive migration is too small compared with the benefit of non-preemptive migration to make preemptive migration worthwhile. This result has been widely cited, and in several cases used to justify the decision not to implement preemptive migration, as in the Utopia system [Zhou et al. 1993]. Our work differs from [Eager et al. 1988] in both system model and workload description. [Eager et al. 1988] model a server farm in which incoming jobs have no affinity for a particular processor, and thus the cost of initial placement (remote execution) is free. This is different from our model, a network of workstations, in which incoming jobs arrive at a particular host and the cost of moving them away, even by remote execution, is significant compared to most process lifetimes. Also, [Eager et al. 1988] use a degenerate hyperexponential distribution of lifetimes that includes few jobs with non-zero lifetimes. When the coefficient of variation of this distribution matches the distributions we observed, fewer than 4% of the simulated processes have non-zero lifetimes. With so few jobs (and balanced initial placement) there is seldom any load imbalance in the system, and thus little benefit for preemptive migration. Furthermore, the [Eager et al. 1988] process lifetime distribution is exponential for jobs with non-zero lifetimes, the consequences of which we discuss in Section 2.2. For a more detailed explanation of this distribution and its effect on the study, see [Downey and Harchol-Balter 1995].

Krueger and Livny investigate the benefits of supplementing non-preemptive migration with preemptive migration and find that preemptive migration is worthwhile. They use a hyperexponential lifetime distribution that approximates closely the distribution we observed; as a result, their findings are largely in accord with ours. One difference between their work and ours is that they used a synthetic workload with Poisson arrivals. The workload we observed, and used in our trace-driven simulations, exhibits serial correlation; i.e. it is more bursty than a Poisson process. Another difference is that their migration policy requires several hand-tuned parameters. In Section 3.1 we show how to use the distribution of lifetimes to eliminate these parameters.

Like us, Bryant and Finkel discuss the distribution of process lifetimes and its effect on preemptive migration policy, but their hypothetical distributions are not based on system measurements [Bryant and Finkel 1981]. Also like us, they choose migrant processes on the basis of expected slowdown on the source and target hosts, but their estimation of those slowdowns is very different from ours. In particular, they use the distribution of process lifetimes to predict a host's future load as a function of its current load and the ages of the processes running there. We have examined this issue and found (1) that this model fails to predict future loads because it ignores future arrivals, and (2) that current load is the best predictor of future load (see Section 3.1). Thus, in our estimates of slowdown, we assume that the future load on a host is equal to the current load.

2. DISTRIBUTION OF LIFETIMES

The general shape of the distribution of process lifetimes in an academic environment has been known for a long time [Rosin 1965]: there are many short jobs and a few long jobs, and the variance of the distribution is greater than that of an exponential distribution.

In 1986, Cabrera measured UNIX processes and found that over 40% doubled their current age [Cabrera 1986]. That same year, Leland and Ott proposed a functional form for the process lifetime distribution, based on measurements of the lifetimes of 9.5 million UNIX processes between 1984 and 1985 [Leland and Ott 1986]. They conclude that process lifetimes have a UBNE (used-better-than-new-in-expectation) type of distribution. That is, the greater the current CPU age of a process, the greater its expected remaining CPU lifetime. Specifically, they find that for $T > 3$ seconds, the probability of a process's lifetime exceeding $T$ seconds is $rT^k$, where $-1.25 < k < -1.05$ and $r$ normalizes the distribution.

In contrast, Rommel [Rommel 1991] claims that his measurements show that "long processes have exponential service times." Many subsequent studies assume an exponential lifetime distribution.

Because of the importance of the process lifetime distribution for load balancing policies, we performed an independent study of this distribution, and found that the functional form proposed by Leland and Ott fits the observed distributions well, for processes with lifetimes greater than 1 second. This functional form is consistent across a variety of machines and workloads, and although the parameter, $k$, varies from $-1.3$ to $-0.8$, it is generally near $-1$. Thus, as a rule of thumb,

- The probability that a process with age 1 second uses at least $T$ seconds of total CPU time is about $1/T$.
- The probability that a process with age $T$ seconds uses at least an additional $T$ seconds of CPU time is about $1/2$. Thus, the median remaining lifetime of a process is equal to its current age.

Section 2.1 describes our measurements and the distribution of lifetimes we observed. Section 2.2 discusses other models for the distribution of lifetimes, and argues that the particular shape of this distribution is critical for evaluating migration policies.

2.1 Lifetime distribution when lifetime > 1 s

To determine the distribution of lifetimes for UNIX processes, we measured the lifetimes of over one million processes, generated from a variety of academic workloads, including instructional machines, research machines, and machines used for system administration. We obtained our data using the UNIX command lastcomm, which outputs the CPU time used by each completed process.

Figure 1 shows the distribution of lifetimes from one of the machines. The plot shows only processes whose lifetimes exceed one second. The dotted (heavy) line shows the measured distribution; the solid (thinner) line shows the least-squares fit to the data using the proposed functional form $\Pr\{\text{Lifetime} > T\} = T^k$.

[Figure 1: two panels plotting the fraction of processes with duration > T against duration (T secs.), panel (a) on linear axes and panel (b) on log-log axes.]
Fig. 1. (a) Distribution of lifetimes for processes with lifetimes greater than 1 second, observed on machine po mid-semester. The dotted (thicker) line shows the measured distribution; the solid (thinner) line shows the least squares curve fit. (b) The same distribution on a log-log scale. The straight line in log-log space indicates that the distribution can be modeled by $T^k$, where $k$ is the slope of the line.

By visual inspection, it is clear that the proposed model fits the observed data well. In contrast, Figure 2 shows that it is impossible to find an exponential curve that fits the distribution of lifetimes we observed.

For all the machines we studied, the distribution of process lifetimes fits a curve of the form $T^k$, with $k$ varying from $-1.3$ to $-0.8$ for different machines. Table 1 shows the value of the parameter for each machine we studied, estimated by an iteratively weighted least-squares fit (with no intercept, in accordance with the functional model). We calculated these estimates with the BLSS command robust [Abrahams and Rizzardi 1988].

The standard error associated with each estimate gives a confidence interval for that parameter (all of these parameters are statistically significant at a high degree of certainty). The $R^2$ value indicates the goodness of fit of the model; the values shown here indicate that the fitted curve accounts for greater than 99% of the variation of the observed values. Thus, the goodness of fit of these models is high (for an explanation of $R^2$ values, see [Larsen and Marx 1986]).

Table 2 shows the cumulative distribution function, probability density function, and conditional distribution function for process lifetimes. The second column shows these functions when $k = -1$, which we will assume for our analysis in Section 3.

The functional form we are proposing (the fitted distribution) has the property that its moments (mean, variance, etc.) are infinite. Of course, since the observed distributions have finite sample size, they have finite mean (0.4 seconds) and coefficient of variation (5-7). One must be cautious when summarizing long-tailed distributions, though, because calculated moments tend to be dominated by a few outliers. In our analyses we use more robust summary statistics (order statistics like the median, or the estimated parameter $k$) to summarize distributions, rather than moments.
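As a rough illustration of the no-intercept fit described in Section 2.1, the sketch below estimates the exponent k by ordinary least squares through the origin in log-log space. It is a simplified stand-in, not the iteratively weighted fit used for the measurements; the function name `fit_pareto_exponent` and the idealized input points (exactly 1/T at powers of two) are assumptions made for the example and are not measured data.

```python
import math


def fit_pareto_exponent(durations, survival_fractions):
    """Estimate k in Pr{Lifetime > T} = T**k.

    Plain least squares with no intercept on (log T, log fraction);
    a simplification of the weighted fit described in the text.
    """
    xs = [math.log(t) for t in durations]
    ys = [math.log(f) for f in survival_fractions]
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0.0:
        raise ValueError("need at least one duration other than 1 second")
    # Slope of the best-fitting line through the origin: sum(x*y) / sum(x*x).
    return sum(x * y for x, y in zip(xs, ys)) / sum_xx


# Hypothetical, idealized points that follow the 1/T rule of thumb exactly.
durations = [1, 2, 4, 8, 16, 32, 64]
fractions = [1.0 / t for t in durations]

k = fit_pareto_exponent(durations, fractions)
print(f"estimated k = {k:.3f}")
```

Forcing the line through the origin encodes the constraint that every process in the sample has already survived 1 second, so the fitted curve must pass through the point (T = 1, fraction = 1).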
2.1.1 Process with lifetime < 1 second. For processes with lifetimes less than 1 second, we did not find a consistent functional form; however, for the machines we studied these processes had an even lower hazard rate than those of age > 1 second. That is, the probability that a process of age $T < 1$ seconds lives another $T$ seconds is always greater than $1/2$. Thus for jobs with lifetimes less than 1 second, the median remaining lifetime is greater than the current age.

Name of   Number     Number of        Estimated      Std
Host      of Procs   Procs > 1 sec    Distrib.       Error   R^2
po1        77440      4107            $T^{-0.97}$    .016    .997
po2       154368     11468            $T^{-1.22}$    .012    .999
po3       111997     14253            $T^{-1.27}$    .021    .997
cory      182523      7524            $T^{-0.88}$    .030    .982
pors      141950     10402            $T^{-0.94}$    .015    .997
bugs       83600      4940            $T^{-0.82}$    .007    .999
faith      76507      3328            $T^{-0.78}$    .045    .964

Table 1. The estimated lifetime distribution for each machine measured, and the associated goodness of fit statistics. Description of machines: Po is a heavily-used DECserver 5000/240, used primarily for undergraduate coursework. Po1, po2, and po3 refer to measurements made on po mid-semester, late-semester, and end-semester. Cory is a heavily-used machine, used for coursework and research. Porsche is a less frequently-used machine, used primarily for research on scientific computing. Bugs is a heavily-used machine, used primarily for multimedia research. Faith is an infrequently-used machine, used both for video applications and system administration.

[Figure 2: log-log plot of the fraction of processes with duration > T against duration (T secs.), showing the observed distribution and two curve fits: the $T^k$ fit (one parameter) and an exponential fit (two parameters).]
Fig. 2. In log-log space, this plot shows the distribution of lifetimes for the 13000 processes from our traces with lifetimes greater than 1 second, and two attempts to fit a curve to this data. One of the fits is based on the model proposed in this paper, $T^k$. The other fit is an exponential curve, $c\,e^{-\lambda T}$. Although the exponential curve is given the benefit of an extra free parameter, it fails to model the observed data. The proposed model fits well. Both fits were performed by iteratively-weighted least squares.

Distribution of lifetimes for processes > 1 sec                  When k = -1
$\Pr\{L > T \text{ sec}\} = T^k$                                 $1/T$
$\Pr\{T < L < T + dT \text{ sec}\} = -k\,T^{k-1}\,dT$            $(1/T^2)\,dT$
$\Pr\{L > b \text{ sec} \mid \text{age} = a\} = (b/a)^k$         $a/b$

Table 2. The cumulative distribution function, probability density function, and conditional distribution function for the process lifetime $L$. The second column shows the functional form of each for the typical value $k = -1$.

2.2 Why the distribution is critical

Many prior studies of process migration assume an exponential distribution of process lifetimes, both in analytical papers [Lin and Raghavendra 1993] [Mirchandaney et al. 1990] [Eager et al. 1986] [Ahmad et al. 1991] and in simulation studies [Kunz 1991] [Pulidas et al. 1988] [Wang and Morris 1985] [Evans and Butt 1993] [Livny and Melman 1982] [Zhang et al. 1995] [Chowdhury 1990]. The reasons for this assumption include: (1) analytic tractability, and (2) the belief that even if the actual lifetime distribution is in fact not exponential, assuming an exponential distribution will not affect the results of load balancing studies.

Regarding the first point, although the functional form that we and Leland and Ott propose cannot be used in queueing models as easily as an exponential distribution, it nevertheless lends itself to some forms of analysis, as we show in Section 3.1.

Regarding the second point, we argue that the particular shape of the lifetime distribution affects the performance of migration policies, and therefore that it is important to model this distribution accurately. Specifically, the choice of a migration policy depends on how the expected remaining lifetime of a job varies with age. In our observations we found a distribution with the UBNE property: the expected remaining lifetime of a job increases linearly with age. As a result, we chose a migration policy that migrates only old jobs.

But different distributions yield different relationships between the age of a process and its remaining lifetime. For example, a uniform distribution has the NBUE property: the expected remaining lifetime decreases linearly with age. Thus if the distribution of lifetimes were uniform, the migration policy should choose to migrate only young processes. In this case, we expect non-preemptive migration to perform better than preemptive migration.

As another example, the exponential distribution is memoryless: the remaining lifetime of a job is independent of its age. In this case, since all processes have the same expected remaining lifetime, the migration policy might choose to migrate the process with the lowest migration cost, regardless of age.

As a final example, processes whose lifetimes are chosen from a uniform log distribution (a uniform distribution in log-space) have a remaining lifetime that increases up to a point and then begins to decrease. In this case, the best migration policy might be to migrate jobs that are old enough, but not too old.

Thus different distributions, even with the same mean and variance, can lead to different migration policies. In order to evaluate a proposed policy, it is critical to choose a distribution model with the appropriate relationship between expected remaining lifetime and age.

Some studies have used hyperexponential distributions to model the distribution of lifetimes. These distributions may or may not have the right behavior, depending on how accurately they fit observed distributions. [Krueger and Livny 1988] use a three-stage hyperexponential with parameters estimated to fit observed values. This distribution has the appropriate UBNE property. But the two-stage hyperexponential distribution that [Eager et al. 1988] use is memoryless; the remaining lifetime of a job is independent of its age (for jobs with nonzero lifetimes). According to this distribution, migration policy is irrelevant; all processes are equally good candidates for migration. This result is clearly in conflict with our observations.
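To make the dependence on age concrete, the following sketch compares the median remaining lifetime under the fitted model and under an exponential model. It uses the conditional form $\Pr\{L > b \mid \text{age} = a\} = (b/a)^k$ from Table 2. The function names and the exponential rate of 0.5 per second are illustrative assumptions, not values taken from the measurements.

```python
import math


def pareto_median_remaining(age, k=-1.0):
    """Median remaining CPU lifetime for a process of the given age (age >= 1 s).

    Solves ((age + x) / age) ** k = 1/2 for x, using the conditional
    distribution Pr{L > b | age = a} = (b / a) ** k.
    """
    if k >= 0:
        raise ValueError("k must be negative for a decaying tail")
    return age * (2.0 ** (-1.0 / k) - 1.0)


def exponential_median_remaining(age, rate):
    """Median remaining lifetime under an exponential model.

    The age argument is deliberately unused: the distribution is memoryless.
    """
    return math.log(2.0) / rate


for age in (1.0, 4.0, 16.0):
    print(age, pareto_median_remaining(age), exponential_median_remaining(age, rate=0.5))
```

Under the $k = -1$ model the median remaining lifetime equals the current age, so age is a useful signal for choosing migrants; under the exponential model the value is the same at every age, so age carries no information.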

Assuming the wrong lifetime distribution may also underestimate the benefits of preemptive migration. The heavy tail of our measured lifetime distribution implies that a tiny fraction of the jobs require more CPU than all the other jobs combined. As we'll discuss in Section 5.4, part of the power of preemptive migration is its ability to identify those few hogs. In a lifetime distribution without such a heavy tail, preemptive migration might not be as effective.

3. MIGRATION POLICY

A migration policy is based on two decisions: when to migrate processes and which processes to migrate. The first question concerns how often or at what times the system checks for eligible migrants. We address this issue briefly in Section 5.1.1. The focus of this paper is the second question, also known as the selection policy:

Given that the load at a host is too high, how do we choose which process to migrate?

Our heuristic is to migrate processes that are expected to have long remaining lifetimes. The motivation for this heuristic is twofold. From the process's perspective, migration time has a large impact on response time. A process would choose to migrate only if the migration overhead could be amortized over a longer lifetime. From the perspective of the source host, it takes a significant amount of work to package a process for migration. The host would only choose to migrate processes that are likely to be more expensive to run than to migrate.

Many existing migration policies only migrate newborn processes (no preemption), because these processes have no allocated memory and thus their migration cost is low. The idea of migrating newborn processes might also stem from the fallacy that process lifetimes have an exponential distribution, implying that all processes have equal expected remaining lifetimes regardless of their age, so one should migrate the cheapest processes. The problem with only migrating newborn processes is that, according to the process lifetime distribution, newborn processes are unlikely to live long enough to justify the cost of remote execution. In fact, our measurements show that over 70% of processes have lifetimes smaller than the smallest non-preemptive migration cost (see Table 3).
Thus a newborn migration policy is only justified if the system has prior knowledge about processes and can selectively migrate processes likely to be CPU hogs. We have found, though, that the ability of the system to predict process lifetimes by name is limited (Section 5.4).

Can we do better? The distribution of lifetimes implies that we expect an old process to run longer than a young process; thus, it is preferable to migrate old processes.

There are two potential problems with this approach. First, since the vast majority of processes are short, there might not be enough old processes to have a significant load balancing effect. In fact, although there are few long-lived processes, they account for a large part of the total CPU load. According to our measurements, typically fewer than 4% of processes live longer than 2 seconds, yet these processes make up more than 60% of the total CPU load. This is due to the long tail of the process lifetime distribution. Furthermore, we will see that the ability to migrate even a few large jobs can have a large effect on system performance, since a single long process on a busy host imposes slowdowns on many short processes.

A second problem with migrating old processes is that the migration cost for an active process is much greater than the cost of remote execution. If preemptive migration is done carelessly, this additional cost might overwhelm the benefit of migrating processes with longer expected lives.

3.1 Our Migration Policy

For this reason, we propose a strategy that guarantees that every migration improves the expected performance of the migrant process and the other processes at the source host. This strategy migrates a process only if it improves the expected slowdown of the process, where slowdown is defined as in Section 1.2. Of course, processes on the target host are slowed by an arriving migrant, but on a moderately-loaded system there are almost always idle hosts; thus the number of processes at the target host is usually zero. In any case, the number of processes at the target is always less than the number at the source.

If there is no process on the host that satisfies the above migration criterion, no migration is done. If migration costs are high, few processes will be eligible for migration; in the extreme there will be no migration at all. But in no case is the performance of the system worse (in expectation) than the performance without migration.

Using the distribution of process lifetimes, we calculate the expected slowdown imposed on a migrant process, and use this result to derive a minimum age for migration based on the cost of migration. Denoting the age of the migrant process by $a$; the cost of migration by $c$; the (eventual total) lifetime of the migrant by $L$; the number of processes at the source host by $n$; and the number of processes at the target host (including the migrant) by $m$, we have:

$$
\begin{aligned}
E\{\text{slowdown of migrant}\}
&= \int_{t=a}^{\infty} \Pr\{\text{lifetime of migrant is } t\} \cdot \{\text{slowdown given lifetime is } t\}\, dt \\
&= \int_{t=a}^{\infty} \Pr\{t \le L < t + dt \mid L \ge a\} \cdot \frac{na + c + m(t - a)}{t} \\
&= \int_{t=a}^{\infty} \frac{a}{t^2} \cdot \frac{na + c + m(t - a)}{t}\, dt \\
&= \frac{1}{2}\left(\frac{c}{a} + m + n\right)
\end{aligned}
$$

If there are $n$ processes at a heavily loaded host, then a process should be eligible for migration only if its expected slowdown after migration is less than $n$ (which is the slowdown it expects in the absence of migration). Thus, we require $\frac{1}{2}\left(\frac{c}{a} + m + n\right) < n$, which implies

$$\text{Minimum migration age} = \frac{\text{Migration cost}}{n - m}$$

We can extend this analysis to the case of heterogeneous processor speeds by

applyingascalefactortonorm. Thisanalysisassumesthatcurrentloadpredictsfutureload;thatis,thattheload ExploitingProcessLifetimeDistributionsforDynamicLoadBalancing 13 atthesourceandtargethostswillbeconstantduringthemigration.inanattempt calculatingapredictionofsurvivorsandfuturearrivalsbasedonthedistribution toevaluatethisassumption,andpossiblyimproveit,weconsideredanumberof oftime),(2)summingtheagesoftheprocessesrunningonthehost,anda(3) predictor,andthatusingseveralpredictivevariablesincombinationdidnotgreatly alternativeloadpredictors,including(1)takingaloadaverage(overaninterval modelproposedhere.wefoundthatcurrent(instantaneous)loadisthebestsingle improvetheaccuracyofprediction.theseresultsareinaccordwithzhou[zhou 3.2PriorPreemptivePolicies Onlyafewpreemptivestrategieshavebeenimplementedinrealsystemsorproposed 1987]andKunz[Kunz1991]. inpriorstudies.thethreethatwehavefoundare,likeours,basedontheprinciple thataprocessshouldbemigratedifitisoldenough. freeparameterwhosevalueischosenwithoutexplanation,andthatwouldneedto bere-tunedforadierentsystemoranotherworkload. UnderLelandandOtt'spolicy,aprocesspiseligibleformigrationif Inmanycases,thedenitionofoldenoughdependsona\voodoo"constant1:a andlivny'spolicy,likeours,takesthejob'smigrationcostintoaccount.aprocess piseligibleformigrationif wherekisafreeparametercalledmincrit[lelandandott1986].krueger age(p)>agesofkyoungerjobsathost buttheydonotexplainhowtheychosethevalue0:1[kruegerandlivny1988]. TheMOSIXpolicyissimilar[Baraketal.1993];aprocessiseligibleformigration if age(p)>0:1migrationcost(p) ofamigrantprocessisnevermorethan2,sinceintheworstcasethemigrant completesimmediatelyuponarrivalatthetarget. 
Thechoiceoftheconstant(1:0)intheMOSIXpolicyensuresthattheslowdown age(p)>1:0migrationcost(p): theslowdownthatwouldbeimposedatthesourcehostintheabsenceofmigration(presumablythereismorethanoneprocessthere,orthesystemwouldnot beattemptingtomigrateprocessesaway).second,itisbasedontheworst-case Despitethisjustication,thechoiceofthemaximumslowdown(2)isarbitrary. WeexpecttheMOSIXpolicytobetoorestrictive,fortworeasons.First,itignores 1ThistermwascoinedbyProfessorJohnOusterhoutatU.C.Berkeley. onload. bestchoiceforthisparameter,forourworkload,isusuallynear0.4,butitdepends slowdownratherthantheexpectedslowdown.insection5.5,weshowthatthe
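The eligibility criteria of Sections 3.1 and 3.2 can be compared side by side in a few lines. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the example numbers (a 2.3 second migration cost, three processes at the source, an idle target) are illustrative assumptions.

```python
import math


def expected_slowdown(age, cost, n, m):
    """Expected slowdown of a migrant of the given age: (c/a + m + n) / 2.

    n is the number of processes at the source host and m the number at the
    target host, counting the migrant (notation of Section 3.1).
    """
    return 0.5 * (cost / age + m + n)


def min_age_analytic(cost, n, m):
    """Analytic criterion: minimum migration age = cost / (n - m).

    Returns infinity when n <= m; treating that case as "never migrate" is
    an assumption of this sketch.
    """
    if n <= m:
        return math.inf
    return cost / (n - m)


def min_age_fixed(cost, alpha):
    """Fixed-parameter criterion: minimum migration age = alpha * cost.

    alpha = 0.1 models Krueger and Livny's policy, alpha = 1.0 models MOSIX.
    """
    return alpha * cost


def is_eligible(age, min_age):
    """A process is eligible once its age exceeds the minimum migration age."""
    return age > min_age


if __name__ == "__main__":
    cost, n, m = 2.3, 3, 1  # hypothetical values chosen for illustration
    thresholds = {
        "analytic": min_age_analytic(cost, n, m),
        "Krueger-Livny (0.1)": min_age_fixed(cost, 0.1),
        "MOSIX (1.0)": min_age_fixed(cost, 1.0),
    }
    for name, threshold in thresholds.items():
        print(f"{name}: minimum age {threshold:.2f} s")
```

At exactly the analytic minimum age, `expected_slowdown` returns n, the slowdown the process would see by staying put, which is the break-even point the derivation describes.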

4. MODEL OF MIGRATION COSTS

Migration cost has a large effect on the performance of preemptive load balancing; this section presents the model of migration costs we use in our simulation studies.

We model the cost of migrating an active process as the sum of a fixed migration cost for migrating the process's system state and a memory transfer cost that is proportional to the amount of the process's memory that must be transferred.

We model remote execution cost as a fixed cost; it is the same for all processes. The cost of remote execution includes sending the command and arguments to the remote host, logging in or otherwise authenticating the process, and creating a new shell and environment on the remote host.

Throughout this paper, we use the following notation:

    r: the cost of remote execution, in seconds
    f: the fixed cost of preemptive migration, in seconds
    b: the memory transfer bandwidth, in MB per second
    m: the memory size of migrant processes, in MB

and thus:

    cost of remote execution = r
    cost of preemptive migration = f + m/b

where the quotient m/b is the memory transfer cost.

4.1 Memory transfer costs

The amount of a process's memory that must be transferred during preemptive migration depends on properties of the distributed system. Douglis and Ousterhout [Douglis and Ousterhout 1991] have an excellent discussion of this issue, and we borrow from them here.

At the most, it might be necessary to transfer a process's entire memory. On a system like Sprite, which integrates virtual memory with a distributed file system, it is only necessary to write dirty pages to the file system before migration. When the process is restarted at the target host, it will retrieve these pages. In this case the cost of migration is proportional to the size of the resident set rather than the size of memory.

In systems that use precopying, such as V [Theimer et al. 1985], pages are transferred while the program continues to run at the source host. When the job stops execution at the source, it will have to transfer again any pages that have become dirty during the precopy. Although the number of pages transferred might be increased, the delay imposed on the migrant process is greatly decreased. Additional techniques can reduce the cost of transferring memory even more [Zayas 1987].

4.2 Migration costs in real systems

The specific parameters of migration cost depend not only on the nature of the system (as discussed above) but also on the speed of the network. Tables 3 and 4 show reported costs from a variety of real systems. Later we will use a trace-driven simulator to evaluate the effect of these parameters on system performance. We will make the pessimistic simplification that a migrant's entire memory must be transferred, although, as pointed out above, this is not necessarily the case.

    System                                      Hardware                              Cost of rexec, r
    Sprite [Douglis and Ousterhout 1991]        SPARCstation 1, 10 Mb/sec Ethernet    0.33 sec
    GLUNIX [Vahdat et al. 1994] [Vahdat 1995]   HP workstations, ATM network          0.25 to 0.5 sec
    MIST [Prouty 1996]                          HP 9000/720, 10 Mb/sec Ethernet       0.33 sec
    Utopia [Zhou et al. 1993]                   DEC 3100 and SPARC IPC                0.1 sec

Table 3. Cost of non-preemptive migration in various systems. Some of these numbers were obtained from personal communication with the authors.

    System                                      Hardware                               Fixed cost, f   Inverse bandwidth, 1/b
    Sprite [Douglis and Ousterhout 1991]        SPARCstation 1, 10 Mb/sec Ethernet     0.33 sec        2.00 sec/MB
    MOSIX [Barak et al. 1993] [Braverman 1995]  Intel Pentium 90 MHz and 486 66 MHz    0.006 sec       0.44 sec/MB
    MIST [Prouty 1996]                          HP 9000/720s, 10 Mb/sec Ethernet       0.24 sec        0.99 sec/MB

Table 4. Values for preemptive migration costs from various systems. Many of these numbers were obtained from personal communication with the authors. The memory transfer cost is the product of the inverse bandwidth, 1/b, and the amount of memory that must be transferred, m.

5. TRACE-DRIVEN SIMULATION

In this section we present the results of a trace-driven simulation of process migration. We compare two migration strategies: our proposed age-based preemptive migration strategy (Section 3.1) and a non-preemptive strategy that migrates newborn processes according to the process name (similar to strategies proposed by [Wang et al. 1993] and [Svensson 1990]). With the intention of finding a conservative estimate of the benefit of preemptive migration, we give the name-based strategy the benefit of several unrealistic advantages; for example, the name-lists are derived from the same trace data used by the simulator.

Section 5.1 describes the simulator and the two strategies in more detail. We use the simulator to run three experiments. First, in Section 5.2, we evaluate the sensitivity of each strategy to the migration costs r, f, b, and m discussed in Section 4. Next, in Section 5.3, we choose values for these parameters that are representative of current systems and compare the performance of the two strategies in detail. In Section 5.4 we discuss why the preemptive policy outperforms the non-preemptive policy. Lastly, in Section 5.5, we evaluate the analytic criterion for migration age proposed in Section 3.1, compared to criteria used in previous studies.
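The cost model of Section 4 (preemptive migration cost = f + m/b) is easy to evaluate for the systems listed in Table 4. The sketch below is illustrative only: the function and dictionary names are hypothetical, the 1 MB process size is an assumed example, and the model charges for the migrant's entire memory, as in the pessimistic simplification above.

```python
def preemptive_migration_cost(fixed_cost, inverse_bandwidth, memory_mb):
    """Cost in seconds of migrating an active process: f + m * (1/b).

    fixed_cost:        f, seconds
    inverse_bandwidth: 1/b, seconds per MB
    memory_mb:         m, MB of memory that must be transferred
    """
    return fixed_cost + memory_mb * inverse_bandwidth


# (f, 1/b) pairs as reported in Table 4.
TABLE_4 = {
    "Sprite": (0.33, 2.00),
    "MOSIX": (0.006, 0.44),
    "MIST": (0.24, 0.99),
}

if __name__ == "__main__":
    memory_mb = 1.0  # assumed process size, for illustration only
    for system, (f, inv_b) in TABLE_4.items():
        cost = preemptive_migration_cost(f, inv_b, memory_mb)
        print(f"{system}: {cost:.3f} s to migrate a {memory_mb:g} MB process")
```

Because the memory term scales with m, the fixed cost dominates only for small processes; for a process of several megabytes the transfer term accounts for nearly all of the cost in each of these systems.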

5.1 The simulator

We have implemented a trace-driven simulation of a network of six identical workstations.² We selected six daytime intervals from the traces on machine po (see Section 2.1), each from 9:00 a.m. to 5:00 p.m. From the six traces we extracted the start times and CPU durations of the processes. We then simulated a network where each of six hosts executes (concurrently with the others) the process arrivals from one of the daytime traces.

Although the workloads on the six hosts are homogeneous in terms of the job mix and distribution of lifetimes, there is considerable variation in the level of activity during the eight-hour trace. For most of the traces, every arriving process finds at least one idle host in the system, but in the two busiest traces, a small fraction of processes (0.1%) arrive to find all hosts busy. In order to evaluate the effect of changes in system load, we divided the eight-hour trace into eight one-hour intervals. We refer to these as runs 0 through 7, where the runs are sorted from lowest to highest load. Run 0 has a total of 15000 processes submitted to the six simulated hosts; run 7 has 30000 processes. The average duration of processes (for all runs) is 0.4 seconds. Thus the total utilization of the system, ρ, is between 0.27 and 0.54.

The birth process of jobs at our hosts is burstier than a Poisson process. For a given run and a given host, the serial correlation in interarrival times is typically between .08 and .24, which is significantly higher than one would expect from a Poisson process (uncorrelated interarrival times yield a serial correlation of 0.0; perfect correlation is 1.0).

Although the start times and durations of the processes come from trace data, the memory size of each process, which determines its migration cost, is chosen randomly from a measured distribution (see Section 5.2). This simplification obliterates any correlations between memory size and other process characteristics, but it allows us to control the mean memory size as a parameter and examine its effect on system performance.

In our system model, we assume that processes are always ready to run; i.e. they are never blocked on I/O. During a given time interval, we divide CPU time equally among the processes on the host (processor sharing).

In real systems, part of the migration time is spent on the source host packaging the transferred pages, part in transit in the network, and part on the target host unpacking the data. The size of these parts and whether they can be overlapped depend on details of the system. In our simulation we charge the entire cost of migration to the source host. This simplification is conservative in the sense that it makes preemptive migration less effective.

² The trace-driven simulator and the trace data are available at http://http.cs.berkeley.edu/harchol/loadbalancing.html.

5.1.1 Strategies. We compare the preemptive migration strategy proposed in Section 3.1 with a non-preemptive migration strategy, where the non-preemptive strategy is given unfair advantages. For purposes of comparison, we have tried to make the policies as simple and as similar as possible. For both types of migration, we consider performing a migration only when a new process is born, even though a preemptive strategy might benefit by initiating migrations at other times. Also, for both strategies, a host is considered heavily-loaded any time it contains more than one process; in other words, any time it would be sensible to consider migration. Finally, we use the same location policy in both cases: the host with the lowest instantaneous load is chosen as the target host (ties are broken by random selection).

Thus the only difference between the two migration policies is which processes are considered eligible for migration:

Name-based non-preemptive migration. A process is eligible for migration only if its name is on a list of processes that tend to be long-lived. If an eligible process is born at a heavily-loaded host, the process is executed remotely on the selected host. Processes cannot be migrated once they have begun execution.

The performance of this strategy depends on the list of eligible process names. We derived this list by sorting the processes from the traces according to name and duration and selecting the 15 common names with the longest mean durations. We chose a threshold on mean duration that is empirically optimal (for this set of runs). Adding more names to the list detracts from the performance of the system, as it allows more short-lived processes to be migrated. Removing names from the list detracts from performance as it becomes impossible to migrate enough processes to balance the load effectively. Since we used the trace data itself to construct the list, our results may overestimate the performance benefits of this strategy.

Age-based preemptive migration. A process is eligible for migration only if it has aged for some fraction of its migration cost. Based on the derivation in Section 3.1, this fraction is 1/(n − m), where n (respectively m) is the number of processes at the source (target) host. When a new process is born at a heavily-loaded host, all processes that satisfy the migration criterion are migrated away.

This strategy understates the performance benefits of preemptive migration, because it does not allow the system to initiate migrations except when a new process arrives.

As described in Section 3.1, we also modeled other location policies based on more complicated predictors of future loads, but none of these predictors yielded significantly better performance than the instantaneous load we use here.

We also considered the effect of allowing preemptive migration at times other than when a new process arrives. Ideally, one would like to initiate a migration whenever a process becomes eligible (since the eligibility criterion guarantees that the performance of the migrant will improve in expectation). One of the strategies we considered performs periodic checks of each process on a heavily-loaded host to see if any satisfy the criterion. The performance of this strategy is significantly better than that of the simpler policy (migrating only at process arrival times).

5.1.2 Metrics. We evaluate the effectiveness of each strategy according to the following performance metrics:

Mean slowdown. Slowdown is the ratio of wall-clock execution time to CPU time (thus, it is always at least one). The average slowdown of all jobs is a common metric of system performance. When we compute the ratio of mean slowdowns (as from different strategies) we will use normalized slowdown, which is the ratio of inactive time (the excess slowdown caused by queueing and migration delays) to CPU time. For example, if the (unnormalized) mean slowdown drops from 2.0 to 1.5, the ratio of normalized mean slowdowns is 0.5/1.0 = 0.5: a 50% reduction in delay.

Mean slowdown alone is not a sufficient measure of the difference in performance of the two strategies; it understates the advantages of the preemptive strategy for these two reasons:

Skewed distribution of slowdowns: Even in the absence of migration, the majority of processes suffer small slowdowns (typically 80% are less than 3.0; see Figure 3). The value of the mean slowdown is dominated by this majority.

User perception: From the user's point of view, the important processes are the ones in the tail of the distribution, because although they are the minority, they cause the most noticeable and annoying delays. Eliminating these delays might have a small effect on the mean slowdown, but a large effect on a user's perception of performance.

[Figure 3: histogram titled "Distribution of slowdowns"; x-axis: Slowdown (1 to 10); y-axis: Fraction of procs (0 to 0.5).]

Fig. 3. Distribution of process slowdowns for run 0 (with no migration). Most processes suffer small slowdowns, but the processes in the tail of the distribution are more noticeable and annoying to users.

Therefore, we will also consider the following three metrics:

Variance of slowdown. This metric is often cited as a measure of the unpredictability of response time [Silberschatz et al. 1994], which is a nuisance for users trying to schedule tasks. In light of the distribution of slowdowns, however, it may be more meaningful to interpret this metric as a measure of the length of the tail of the distribution; i.e. the number of jobs that experience long delays. (See Figure 5b).

Number of severely slowed processes. In order to quantify the number of noticeable delays explicitly, we consider the number (or percentage) of processes that are severely impacted by queueing and migration penalties. (See Figures 5c and 5d).

Mean slowdown of long jobs. Delays in longer jobs (those with lifetimes greater than .5 seconds) are more perceivable to users than delays in short jobs. (See Figure 6).

5.2 Sensitivity to migration costs

In this section we compare the performance of the non-preemptive and preemptive strategies over a range of values of r, f, b and m (the migration cost parameters defined in Section 4).

For the following experiments, we chose the remote execution cost r = .3 seconds. We considered a range for the fixed migration cost of .1 < f < 10 seconds.

The memory transfer cost is the quotient of m (the memory size of the migrant process) and b (the bandwidth of the network). We chose the memory transfer cost from a distribution with the same shape as the distribution of process lifetimes, setting the mean memory transfer cost (MMTC) to a range of values from 1 to 64 seconds. The shape of this distribution is based on an informal study of memory-use patterns on the same machines from which we collected trace data. The important feature of this distribution is that there are many jobs with small memory demands and a few jobs with very large memory demands. Empirically, the exact form of this distribution does not affect the performance of either migration strategy strongly, but of course the mean (MMTC) does have a strong effect.

Figures 4a and 4b are contour plots of the ratio of the performance of the two migration strategies using normalized slowdown. Specifically, for each of the eight one-hour runs we calculate the mean (respectively standard deviation) of the slowdown imposed on all processes that complete during the hour. For each run, we then take the ratio of the means (standard deviations) of the two strategies. Lastly we take the geometric mean of the eight ratios (for discussion of the geometric mean, see [Hennessy and Patterson 1990]).

The two axes in Figure 4 represent the two components of the cost of preemptive migration, namely the fixed cost, f, and the MMTC, m/b. The cost of non-preemptive migration, r, is fixed at 0.3 seconds. As expected, increasing either the fixed cost of migration or the MMTC hurts the performance of preemptive migration. The contour line marked 1.0 indicates the crossover where the performance of preemptive and non-preemptive migration is equal (the ratio is 1.0). For smaller values of the cost parameters, preemptive migration performs better; for example, if the fixed migration cost is 0.3 seconds and the MMTC is 2 seconds, the normalized mean slowdown with preemptive migration is almost 40% lower than with non-preemptive migration. When the fixed cost of migration or the MMTC are very high, almost all processes are ineligible for preemptive migration; thus, the preemptive strategy does almost no migrations. The non-preemptive strategy is unaffected by these costs so the non-preemptive strategy can be more effective.

Figure 4b shows the effect of migration costs on the standard deviation of slowdowns. The crossover point, where non-preemptive migration surpasses preemptive migration, is considerably higher here than in Figure 4a. Thus there is a region where preemptive migration yields a higher mean slowdown than non-preemptive migration, but a lower standard deviation. The reason for this is that non-preemptive migration occasionally chooses a process for remote execution that turns out to be short-lived. These processes suffer large delays (relative to their run times) and add to the tail of the distribution of slowdowns. In the next section, we show cases in which the standard deviation of slowdowns is actually worse with non-preemptive migration than with no migration at all (three of the eight runs).

[Figure 4: two contour plots, (a) "Ratio of mean slowdowns" and (b) "Ratio of std of slowdowns". x-axis: Mean memory transfer cost (sec.), 1 to 64; y-axis: Fixed migration cost (sec.), 0.1 to 10. In (a) the 1.0 contour separates the region where preemptive migration is better from the region where non-preemptive migration is better; in (b) preemptive migration is better except in the upper right. An X marks the parameter values used in Section 5.3.]

Fig. 4. (a) The performance of preemptive migration relative to non-preemptive migration deteriorates as the cost of preemptive migration increases. The two axes are the two components of the preemptive migration cost. The cost of non-preemptive migration is held fixed. The X marks the particular set of parameters we will consider in the next section. (b) The standard deviation of slowdown may give a better indication of a user's perception of system performance than mean slowdown. By this metric, the benefit of preemptive migration is even more significant.

5.3 Comparison of preemptive and non-preemptive strategies

In this section we choose migration cost parameters representative of current systems (see Section 4.2) and use them to examine more closely the performance of the two migration strategies. The values we chose are:

    r: the cost of remote execution, 0.3 seconds
    f: the fixed cost of preemptive migration, 0.3 seconds
    m: the mean memory size of migrant processes, 1 MB
    b: the memory transfer bandwidth, 0.5 MB per second

In Figures 4a and 4b, the point corresponding to these parameter values is marked with an X. Figure 5 shows the performance of the two migration strategies at this point (compared to the case of no migration).

Non-preemptive migration reduces the normalized mean slowdown (Figure 5a) by less than 20% for most runs (and 40% for the two runs with the highest loads). Preemptive migration reduces the normalized mean slowdown by 50% for most runs (and more than 60% for two of the runs). The performance improvement of preemptive migration over non-preemptive migration is typically between 35% and 50%.

As discussed above, we feel that the mean slowdown (normalized or not) understates the performance benefits of preemptive migration. We have proposed other metrics to try to quantify these benefits. Figure 5b shows the standard deviation of slowdowns, which reflects the number of severely impacted processes. Figures 5c and 5d explicitly measure the number of severely impacted processes, according to two different thresholds of acceptable slowdown. By these metrics, the benefits of migration in general appear greater, and the discrepancy between preemptive and non-preemptive migration appears much greater. For example in Figure 5d, in the absence of migration, 7-18% of processes are slowed by a factor of 5 or more. Non-preemptive migration is able to eliminate 42-62% of these, which is a significant benefit, but preemptive migration consistently eliminates nearly all (86-97%) severe delays.

An important observation from Figure 5b is that for several runs, non-preemptive migration actually makes the performance of the system worse than if there were no migration at all. For the preemptive migration strategy, this outcome is nearly impossible, since migrations are only performed if they improve the slowdowns of all processes involved (in expectation). In the worst case, then, the preemptive strategy will do no worse than the case of no migration (in expectation).

Another benefit of preemptive migration is graceful degradation of system performance as load increases (as shown in Figure 5). In the presence of preemptive migration, both the mean and standard deviation of slowdown are nearly constant, regardless of the overall load on the system.
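The comparison procedure of Section 5.2 (per-run ratio of normalized mean slowdowns, then the geometric mean across runs) can be sketched as follows. This is an illustration under stated assumptions rather than the simulator's code: the function names are hypothetical, and each run is assumed to be given as a list of per-process slowdowns with a mean strictly greater than 1, so the normalized mean is nonzero.

```python
import math
from statistics import mean


def normalized_mean_slowdown(slowdowns):
    """Mean of (slowdown - 1): average inactive time per unit of CPU time."""
    return mean(slowdowns) - 1.0


def geometric_mean(values):
    """Geometric mean of positive values, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def performance_ratio(runs_a, runs_b):
    """Geometric mean over runs of normalized mean slowdown, A relative to B.

    runs_a and runs_b are equal-length lists of runs; each run is a list of
    per-process slowdowns under strategy A or B. A result below 1.0 means
    strategy A imposes less delay than strategy B.
    """
    ratios = [
        normalized_mean_slowdown(a) / normalized_mean_slowdown(b)
        for a, b in zip(runs_a, runs_b)
    ]
    return geometric_mean(ratios)


if __name__ == "__main__":
    # Hypothetical slowdowns for two runs under each strategy.
    preemptive = [[1.2, 1.5, 1.8], [1.4, 1.6]]
    non_preemptive = [[1.5, 2.0, 2.5], [2.0, 2.0]]
    print(f"ratio = {performance_ratio(preemptive, non_preemptive):.2f}")
```

The geometric mean is the natural average for ratios because it treats a ratio of 0.5 in one run and 2.0 in another as cancelling out, which an arithmetic mean would not. The same structure applies to the standard deviation ratio of Figure 4b by substituting a standard deviation for the normalized mean.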

[Figure 5: four panels plotted against run number (0 to 7), each comparing no migration, non-preemptive name-based migration, and preemptive age-based migration.]

Fig. 5. (a) Mean slowdown. (b) Standard deviation of slowdown. (c) Percentage of processes slowed by a factor of 3 or more. (d) Percentage of processes slowed by a factor of 5 or more.
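The four quantities plotted in Figure 5 can all be derived from a list of per-process slowdowns. The sketch below is a hedged illustration: the function name and dictionary keys are hypothetical, the sample data is invented, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def slowdown_metrics(wall_times, cpu_times, thresholds=(3.0, 5.0)):
    """Summarize per-process slowdowns (wall-clock time / CPU time).

    Returns the mean, the population standard deviation, and, for each
    threshold, the fraction of processes slowed by at least that factor.
    """
    slowdowns = [w / c for w, c in zip(wall_times, cpu_times)]
    metrics = {"mean": mean(slowdowns), "std": pstdev(slowdowns)}
    for t in thresholds:
        severe = sum(1 for s in slowdowns if s >= t)
        metrics[f"frac_slowed_{t:g}x"] = severe / len(slowdowns)
    return metrics


if __name__ == "__main__":
    # Invented sample: three lightly delayed jobs and one severely delayed job.
    wall = [0.4, 0.4, 0.8, 2.4]
    cpu = [0.4, 0.4, 0.4, 0.4]
    for name, value in slowdown_metrics(wall, cpu).items():
        print(f"{name}: {value:.3f}")
```

In this sample a single job with a slowdown of 6 raises the mean only modestly but accounts for all of the severely slowed fraction, which is the reason the tail metrics are reported alongside the mean.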

5.4 Why preemptive migration outperforms non-preemptive migration

The alternate metrics discussed above shed some light on the reasons for the performance difference between preemptive and non-preemptive migration. We consider two kinds of mistakes a migration system might make:

Failing to migrate long-lived jobs. This type of error imposes moderate slowdowns on a potential migrant, and, more importantly, inflicts delays on short jobs that are forced to share a processor with a CPU hog. Under non-preemptive migration, this error occurs whenever a long-lived process is not on the name-list, possibly because it is an unknown program or an unusually long execution of a typically short-lived program. Preemptive migration can correct these errors by migrating long jobs later in their lives.

Migrating short-lived jobs. This type of error imposes large slowdowns on the migrated process, wastes network resources, and fails to effect significant load balancing. Under non-preemptive migration, this error occurs when a process whose name is on the eligible list turns out to be short-lived. Our preemptive migration strategy all but eliminates this type of error by guaranteeing that the performance of a migrant improves in expectation.

Even occasional mistakes of the first kind can have a large impact on performance, because one long job on a busy machine will impede many small jobs. This effect is aggravated by the serial correlation between arrival times (see Section 5.1), which suggests that a busy host is likely to receive many future arrivals.

Thus, an important feature of a migration policy is its ability to identify long-lived jobs for migration. To evaluate this ability, we consider the average lifetime of the processes chosen for migration under each policy. Under non-preemptive migration, the average lifetime of migrant processes was 2.0 seconds (the mean lifetime for all processes is 0.4 seconds), and the median lifetime of migrants was 0.9 seconds. The non-preemptive policy migrated about 1% of all jobs, which accounted for 5.7% of the total CPU.

The preemptive migration policy was better able to identify long jobs; the average lifetime of migrant processes under preemptive migration was 4.9 seconds; the median lifetime of migrants was 2.0 seconds. The preemptive policy migrated 4% of all jobs, but since these migrants were long-lived, they accounted for 55% of the total CPU.

Thus the primary reason for the success of preemptive migration is its ability to identify long jobs accurately and to migrate those jobs away from busy hosts.

The second type of error did not have as great an impact on the mean slowdown for all processes, but it did impose large slowdowns on some small processes. These outliers are reflected in the standard deviation of slowdowns: because the non-preemptive policy sometimes migrates very short jobs, it can make the standard deviation of slowdowns worse than with no migration (see Figure 5b). The age-based preemptive migration criterion eliminates most errors of this type by guaranteeing that the performance of the migrant will improve in expectation.

There is, however, one type of migration error that is more problematic for preemptive migration than for non-preemptive migration: stale load information. A target host may have a low load when a migration is initiated, but its load may have increased by the time the migrant arrives. This is more likely for a preemptive migration because the migration time is longer. In our simulations, we found that these errors do occur, although infrequently enough that they do not have a severe impact on performance.

Specifically, we counted the number of migrant processes that arrived at a target host and found that the load was higher than it had been at the source host when migration began. For most runs, this occurred less than 0.5% of the time (for two runs with high loads it was 0.7%). Somewhat more often, 3% of the time, a migrant process arrived at a target host and found that the load at the target was greater than the current load at the source. These results suggest that the performance of a preemptive migration strategy might be improved by a reservation system as in MOSIX.

One other potential problem with preemptive migration is the volume of network traffic that results from large memory transfers. In our simulations, we did not model network congestion, on the assumption that the traffic generated by migration would not be excessive. This assumption seems to be reasonable: under our preemptive migration strategy fewer than 4% of processes are migrated once and fewer than .25% of processes are migrated more than once. Furthermore, there is seldom more than one migration in progress at a time.

In summary, the advantage of preemptive migration (its ability to identify long jobs and move them away from busy hosts) overcomes its disadvantages (longer migration times and stale load information).

5.4.1 Effect of migration on short and long jobs. We have claimed that identifying long jobs and migrating them away from busy hosts helps not only the long jobs (which run on more lightly-loaded hosts) but also the short jobs that run on the source host. To test this claim, we divided the processes into three lifetime groups and measured the performance benefit for each group due to migration. The number of jobs in the short group is roughly ten times the number in the medium group, which in turn is roughly ten times the number in the long group. Figure 6 shows that migration reduces the mean slowdown of all three groups: for non-preemptive migration the improvement is the same for all groups; under preemptive migration the long jobs enjoy a slightly greater benefit.

This breakdown by lifetime group is useful for evaluating various metrics of system performance. The metric we are using here, slowdown, gives equal weight to all jobs; as a result, the mean slowdown metric is dominated by the most populous group, short jobs. Another common metric, residence time, effectively weights jobs according to their lifetimes. Thus the mean residence time metric reflects, primarily, the performance benefit for long jobs. Under the mean residence time metric, then, preemptive migration appears even more effective.

[Figure 6: bar chart titled "Mean slowdown by age group"; x-axis: Duration (dur < .5s, .5s < dur < 5s, dur > 5s); y-axis: mean slowdown (1.0 to 3.0); bars for no migration, non-preemptive migration, and preemptive migration.]

Fig. 6. Mean slowdown broken down by lifetime group.

5.5 Evaluation of analytic migration criterion

As derived in Section 3.1, the minimum age for a migrant process according to our analytic criterion is

\[
\text{Minimum migration age} = \frac{\text{Migration cost}}{n - m}
\]

where n is the load at the source host and m is the load at the target host (including the potential migrant).

We compare the analytic criterion with the fixed parameter criterion:

\[
\text{Minimum migration age} = \alpha \cdot \text{Migration cost}
\]

where α is a free parameter. This parameter is meant to model preemptive migration strategies in the literature, as discussed in Section 3.2. For comparison, we will use the best fixed parameter, which is, for each run, the value that yields the smallest mean slowdown. Of course, this gives the fixed parameter criterion a considerable advantage.

Figure 7 compares the performance of the analytic minimum age criterion with the best fixed parameter. The best fixed parameter varies considerably from run to run, and appears to be roughly correlated with the average load during the run (the runs are sorted in increasing order of total load).

[Figure 7: mean slowdown (1.0 to 2.5) for runs 0 to 7 under the best fixed parametric minimum age and the analytic minimum age. Best fixed parameter for runs 0 to 7: 0.3, 0.5, 0.5, 0.3, 0.3, 0.5, 0.8, 0.9.]

Fig. 7. The mean slowdown for eight runs, using the two criteria for minimum migration age. The value of the best fixed parameter is shown in parentheses for each run.

The performance of the analytic criterion is always within a few percent of the performance of the best fixed value criterion. The advantage of the analytic criterion is that it is parameterless, and therefore more robust across a variety of workloads. We feel that the elimination of one free parameter is a useful result in an area with so many (usually hand-tuned) parameters.

This result also suggests that the parameter used by Krueger and Livny (α = 0.1) is too low, and the parameter used in MOSIX (α = 1.0) is too high, at least for this workload (see Section 3.2).

6. WEAKNESSES OF THE MODEL

Our simulation ignores a number of factors that would affect the performance of migration in real systems:

CPU-bound jobs only. Our model considers all jobs CPU-bound; thus, their response time necessarily improves if they run on a host with a lighter load. For I/O-bound jobs, however, CPU contention has little effect on response time. These jobs would benefit less from migration. To see how large a role this plays in our results, we noted the names of the processes that appear most frequently in our traces (with CPU time greater than 1 second, since these are the processes most likely to be migrated). The most common names were cc1plus and cc1, both of which are CPU bound. Next most frequent were: trn, cpp, ld, jove (a version of emacs), and ps. So although some jobs in our traces are in reality interactive, our simple model is reasonable for many of the most common jobs. In Section 7 we discuss further implications of a workload including interactive, I/O-bound, and non-migratable jobs.

Environment. Our migration strategy takes advantage of the used-better-than-new property of process lifetimes. In an environment with a different distribution, this strategy will not be effective.

Local scheduling. We assume that local scheduling on the hosts is similar to round-robin. Other policies, like feedback scheduling, can reduce the impact of long jobs on the performance of short jobs, and thereby reduce the need for load balancing. We explore this issue in more detail in Section 7 and find that preemptive migration is still beneficial under feedback scheduling.

Memory size. One weakness of our model is that we chose memory sizes from a measured distribution and therefore our model ignores any correlation between memory size and other process characteristics. To justify this simplification, we conducted an informal study of processes in our department, and found no correlation between memory size and process CPU usage. Krueger and Livny report a similar observation [Krueger and Livny 1988]. Thus, this may be a reasonable simplification.

Network contention. Our model does not consider the effect of increased network traffic as a result of process migration. We observe, however, that for the load levels we simulated, migrations are occasional (one every few seconds), and that there is seldom more than one migration in progress at a time.

7. FUTURE WORK

In our workload model we have assumed that all processes are CPU-bound. Of primary interest in future work is including interactive, I/O-bound, and non-migratable jobs into our workload.

In a workload that includes interactive jobs, I/O-bound jobs, and daemons, there will be some jobs that should not or cannot be migrated. An I/O-bound job, for example, will not necessarily run faster on a more lightly-loaded host, and might run slower if it is migrated away from the disk or other I/O device it uses. A migrated interactive job might benefit by running on a more lightly-loaded host if it uses significant CPU time, but will suffer performance penalties for all future interactions. Finally, some jobs (e.g. many daemons) cannot be migrated away from their hosts.

The policy we proposed for preemptive migration can be extended to deal appropriately with interactive and I/O-bound jobs by including in the definition of migration cost the additional costs that will be imposed on these jobs after migration, including network delays, access to non-local data, etc. The estimates of these costs might be based on the recent behavior of the job; e.g. the number and frequency of I/O requests and interactions. Jobs that are explicitly forbidden to migrate could be assigned an infinite migration cost.
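The extended notion of migration cost described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Job` fields, the `future_penalty` estimate, and the function names are hypothetical, and the eligibility test uses the simple fixed-parameter form (age at least α times the migration cost, with α > 0) rather than the analytic criterion.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; all field names are illustrative assumptions."""
    age: float                   # CPU seconds the job has consumed so far
    transfer_cost: float         # seconds needed to move its memory image
    future_penalty: float = 0.0  # estimated post-migration cost (network delays,
                                 # non-local data access), e.g. from recent I/O rates
    migratable: bool = True      # False for jobs that must stay on their host


def migration_cost(job: Job) -> float:
    """Total cost of migrating a job, in seconds.

    Jobs that are forbidden to migrate get an infinite cost, so no
    finite age can ever make them eligible.
    """
    if not job.migratable:
        return math.inf
    return job.transfer_cost + job.future_penalty


def is_eligible(job: Job, alpha: float) -> bool:
    """Fixed-parameter test: the job's age must be at least alpha * cost (alpha > 0)."""
    return job.age >= alpha * migration_cost(job)


compiler = Job(age=4.0, transfer_cost=1.5)                        # CPU-bound
editor = Job(age=4.0, transfer_cost=1.5, future_penalty=6.0)      # interactive
daemon = Job(age=1.0e6, transfer_cost=0.1, migratable=False)      # pinned to host
```

With `alpha = 1.0`, the CPU-bound job qualifies, the interactive job of the same age does not (its estimated future penalty raises the cost above its age), and the daemon never qualifies however old it becomes. Folding every obstacle into a single cost keeps the migration test itself unchanged.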
The presence of a set of jobs that are either expensive or impossible to migrate might reduce the ability of the migration policy to move work around the network and balance loads effectively. However, we observe that the majority of long-lived jobs are, in fact, CPU-bound, and it is these long-lived jobs that consume the majority of CPU time. Thus, even if the migration policy were only able to migrate a subset of the jobs in the system, it could still have a significant load-balancing effect.

Another way in which the proposed migration policy should be altered in a more general environment is that n (the number of jobs at the source) and m (the number of jobs at the target host) should distinguish between CPU-bound jobs and other types of jobs, since only CPU-bound jobs affect CPU contention, and therefore are significant in CPU load balancing.

Another important consideration in load balancing is the effect of local scheduling at the hosts. Most prior studies of load balancing have assumed, as we do, that the local scheduling is round-robin (or processor-sharing). A few assume first-come-first-serve (FCFS) scheduling, but fewer still have studied the effect of feedback scheduling, where processes that have used the least CPU time are given priority over older processes. We simulated feedback scheduling and found that it greatly reduced mean slowdown (from approximately 2.5 to between 1.2 and 1.7, depending on load) even without migration. Thus, the potential benefit of either type of migration is greatly reduced.

We evaluated the non-preemptive migration policy from Section 5.1.1 under feedback scheduling, and found that it often makes things worse, increasing the mean slowdown in 5 of the 8 runs, and only decreasing it by 11% in the best case (highest load).

To evaluate our preemptive policy, we had to change the migration criterion to reflect the effect of local scheduling. Under processor sharing, we assume that the slowdown imposed on a process is equal to the number of processes on the host. Under feedback scheduling, the slowdown is closer to the number of younger processes, since older processes have lower priority. Thus, we modified the migration criterion in Section 3.1 so that n and m are the number of processes at the source and target hosts that are younger than the migrant process. Using this criterion, preemptive migration reduces mean normalized slowdown by 12-32%, and reduces the number of severely slowed processes (slowdown greater than 5) by 30-60%.

An issue that remains unresolved is whether feedback scheduling is as effective in real systems as it was in our simulations. For example, decay-usage scheduling as used in UNIX has some characteristics of both round-robin and feedback policies [Epema 1995]. Young jobs do have some precedence, but old jobs that perform interaction or other I/O are given higher priority, which allows them to interfere with short jobs. In our experiments on a SPARC workstation running SunOS, we found that a long-running job that performs periodic I/O can obtain more than 50% of the CPU time, even if it is sharing a host with much younger processes. The more recent lottery scheduling behaves more like processor-sharing [Waldspurger and Weihl 1994]. Understanding the effect of local scheduling on load balancing requires a process model that includes interaction and I/O.
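The change to n and m under feedback scheduling, described earlier in this section, can be made concrete with a small Python sketch. It rests on an assumption that should be read as such: the Section 3.1 criterion is not restated here, so the sketch assumes it has the form "minimum migration age = migration cost / (n - m)". The function names and the example ages are hypothetical. Only competing processes are counted; under processor sharing, adding the migrant itself to both hosts would leave n - m unchanged.

```python
import math
from typing import Sequence


def competing(ages: Sequence[float], migrant_age: float, feedback: bool) -> int:
    """Count the processes on one host assumed to slow the migrant down.

    Processor sharing: every process on the host counts.
    Feedback scheduling: only processes younger than the migrant count,
    since older processes have lower priority.
    """
    if feedback:
        return sum(1 for age in ages if age < migrant_age)
    return len(ages)


def min_migration_age(cost: float, n: int, m: int) -> float:
    """ASSUMED form of the analytic criterion: cost / (n - m).

    When the target is no less loaded than the source (n <= m),
    migration never pays off, so the minimum age is infinite.
    """
    if n <= m:
        return math.inf
    return cost / (n - m)


def should_migrate(migrant_age: float, cost: float,
                   source_ages: Sequence[float],
                   target_ages: Sequence[float],
                   feedback: bool) -> bool:
    """Apply the assumed criterion with n and m chosen per scheduling policy."""
    n = competing(source_ages, migrant_age, feedback)
    m = competing(target_ages, migrant_age, feedback)
    return migrant_age >= min_migration_age(cost, n, m)


# Hypothetical example: ages in CPU seconds, migrant excluded from both lists.
source = [0.2, 0.5, 30.0, 45.0]   # other processes on the source host
target = [0.1, 0.3, 60.0]         # processes on the target host
```

For a 10-second-old migrant with a 2-second migration cost, processor sharing sees n = 4 and m = 3, so the job qualifies. Under feedback scheduling only the younger processes count (n = 2, m = 2), the hosts look equally loaded from the migrant's point of view, and the job stays put.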
8. CONCLUSIONS

To evaluate migration strategies, it is important to model the distribution of process lifetimes accurately. Assuming an exponential distribution can underestimate the benefits of preemptive migration, because it ignores the fact that old jobs are expected to be long-lived. Even a lifetime distribution that matches the measured distribution in both mean and variance may be misleading in designing and evaluating load balancing policies.

Preemptive migration outperforms non-preemptive migration even when memory-transfer costs are high, for the following reason: non-preemptive name-based strategies choose processes for migration that are expected to have long lives. If this prediction is wrong, and a process runs longer than expected, it cannot be migrated away, and many subsequent small processes will be delayed. A preemptive strategy is able to predict lifetimes more accurately (based on age) and, more importantly, if the prediction is wrong, the system can recover by migrating the process later.

Migrating a long job away from a busy host helps not only the long job, but also the many short jobs that are expected to arrive at the host in the future. A busy host is expected to receive many arrivals because of the serial correlation (burstiness) of the arrival process.

Using the functional form of the distribution of process lifetimes, we have derived a criterion for the minimum time a process must age before being migrated. This criterion is parameterless and robust across a range of loads.

Exclusive use of mean slowdown as a metric of system performance understates the benefits of load balancing as perceived by users, and especially understates the benefits of preemptive load balancing. One performance metric which is more related to user perception is the number of severely slowed processes. While non-preemptive migration eliminates half of these noticeable delays, preemptive migration reduces them by a factor of ten.

Although preemptive migration is difficult to implement, several systems have chosen to implement it for reasons other than load balancing. Our results suggest these systems would benefit from incorporating a preemptive load balancing policy.

ACKNOWLEDGMENTS

We would like to thank Tom Anderson and the members of the NOW group for comments and suggestions on our experimental setup. We are also greatly indebted to the anonymous reviewers from SIGMETRICS and SOSP, and to John Zahorjan, whose comments greatly improved the quality of this paper.

REFERENCES

Abrahams, D. M. and Rizzardi, F. 1988. The Berkeley Interactive Statistical System. W. W. Norton and Co.

Agrawal, R. and Ezzet, A. 1987. Location independent remote execution in NEST. IEEE Transactions on Software Engineering 13, 8 (August), 905-912.

Ahmad, I., Ghafoor, A., and Mehrotra, K. 1991. Performance prediction of distributed load balancing on multicomputer systems. In Supercomputing (1991), pp. 830-839.

Artsy, Y. and Finkel, R. 1989. Designing a process migration facility: The Charlotte experience. IEEE Computer, 47-56.

Barak, A., Shai, G., and Wheeler, R. G. 1993. The MOSIX Distributed Operating System: Load Balancing for UNIX. Springer Verlag, Berlin.

Bonomi, F. and Kumar, A. 1990. Adaptive optimal load balancing in a nonhomogeneous multiserver system with a central job scheduler. IEEE Transactions on Computers 39, 10 (October), 1232-1250.

Braverman, A. 1995. Personal Communication.

Bryant, R. M. and Finkel, R. A. 1981. A stable distributed scheduling algorithm. In 2nd International Conference on Distributed Computing Systems (1981), pp. 314-323.

Cabrera, F. 1986. The influence of workload on load balancing strategies. In Proceedings of the Usenix Summer Conference (June 1986), pp. 446-458.

Casas, J., Clark, D. L., Konuru, R., Otto, S. W., Prouty, R. M., and Walpole, J. 1995. MPVM: A migration transparent version of PVM. Computing Systems 8, 2 (Spring), 171-216.

Casavant, T. L. and Kuhl, J. G. 1987. Analysis of three dynamic distributed load balancing strategies with varying global information requirements. In 7th International Conference on Distributed Computing Systems (September 1987), pp. 185-192.

Chowdhury, S. 1990. The greedy load sharing algorithm. Journal of Parallel and Distributed Computing 9, 93-99.

De Paoli, D. and Goscinski, A. 1995. The RHODOS migration facility. Journal of Systems and Software. Submitted. See also http://www.cm.deakin.edu.au/rhodos/.

Douglis, F. and Ousterhout, J. 1991. Transparent process migration: Design alternatives and the Sprite implementation. Software - Practice and Experience 21, 8 (August), 757-785.

Downey, A. B. and Harchol-Balter, M. 1995. A note on "The limited performance benefits of migrating active processes for load sharing". Technical Report UCB/CSD-95-888 (November), University of California, Berkeley.

Eager, D. L., Lazowska, E. D., and Zahorjan, J. 1986. Adaptive load sharing in homogeneous distributed systems. IEEE Transactions on Software Engineering 12, 5 (May), 662-675.

Eager, D. L., Lazowska, E. D., and Zahorjan, J. 1988. The limited performance benefits of migrating active processes for load sharing. In ACM Sigmetrics Conference on Measuring and Modeling of Computer Systems (May 1988), pp. 662-675.

Epema, D. 1995. An analysis of decay-usage scheduling in multiprocessors. In ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems (1995), pp. 74-85.

Evans, D. J. and Butt, W. U. N. 1993. Dynamic load balancing using task-transfer probabilities. Parallel Computing 19, 897-916.

Hac, A. and Jin, X. 1990. Dynamic load balancing in a distributed system using a sender-initiated algorithm. Journal of Systems Software 11, 79-94.

Hennessy, J. L. and Patterson, D. A. 1990. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, San Mateo, CA.

Krueger, P. and Livny, M. 1988. A comparison of preemptive and non-preemptive load distributing. In 8th International Conference on Distributed Computing Systems (June 1988), pp. 123-130.

Kunz, T. 1991. The influence of different workload descriptions on a heuristic load balancing scheme. IEEE Transactions on Software Engineering 17, 7 (July), 725-730.

Larsen, R. J. and Marx, M. L. 1986. An Introduction to Mathematical Statistics and Its Applications (2nd ed.). Prentice Hall, Englewood Cliffs, N.J.
Leland, W. E. and Ott, T. J. 1986. Load-balancing heuristics and process behavior. In Proceedings of Performance '86 and ACM Sigmetrics, Volume 14 (1986), pp. 54-69.

Lin, H.-C. and Raghavendra, C. 1993. A state-aggregation method for analyzing dynamic load-balancing policies. In IEEE 13th International Conference on Distributed Computing Systems (May 1993), pp. 482-489.

Litzkow, M. and Livny, M. 1990. Experience with the Condor distributed batch system. In IEEE Workshop on Experimental Distributed Systems (1990), pp. 97-101.

Litzkow, M., Livny, M., and Mutka, M. 1988. Condor - a hunter of idle workstations. In 8th International Conference on Distributed Computing Systems (June 1988).

Livny, M. and Melman, M. 1982. Load balancing in homogeneous broadcast distributed systems. In ACM Computer Network Performance Symposium (April 1982), pp. 47-55.

Milojicic, D. S. 1993. Load Distribution: Implementation for the Mach Microkernel. PhD Dissertation, University of Kaiserslautern.

Mirchandaney, R., Towsley, D., and Stankovic, J. A. 1990. Adaptive load sharing in heterogeneous distributed systems. Journal of Parallel and Distributed Computing 9, 331-346.

Powell, M. and Miller, B. 1983. Process migrations in DEMOS/MP. In 6th ACM Symposium on Operating Systems Principles (November 1983), pp. 110-119.

Prouty, R. 1996. Personal Communication.

Pulidas, S., Towsley, D., and Stankovic, J. A. 1988. Imbedding gradient estimators in load balancing algorithms. In 8th International Conference on Distributed Computing Systems (June 1988), pp. 482-490.

Rommel, C. G. 1991. The probability of load balancing success in a homogeneous network. IEEE Transactions on Software Engineering 17, 922-933.

Rosin, R. F. 1965. Determining a computing center environment. Communications of the ACM 8, 7.

Silberschatz, A., Peterson, J., and Galvin, P. 1994. Operating System Concepts, 4th Edition. Addison-Wesley, Reading, MA.

Svensson, A. 1990. History, an intelligent load sharing filter. In IEEE 10th International Conference on Distributed Computing Systems (1990), pp. 546-553.

Tanenbaum, A., van Renesse, R., van Staveren, H., and Sharp, G. 1990. Experiences with the Amoeba distributed operating system. Communications of the ACM, 336-346.

Theimer, M. M., Lantz, K. A., and Cheriton, D. R. 1985. Preemptable remote execution facilities for the V-System. In 10th ACM Symposium on Operating Systems Principles (December 1985), pp. 2-12.

Thiel, G. 1991. Locus operating system, a transparent system. Computer Communications 14, 6, 336-346.

Vahdat, A. 1995. Personal Communication.

Vahdat, A. M., Ghormley, D. P., and Anderson, T. E. 1994. Efficient, portable, and robust extension of operating system functionality. Technical Report UCB//CSD-94-842, University of California, Berkeley.

Waldspurger, C. A. and Weihl, W. E. 1994. Lottery scheduling: Flexible proportional-share resource management. In Proceedings of the First Symposium on Operating System Design and Implementation (November 1994), pp. 1-11.

Wang, J., Zhou, S., K. Ahmed, and Long, W. 1993. LSBATCH: A distributed load sharing batch system. Technical Report CSRI-286 (April), Computer Systems Research Institute, University of Toronto.

Wang, Y.-T. and Morris, R. J. 1985. Load sharing in distributed systems. IEEE Transactions on Computers c-94, 3 (March), 204-217.

Zayas, E. R. 1987. Attacking the process migration bottleneck. In 11th ACM Symposium on Operating Systems Principles (1987), pp. 13-24.

Zhang, Y., Hakozaki, K., Kameda, H., and Shimizu, K. 1995. A performance comparison of adaptive and static load balancing in heterogeneous distributed systems. In Proceedings of the 28th Annual Simulation Symposium (April 1995), pp. 332-340.

Zhou, S. 1987. Performance studies for dynamic load balancing in distributed systems. Ph.D. Dissertation, University of California, Berkeley.

Zhou, S. and Ferrari, D. 1987. A measurement study of load balancing performance. In IEEE 7th International Conference on Distributed Computing Systems (October 1987), pp. 490-497.

Zhou, S., Wang, J., Zheng, X., and Delisle, P. 1993. Utopia: a load-sharing facility for large heterogeneous distributed computing systems. Software - Practice and Experience 23, 2 (December), 1305-1336.