Performance Comparisons of Load Balancing Algorithms for I/O- Intensive Workloads on Clusters




Journal of Network and Computer Applications, vol. 31, no. 1, pp. 32-46, January 2008.

Xiao Qin
Department of Computer Science and Software Engineering
Auburn University, Auburn, AL 36849
xqin@auburn.edu
http://www.eng.auburn.edu/~xqin

Abstract

Load balancing techniques play a critically important role in developing high-performance cluster computing platforms. Existing load balancing approaches are concerned with the effective usage of CPU and memory resources. Due to the imbalance in disk I/O resources under I/O-intensive workloads, the previous CPU- or memory-aware load balancing schemes suffer a significant performance drop. To remedy this deficiency, in this paper we propose a novel load-balancing algorithm (hereafter referred to as IOLB) for clusters, which aims at maintaining high resource utilization under a wide range of workload conditions. Specifically, IOLB is conducive to reducing the average slowdown of all parallel jobs submitted to a cluster by balancing load in disk resources. This can, in turn, not only achieve the effective usage of global disk resources but also reduce the response times of I/O-intensive parallel jobs. To theoretically study the optimization of the IOLB algorithm, we qualitatively compare IOLB with two conventional CPU- and memory-aware load-balancing schemes. We prove that when the workloads become CPU-intensive or memory-intensive in nature, IOLB gracefully degrades towards the existing load-balancing schemes. Experimental results based on trace-driven simulations show that the IOLB algorithm significantly improves the resource utilization of a cluster under I/O-intensive workloads. Furthermore, our results confirm that IOLB is able to maintain the same level of performance as the two existing approaches, because IOLB improves CPU and memory utilization under CPU- and memory-intensive workloads.

1. Introduction

Load balancing techniques play an important role in the design and development of high-performance clusters. A variety of load balancing schemes [][3][2][7] can be used to improve the performance of parallel and distributed systems in clusters by assigning work, at run time, to computational nodes with under-utilized resources. Dynamic load-balancing schemes have been extensively investigated, primarily focusing on CPU [2][4], memory [][28], network [3][8][24], or a combination of CPU and memory [32] resources. Although the existing load-balancing schemes are effective in maintaining high utilization of resources, the previous approaches suffer a significant performance drop under I/O-intensive workloads due to the imbalance in disk I/O resources. It is worth noting that disk I/O resources become a performance bottleneck under I/O-intensive workloads, since the performance gap between CPU and disk I/O is widening. It is believed that one way of solving the disk I/O bottleneck problem is to leverage load-balancing techniques to achieve effective usage of global disk resources in clusters.

In this paper we propose a novel load-balancing algorithm (hereafter referred to as IOLB) for parallel jobs running on clusters. The IOLB algorithm aims at improving the utilization of disk I/O, CPU, and memory resources in a cluster under a wide spectrum of workloads. Specifically, IOLB is conducive to balancing load in a variety of resources, thereby reducing the slowdowns of all parallel jobs submitted to a cluster. After qualitatively comparing IOLB with two conventional CPU- and memory-aware load-balancing schemes, we prove that IOLB gracefully degrades towards the existing load-balancing schemes if workloads become CPU- and memory-intensive, respectively. We conducted trace-driven simulations to show that the IOLB algorithm significantly improves the resource utilization of clusters under I/O-intensive workloads.

Moreover, our results confirm that IOLB improves CPU and memory utilization under CPU- and memory-intensive workloads and, therefore, IOLB can maintain the same level of performance as the two existing approaches.

The rest of the paper is organized as follows. Related work in the literature is briefly reviewed in the following section. Section 3 describes a generic model and the IOLB load-balancing algorithm for parallel jobs. Section 4 presents a qualitative comparison between IOLB and the two existing load-balancing schemes. To confirm the analytical comparison, in Section 5 we use trace-driven simulations to quantitatively evaluate the performance of the IOLB algorithm and the alternative solutions. Finally, Section 6 summarizes the main contributions of this paper and comments on future research directions.

2. Related Work

Load balancing techniques in the context of CPU and memory resources have been extensively studied in the past decade. For example, Harchol-Balter and Downey studied a preemptive migration policy that is more effective than non-preemptive migration policies under CPU-intensive workloads [2]. Zhang et al. [32] proposed new load sharing policies that are concerned with the effective usage of both CPU and memory resources. The above load-balancing schemes are able to achieve high system performance under CPU- and memory-intensive workload conditions, respectively.

A number of approaches to balancing load in disk I/O resources can be found in the literature [6][33]. Lee et al. studied two file assignment algorithms to balance load across all disks, thereby making it possible to improve overall system performance by fully utilizing available hard drives [6]. Zhang et al. proposed three I/O-aware scheduling schemes that are aware of the job's spatial preferences [33].

In recent years, the issue of leveraging I/O caches and buffers to boost the performance of storage systems has been reported in the literature. Choi et al. proposed a buffer replacement scheme for effective caching of disk blocks [7]. Forney et al. investigated storage-aware caching algorithms in heterogeneous clusters []. Ma et al. developed an active buffering mechanism to alleviate the disk I/O bottleneck problem using local idle memory and overlapping I/O with computation [8]. We proposed a feedback control mechanism to improve the performance of a cluster by adaptively manipulating the I/O buffer size [23]. Our IOLB load-balancing algorithm is complementary to the aforementioned caching and buffering techniques, meaning that IOLB can provide additional performance improvement when used alongside the existing caching and buffering mechanisms.

3. An I/O-aware Load-balancing Algorithm

3.1 A generic model

A cluster computing platform considered in this study consists of a set $N = \{N_1, N_2, \ldots, N_n\}$ of $n$ homogeneous nodes connected by a high-speed interconnection network like Myrinet. Note that the terms node and machine are used interchangeably throughout this paper. Throughout this paper, $N_i$ also represents the set of tasks running on the $i$th node. Each node in a cluster is composed of a combination of various resources, including processors, memory, network connectivity, and disks. A load manager residing in each node is responsible for load balancing and for monitoring the available resources of the node.

Each job is associated with a home machine, through which the job is submitted to the cluster. A node becomes the home machine of a job either because the job is initially created on the node or because data to be accessed by the job is stored in the node. A similar home model was proposed by Lavi and Barak in the context of load balancing [5]. When a job is submitted through its home node, the corresponding load manager for the home node is invoked to allocate the job to a node or a group of nodes with the least load. At any time, a node is either running its load manager or executing a task. While the load manager is running, the underlying computation may be concurrently performed or suspended. It is feasible to execute the load manager and other tasks in parallel, since the load manager can run in the background on an inexpensive coprocessor [3]. In addition, it is reasonable to assume that all load managers in a cluster are capable of keeping track of global load information by monitoring local resources and sharing load information through a direct communication network [22].

In this study we are concerned with a class of embarrassingly parallel applications, each of which is represented in the form of a set $T = \{\tau_1, \tau_2, \ldots, \tau_m\}$ of tasks (also referred to as processes) that are independent of one another. Some real-world examples of embarrassingly parallel applications can be found in [29]. It is worth noting that the proposed load-balancing algorithm can be readily integrated with a communication load balancer to deal with parallel applications with dependent tasks. A task $\tau_j \in T$ is modeled as a tuple $(c_j, s_j, \lambda_j, d_j)$, where $c_j$ is the computational time, $s_j$ is the requested memory space measured in MBytes, $\lambda_j$ is the arrival rate of disk requests measured in number of disk accesses per ms (No./ms), and $d_j$ is the disk request's data size in KBytes. Note that the parameters $c_j$ and $s_j$ describe task $\tau_j$'s requirements for CPU and memory, whereas the I/O requirement of $\tau_j$ is characterized by the other two parameters, $\lambda_j$ and $d_j$.

3.2 Problem Formulation

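Given a task $\tau_j$, we denote $t_j^{\mathrm{ded}}$ as the time to execute the task on a dedicated computing cluster, and $t_j$ as the time to execute the task on the same cluster in a time-sharing setting. In our model, we consider three resources: CPU, memory, and disk I/O. Let $t_j^{\mathrm{ded,CPU}}$, $t_j^{\mathrm{ded,page}}$, and $t_j^{\mathrm{ded,IO}}$ be the times a task spends on CPU, page fault handling, and disk I/O processing in dedicated mode. The values of $t_j^{\mathrm{ded,CPU}}$, $t_j^{\mathrm{ded,page}}$, and $t_j^{\mathrm{ded,IO}}$ can be respectively derived from $\tau_j$'s requirement parameters, including $c_j$, $s_j$, $\lambda_j$, and $d_j$. Let $t_j^{\mathrm{CPU}}$, $t_j^{\mathrm{page}}$, and $t_j^{\mathrm{IO}}$ denote the times spent on the three resources in time-shared mode. Since tasks running on computational nodes may be delayed by contention for resources, the slowdown imposed on task $\tau_j$ is expressed as the ratio between the task's execution time in the time-shared mode and its execution time on the same cluster in the dedicated mode. Thus, the slowdown of $\tau_j$ is written as

$$sd_j = \frac{t_j}{t_j^{\mathrm{ded}}} = \frac{t_j^{\mathrm{CPU}} + t_j^{\mathrm{page}} + t_j^{\mathrm{IO}}}{t_j^{\mathrm{ded,CPU}} + t_j^{\mathrm{ded,page}} + t_j^{\mathrm{ded,IO}}}. \tag{1}$$

Given a parallel application with task set $T = \{\tau_1, \tau_2, \ldots, \tau_m\}$, the slowdown of the application is calculated as the average slowdown of all the independent tasks in $T$. Thus, the slowdown of the parallel application is written as

$$sd(T) = \frac{1}{m}\sum_{j=1}^{m} sd_j = \frac{1}{m}\sum_{j=1}^{m} \frac{t_j^{\mathrm{CPU}} + t_j^{\mathrm{page}} + t_j^{\mathrm{IO}}}{t_j^{\mathrm{ded,CPU}} + t_j^{\mathrm{ded,page}} + t_j^{\mathrm{ded,IO}}}. \tag{2}$$

For the special case where all the independent tasks in the set $T$ are identical (i.e., $\forall\, 1 \le j \le m$: $t_j^{\mathrm{ded,CPU}} = t^{\mathrm{ded,CPU}}$, $t_j^{\mathrm{ded,page}} = t^{\mathrm{ded,page}}$, $t_j^{\mathrm{ded,IO}} = t^{\mathrm{ded,IO}}$), Eq. (2) can be rewritten as follows

$$sd(T) = \frac{1}{m\left(t^{\mathrm{ded,CPU}} + t^{\mathrm{ded,page}} + t^{\mathrm{ded,IO}}\right)} \sum_{j=1}^{m}\left(t_j^{\mathrm{CPU}} + t_j^{\mathrm{page}} + t_j^{\mathrm{IO}}\right). \tag{3}$$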
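As a concrete illustration of Eqs. (1) and (2), the following minimal Python sketch computes the slowdown of a task and of an application from per-resource times. The class `TaskTimes` and the function names are hypothetical, introduced only for this example; times are assumed to be in milliseconds and listed in the order CPU, page fault handling, disk I/O.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Assumed order of the three per-resource times: (CPU, page fault handling, disk I/O).
Times = Tuple[float, float, float]


@dataclass(frozen=True)
class TaskTimes:
    """Per-resource times of one task (hypothetical container for this sketch)."""
    shared: Times      # times observed in time-shared mode
    dedicated: Times   # times observed in dedicated mode


def task_slowdown(task: TaskTimes) -> float:
    """Eq. (1): total time-shared time divided by total dedicated time."""
    return sum(task.shared) / sum(task.dedicated)


def application_slowdown(tasks: Sequence[TaskTimes]) -> float:
    """Eq. (2): average slowdown over the independent tasks of one application."""
    return sum(task_slowdown(t) for t in tasks) / len(tasks)
```

For two tasks with time-shared totals of 40 ms and 60 ms and a dedicated total of 20 ms each, the sketch gives task slowdowns of 2.0 and 3.0 and an application slowdown of 2.5, which agrees with the identical-task special case of Eq. (3).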

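The goal of the proposed load-balancing algorithm is to reduce the average slowdown of all parallel applications submitted to a cluster. This can, in turn, minimize the average response time of the running applications. Specifically, our algorithm aims at optimizing the following average slowdown of a sequence of parallel applications (e.g., $T_1, T_2, \ldots, T_q$) executed on a cluster

$$\text{minimize} \quad \overline{sd} = \frac{1}{q}\sum_{i=1}^{q} sd(T_i) = \frac{1}{q}\sum_{i=1}^{q} \frac{1}{m_i} \sum_{\tau_j \in T_i} \frac{t_j^{\mathrm{CPU}} + t_j^{\mathrm{page}} + t_j^{\mathrm{IO}}}{t_j^{\mathrm{ded,CPU}} + t_j^{\mathrm{ded,page}} + t_j^{\mathrm{ded,IO}}}, \tag{4}$$

where $m_i$ is the number of tasks in $T_i$.

3.3 A load-balancing algorithm for I/O-intensive workloads

Now we present a load-balancing algorithm (hereafter referred to as IOLB) for a wide variety of workload conditions, including I/O-intensive, CPU-intensive, and memory-intensive workloads. The objective of the proposed IOLB algorithm is to balance the load of three types of resources across all nodes in a cluster such that the average slowdown of submitted parallel jobs is minimized. Since the goal of this study is to analytically evaluate the performance of the IOLB algorithm, we focus on a remote execution mechanism in which a task runs on the remote node where it started execution. Thus, preemptive migrations of tasks are not supported in the IOLB algorithm. Nevertheless, the IOLB load-balancing algorithm can be readily integrated with a preemptive migration mechanism, thereby providing further performance improvement. Recently, we studied a load-balancing algorithm with preemptive migration, and details of this algorithm can be found in [22].

To facilitate the description of the IOLB algorithm, we first introduce the following three load indices with respect to CPU, memory, and I/O resources. The CPU load index of node $N_i$ is defined as the sum of the remaining lifetimes of tasks running on the node. Thus, it is expressed as $L_i^{\mathrm{CPU}} = \sum_{\tau_j \in N_i} r_j$, where $r_j$ is the expected remaining lifetime of task $\tau_j$. The memory load index of node $N_i$ is defined as the sum of the page fault processing times of tasks on node $N_i$. Hence, we have $L_i^{\mathrm{mem}} = \sum_{\tau_j \in N_i} t_j^{\mathrm{page}}$. Similarly, the I/O load index of node $N_i$ is the sum of the I/O processing times of tasks on the node. Therefore, the I/O load index can be written as

$$L_i^{\mathrm{IO}} = \sum_{\tau_j \in N_i} \left(r_j \lambda_j t_j^{\mathrm{disk}}\right), \tag{5}$$

where $t_j^{\mathrm{disk}}$ is the I/O processing time of each disk request of task $\tau_j$. The value of $t_j^{\mathrm{disk}}$ in Eq. (5) is computed by

$$t_j^{\mathrm{disk}} = t_{\mathrm{seek}} + t_{\mathrm{rot}} + \frac{d_j}{B_{\mathrm{disk}}}, \tag{6}$$

where $t_{\mathrm{seek}}$ and $t_{\mathrm{rot}}$ are the seek time and rotational latency, and $d_j / B_{\mathrm{disk}}$ is the data transfer time, which depends on the data size $d_j$ and the disk bandwidth $B_{\mathrm{disk}}$.

In light of the three load indices described above, we propose a new concept of load imbalance factor to quantify the amount of imbalance in a cluster. The load imbalance factor of a resource is the product of the fraction of time spent using the resource in a cluster and the discrepancy between the maximum and the minimum loads of the resource among all nodes in the cluster. More specifically, the load imbalance factors for CPU, memory, and I/O resources can be written as Eqs. (7)-(9).

$$LIF_{\mathrm{CPU}} = \frac{\sum_{i=1}^{n} L_i^{\mathrm{CPU}}}{\sum_{i=1}^{n}\left(L_i^{\mathrm{CPU}} + L_i^{\mathrm{mem}} + L_i^{\mathrm{IO}}\right)} \left(\max_{a=1}^{n} L_a^{\mathrm{CPU}} - \min_{a=1}^{n} L_a^{\mathrm{CPU}}\right), \tag{7}$$

$$LIF_{\mathrm{mem}} = \frac{\sum_{i=1}^{n} L_i^{\mathrm{mem}}}{\sum_{i=1}^{n}\left(L_i^{\mathrm{CPU}} + L_i^{\mathrm{mem}} + L_i^{\mathrm{IO}}\right)} \left(\max_{a=1}^{n} L_a^{\mathrm{mem}} - \min_{a=1}^{n} L_a^{\mathrm{mem}}\right), \tag{8}$$

$$LIF_{\mathrm{IO}} = \frac{\sum_{i=1}^{n} L_i^{\mathrm{IO}}}{\sum_{i=1}^{n}\left(L_i^{\mathrm{CPU}} + L_i^{\mathrm{mem}} + L_i^{\mathrm{IO}}\right)} \left(\max_{a=1}^{n} L_a^{\mathrm{IO}} - \min_{a=1}^{n} L_a^{\mathrm{IO}}\right), \tag{9}$$

where the maxima and minima are taken over all $n$ nodes of the cluster.

On the right-hand sides of Eqs. (7)-(9), the first terms, which reflect the importance of the three types of resources, are the fractions of time spent on computing, page fault handling, and disk I/O processing, respectively. The second terms on the right-hand sides of Eqs. (7)-(9) measure the amount of imbalance in the three resources. The load imbalance factor $LIF$ of a cluster can be derived from Eqs. (7)-(9) as the sum of the load imbalance factors of the three resource types. Thus, we have

$$LIF = LIF_{\mathrm{CPU}} + LIF_{\mathrm{mem}} + LIF_{\mathrm{IO}}. \tag{10}$$

Now we are positioned to delineate the IOLB load-balancing algorithm, of which the pseudocode is shown in Fig. 1. Given an embarrassingly parallel application with a set $T$ of independent tasks submitted to a local node $N_i$ of a cluster, the IOLB algorithm makes an effort to balance the workload of the cluster's resources by allocating each task in $T$ to a computational node such that the task's expected response time (also known as turnaround time) is minimized. In other words, IOLB aims to redistribute load among all the nodes in a cluster, thereby allowing the submitted parallel application to run efficiently on the cluster. For each task $\tau_j$ in set $T$, the IOLB algorithm repeatedly performs Steps 2-21 described as follows. First, the response time $R_i$ of $\tau_j$ on the local node $N_i$ is estimated by Step 2. $R_i$ is important because it is used to justify whether a remote execution of $\tau_j$ is worthwhile (see Steps 6, 12, and 18).

    Algorithm: I/O-Aware Load Balancing (IOLB)
    Input: A local node $N_i$, a job with task set $T$ submitted to $N_i$.
     1. for (each task $\tau_j \in T$) do
     2.   calculate the response time $R_i$ of $\tau_j$ on node $N_i$;
     3.   if $c_j \lambda_j t_j^{\mathrm{disk}} > 0$ and $LIF_{\mathrm{IO}} = \max(LIF_{\mathrm{CPU}}, LIF_{\mathrm{mem}}, LIF_{\mathrm{IO}})$ and $L_i^{\mathrm{IO}} = \max_{a=1}^{n}(L_a^{\mathrm{IO}})$ then
     4.     choose node $N_k$ such that $L_k^{\mathrm{IO}} = \min_{a=1}^{n}(L_a^{\mathrm{IO}})$;
     5.     calculate the response time $R_k$ of $\tau_j$ on node $N_k$;
     6.     if $c_j \lambda_j t_j^{\mathrm{disk}} < (L_i^{\mathrm{IO}} + c_j \lambda_j t_j^{\mathrm{disk}} - L_k^{\mathrm{IO}})/2$ and $R_i > R_k + e$ then
     7.       dispatch task $\tau_j$ to node $N_k$ and remotely execute $\tau_j$ on $N_k$;
     8.     else locally execute $\tau_j$ on $N_i$;
     9.   else if $t_j^{\mathrm{page}} > 0$ and $LIF_{\mathrm{mem}} > LIF_{\mathrm{CPU}}$ and $L_i^{\mathrm{mem}} = \max_{a=1}^{n}(L_a^{\mathrm{mem}})$ then
    10.     choose node $N_k$ such that $L_k^{\mathrm{mem}} = \min_{a=1}^{n}(L_a^{\mathrm{mem}})$;
    11.     calculate the response time $R_k$ of $\tau_j$ on node $N_k$;
    12.     if $t_j^{\mathrm{page}} < (L_i^{\mathrm{mem}} + t_j^{\mathrm{page}} - L_k^{\mathrm{mem}})/2$ and $R_i > R_k + e$ then
    13.       dispatch task $\tau_j$ to node $N_k$ and remotely execute $\tau_j$ on $N_k$;
    14.     else locally execute $\tau_j$ on $N_i$;
    15.   else if $L_i^{\mathrm{CPU}} = \max_{a=1}^{n}(L_a^{\mathrm{CPU}})$ then
    16.     choose node $N_k$ such that $L_k^{\mathrm{CPU}} = \min_{a=1}^{n}(L_a^{\mathrm{CPU}})$;
    17.     calculate the response time $R_k$ of $\tau_j$ on node $N_k$;
    18.     if $r_j < (L_i^{\mathrm{CPU}} + r_j - L_k^{\mathrm{CPU}})/2$ and $R_i > R_k + e$ then
    19.       dispatch task $\tau_j$ to node $N_k$ and remotely execute $\tau_j$ on $N_k$;
    20.     else locally execute $\tau_j$ on $N_i$;
    21.   update the load status;
    22. end for

Fig. 1. Pseudocode of the IOLB load-balancing algorithm.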
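The load imbalance factor and the disk I/O branch of the algorithm (Steps 3-8) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's implementation: the resource keys `"cpu"`, `"mem"`, and `"io"` and both function names are hypothetical, the discrepancy term is taken to be the maximum minus the minimum node load as the prose definition suggests, and the exact form of the discrepancy test in Step 6 is an assumption.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: resource name ("cpu", "mem", "io") -> load index of every node.
Loads = Dict[str, List[float]]


def imbalance_factors(loads: Loads) -> Dict[str, float]:
    """Per-resource load imbalance factor: time fraction times (max - min) load."""
    total = sum(sum(per_node) for per_node in loads.values())
    if total == 0:
        return {res: 0.0 for res in loads}
    return {
        res: (sum(per_node) / total) * (max(per_node) - min(per_node))
        for res, per_node in loads.items()
    }


def place_io_task(loads: Loads, local: int, io_demand: float,
                  response: Sequence[float], overhead: float) -> int:
    """Return the index of the node that should run the task (I/O branch only).

    io_demand plays the role of c * lambda * t_disk for the task, response[a]
    is the estimated response time of the task on node a, and overhead is the
    remote execution overhead e.
    """
    io = loads["io"]
    lif = imbalance_factors(loads)
    io_dominant = lif["io"] == max(lif.values())
    local_is_max = io[local] == max(io)
    if not (io_demand > 0 and io_dominant and local_is_max):
        return local  # the full algorithm would try the memory and CPU branches here
    remote = min(range(len(io)), key=lambda a: io[a])
    # Assumed form of the Step 6 discrepancy test (not confirmed by the source).
    narrows_gap = io_demand < (io[local] + io_demand - io[remote]) / 2
    faster = response[local] > response[remote] + overhead
    return remote if (narrows_gap and faster) else local
```

With I/O load indices of 90, 10, and 40 on three nodes, a task with an I/O demand of 20 submitted to the first node is sent to the second node as long as the remote response time plus the overhead stays below the local response time; otherwise it runs locally.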

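Second, Step 3 is responsible for initiating the process of balancing the workload of disk I/O resources. Specifically, Steps 4-7 are invoked to balance the load of I/O resources when all of the following three conditions hold in Step 3. Condition 1 (i.e., $c_j \lambda_j t_j^{\mathrm{disk}} > 0$) states that the I/O load of $\tau_j$ must be greater than zero (see Theorems 2 and 3). Condition 2 (i.e., $LIF_{\mathrm{IO}} = \max(LIF_{\mathrm{CPU}}, LIF_{\mathrm{mem}}, LIF_{\mathrm{IO}})$) says that the load imbalance factor of disk I/O must be the highest among those of the three resources. Condition 3 (i.e., $L_i^{\mathrm{IO}} = \max_{a=1}^{n}(L_a^{\mathrm{IO}})$) means that the I/O load of the local node is the highest among those of all the nodes.

Third, if it becomes necessary to balance I/O load, Step 4 chooses the most appropriate remote node $N_k$ with the lightest load with respect to disk I/O, followed by estimating the response time $R_k$ of $\tau_j$ on the candidate node. Step 6 is of critical importance to ensure that executing $\tau_j$ remotely improves performance. More specifically, before Step 7 dispatches $\tau_j$ and has it remotely executed on $N_k$, Step 6 must make sure that the following two conditions are satisfied. Condition 1 (i.e., $c_j \lambda_j t_j^{\mathrm{disk}} < (L_i^{\mathrm{IO}} + c_j \lambda_j t_j^{\mathrm{disk}} - L_k^{\mathrm{IO}})/2$) guarantees that the load discrepancy between $N_i$ and $N_k$ is reduced. Condition 2 (i.e., $R_i > R_k + e$, where $e$ is the remote execution overhead) ensures that the expected response time of $\tau_j$ on the selected remote node is less than the response time of $\tau_j$ on the local node. In the case that the remote execution of $\tau_j$ is not beneficial, Step 8 has $\tau_j$ locally executed on $N_i$.

Fourth, Step 9 decides whether the following three conditions are satisfied before a meaningful remote execution is performed. Condition 1 (i.e., $t_j^{\mathrm{page}} > 0$) says that $\tau_j$ must exhibit page fault behavior (see Theorems 5 and 6). Page faults occur when the memory space required by running tasks exceeds the amount of available memory space. Condition 2 (i.e., $LIF_{\mathrm{mem}} > LIF_{\mathrm{CPU}}$) indicates that the load imbalance factor of memory resources must be higher than that of CPU resources. Condition 3 (i.e., $L_i^{\mathrm{mem}} = \max_{a=1}^{n}(L_a^{\mathrm{mem}})$) states that the load of page fault processing in the local node is the highest among those of all the nodes. If the above conditions hold, Steps 10-14 aim to balance the load of memory resources by transferring task $\tau_j$ from the overloaded node $N_i$ to a remote node $N_k$ that is lightly loaded with respect to memory. Step 12 is carried out to guarantee that the remote execution of $\tau_j$ leads to performance improvement.

Fifth, if there is no way of balancing the disk I/O and memory resources in the cluster, Steps 15-20 attempt to evenly distribute the CPU load. When the local node is overloaded with respect to CPU resources (see Step 15), task $\tau_j$ is dispatched to and executed by a remote node with the lightest CPU load. Step 19 makes the remote execution possible if such a remote execution is beneficial (see Step 18). Last, Step 21 maintains updated load information that is broadcast to the local node and the other nodes in the cluster.

The following theorem proves the time complexity of the IOLB load-balancing algorithm.

Theorem 1. Given a cluster and a parallel application submitted to the cluster, the time complexity of the IOLB algorithm is $O(mn)$, where $n$ is the number of nodes in the cluster, $m$ is the number of tasks in the application, and the values of $n$ and $m$ are much larger than 2.

Proof. It takes $O(1)$ time to compute the response time of a task on a node. The time complexity of determining that a local node is overly loaded is $O(n)$, since there are $n$ nodes in the cluster (see Step 3). Step 4 takes $O(n)$ time to choose the most appropriate node with the minimal load. Steps 6 and 7 take $O(1)$ time. Hence, the time complexity of balancing disk I/O resources is $O(2n+2)$ (see Steps 2-7). Similarly, the time complexities of balancing memory and CPU resources are both $O(2n+2)$. Since there are $m$ tasks in the parallel application, the time complexity of the IOLB algorithm is $O(2n+2) \cdot O(m) = O(2(n+1)m)$. The values of $n$ and $m$ in most cases are much larger than 2 and, therefore, the time complexity becomes $O(mn)$.

4. An Analytical Comparison

In this section, we first prove important properties of the IOLB algorithm (see Lemmas 1-2 and Theorems 2-6). Next, we qualitatively compare IOLB with two existing load-balancing algorithms (see Theorems 7-9).

4.1 Properties

Theorem 2. Let $c_j$ and $\lambda_j$ be the execution time and I/O arrival rate of task $\tau_j$, and let $t_j^{\mathrm{disk}}$ be the I/O processing time of each disk request. If the value of $c_j \lambda_j t_j^{\mathrm{disk}}$ is zero, then the allocation of $\tau_j$ has no impact on balancing disk I/O resources.

Proof. Before the allocation of $\tau_j$, the amount of imbalance with respect to disk I/O resources can be measured by $L_{\max}^{\mathrm{IO}} - L_{\min}^{\mathrm{IO}}$, where $L_{\max}^{\mathrm{IO}} = \max_{a=1}^{n}(L_a^{\mathrm{IO}})$ and $L_{\min}^{\mathrm{IO}} = \min_{a=1}^{n}(L_a^{\mathrm{IO}})$. Without loss of generality, we assume $\tau_j$ is allocated to $N_k$, and prior to the arrival of $\tau_j$ the I/O load of $N_k$ is $L_k^{\mathrm{IO}} = \sum_{\tau_l \in N_k}(r_l \lambda_l t_l^{\mathrm{disk}})$ (see Eq. 5). The disk I/O load index after dispatching task $\tau_j$ to node $N_k$ becomes

$$\hat{L}_k^{\mathrm{IO}} = c_j \lambda_j t_j^{\mathrm{disk}} + \sum_{\tau_l \in N_k}\left(r_l \lambda_l t_l^{\mathrm{disk}}\right).$$

Since the value of $c_j \lambda_j t_j^{\mathrm{disk}}$ is zero, we have

$$\hat{L}_k^{\mathrm{IO}} = \sum_{\tau_l \in N_k}\left(r_l \lambda_l t_l^{\mathrm{disk}}\right) = L_k^{\mathrm{IO}}.$$

Thus, the values of $L_k^{\mathrm{IO}}$ and $\hat{L}_k^{\mathrm{IO}}$ are identical, meaning that the allocation of $\tau_j$ has no impact on balancing disk I/O resources.

Corollary 1. If the I/O arrival rate $\lambda_j$ of task $\tau_j$ is zero, then the allocation of $\tau_j$ has no impact on balancing disk I/O resources.

Proof. The value of $c_j \lambda_j t_j^{\mathrm{disk}}$ becomes zero if the I/O arrival rate $\lambda_j$ of task $\tau_j$ is zero. The proof is then immediate from Theorem 2.

Theorem 3. Suppose there is a task $\tau_j$ (initially submitted to node $N_i$) to be allocated in a cluster; the disk I/O processing time of $\tau_j$ is $c_j \lambda_j t_j^{\mathrm{disk}}$. Then $c_j \lambda_j t_j^{\mathrm{disk}} > 0$ is a necessary condition for allocating task $\tau_j$ in a way that balances load with respect to disk I/O.

Proof. To prove the correctness of Theorem 3, we have to show that allocating $\tau_j$ in a way that balances load in disk I/O implies that $c_j \lambda_j t_j^{\mathrm{disk}}$ is larger than 0. This can be proved by contradiction and, hence, let us assume that $c_j \lambda_j t_j^{\mathrm{disk}}$ equals 0. Since $c_j \lambda_j t_j^{\mathrm{disk}} = 0$, Theorem 2 shows that the allocation of $\tau_j$ has no impact on balancing disk I/O resources, meaning that there is no way of allocating $\tau_j$ such that the imbalanced load in disk I/O is alleviated. Hence, we obtain the contradiction that completes the proof.

Theorem 4. Let $t_j^{\mathrm{page}}$ and $s_j$ be the page fault processing time and memory requirement of task $\tau_j$. Let $S_i$ denote the accumulated memory space allocated to tasks running on node $N_i$. Thus, we have $S_i = \sum_{\tau_j \in N_i} s_j$. The page fault processing time is computed as

$$t_j^{\mathrm{page}} = \begin{cases} 0, & \text{if } S_i \le M_i, \\ p\,\dfrac{S_i}{M_i}\, r_j \left(t_{\mathrm{seek}} + t_{\mathrm{rot}} + \dfrac{d_{\mathrm{page}}}{B_{\mathrm{disk}}}\right), & \text{otherwise}, \end{cases} \tag{11}$$

where $p$ is the page fault rate, $M_i$ is the total memory space available on node $N_i$, $r_j$ is the expected remaining lifetime, $t_{\mathrm{seek}}$ and $t_{\mathrm{rot}}$ are the seek time and rotational latency, $d_{\mathrm{page}}$ is the page size, and $d_{\mathrm{page}} / B_{\mathrm{disk}}$ is the data transfer time.

Proof. First, we have to prove that $t_j^{\mathrm{page}}$ is zero if $S_i \le M_i$. When the total available memory space $M_i$ can meet the memory demands of tasks running on the $i$th node (i.e., $S_i \le M_i$), no page fault occurs in the node. In this case $\tau_j$ exhibits no page fault behavior, and the page fault processing time is zero.

Second, let us prove that $t_j^{\mathrm{page}} = p\,(S_i / M_i)\, r_j\,(t_{\mathrm{seek}} + t_{\mathrm{rot}} + d_{\mathrm{page}} / B_{\mathrm{disk}})$ if $S_i > M_i$. If $S_i$ is larger than $M_i$, then the node encounters page faults. The number of page faults $\pi_j$ is proportional to (1) the page fault rate $p$, (2) the accumulated memory space $S_i$ allocated to all the running tasks on $N_i$, and (3) the expected remaining lifetime $r_j$. Furthermore, $\pi_j$ is inversely proportional to the total available memory space $M_i$. Therefore, the number of page faults $\pi_j$ of task $\tau_j$ can be written as $\pi_j = p\,(S_i / M_i)\, r_j$. The I/O processing time $\mu$ of each page fault is the sum of the seek time $t_{\mathrm{seek}}$, the rotational latency $t_{\mathrm{rot}}$, and the data transfer time $d_{\mathrm{page}} / B_{\mathrm{disk}}$. Thus, the I/O processing time $\mu$ is expressed by $\mu = t_{\mathrm{seek}} + t_{\mathrm{rot}} + d_{\mathrm{page}} / B_{\mathrm{disk}}$. Hence, the page fault processing time is written as follows if $S_i > M_i$:

$$t_j^{\mathrm{page}} = \pi_j\, \mu = p\,\frac{S_i}{M_i}\, r_j\, \mu = p\,\frac{S_i}{M_i}\, r_j \left(t_{\mathrm{seek}} + t_{\mathrm{rot}} + \frac{d_{\mathrm{page}}}{B_{\mathrm{disk}}}\right),$$

which completes the proof of Theorem 4.

Theorem 5. Let $\tau_j$ denote a task to be allocated in a cluster. For each node $N_i$ in a cluster, if the total available memory space $M_i$ is able to meet the memory demands of $\tau_j$ and the tasks running on $N_i$ (i.e., $S_i + s_j \le M_i$), then the allocation of task $\tau_j$ has no impact on balancing memory resources.

Proof. We have $S_i \le M_i$, since $S_i + s_j \le M_i$ for each node in the cluster. In light of Theorem 4, before the allocation of $\tau_j$ it is true that $\forall \tau_k \in N_i$: $t_k^{\mathrm{page}} = 0$. Hence, we have $\forall N_i$: $L_i^{\mathrm{mem}} = \sum_{\tau_k \in N_i} t_k^{\mathrm{page}} = 0$. The amount of imbalance with respect to memory resources can be measured by $L_{\max}^{\mathrm{mem}} - L_{\min}^{\mathrm{mem}}$, where $L_{\max}^{\mathrm{mem}} = \max_{a=1}^{n}(L_a^{\mathrm{mem}})$ and $L_{\min}^{\mathrm{mem}} = \min_{a=1}^{n}(L_a^{\mathrm{mem}})$. Prior to the allocation of $\tau_j$, we have $L_{\max}^{\mathrm{mem}} - L_{\min}^{\mathrm{mem}} = 0$. Similarly, after the allocation of $\tau_j$, we have $\forall N_i$: $L_i^{\mathrm{mem}} = \sum_{\tau_k \in N_i} t_k^{\mathrm{page}} = 0$, because $S_i + s_j \le M_i$. Consequently, the value of $L_{\max}^{\mathrm{mem}} - L_{\min}^{\mathrm{mem}}$ is still zero after the allocation of $\tau_j$, meaning that the allocation of $\tau_j$ has no impact on balancing memory resources.

Before proceeding to the proof of a necessary condition for allocating tasks such that the global usage of memory resources in a cluster is improved, we first prove the following two lemmas.

Lemma 1. Suppose there is a task $\tau_j$ (initially submitted to node $N_i$) to be allocated in a cluster; the page fault processing time of $\tau_j$ on $N_i$ is $t_j^{\mathrm{page}}$. If the page fault processing time of $\tau_j$ equals zero, then the page fault processing times of all tasks running on $N_i$ equal zero. More formally, we have $t_j^{\mathrm{page}} = 0 \Rightarrow \forall \tau_k \in N_i$: $t_k^{\mathrm{page}} = 0$.

Proof. Let $S_i$ be the accumulated memory space allocated to tasks running on $N_i$ prior to the allocation of $\tau_j$. Since $t_j^{\mathrm{page}}$ equals zero, Eq. (11) shows that the sum of $S_i$ and $s_j$ does not exceed the total available memory space (i.e., $S_i + s_j \le M_i$). Hence, we have $S_i \le M_i$, meaning that the page fault processing time of each task running on $N_i$ is zero (i.e., $\forall \tau_k \in N_i$: $t_k^{\mathrm{page}} = 0$). Hence, the proof.

Lemma 2. Suppose there is a task $\tau_j$ (initially submitted to node $N_i$) to be allocated in a cluster; the page fault processing time of $\tau_j$ on $N_i$ is $t_j^{\mathrm{page}}$. If the page fault processing time of $\tau_j$ equals zero, then the memory load index $L_i^{\mathrm{mem}}$ of node $N_i$ equals zero (i.e., $t_j^{\mathrm{page}} = 0 \Rightarrow L_i^{\mathrm{mem}} = 0$).

Proof. The memory load index is measured by $L_i^{\mathrm{mem}} = \sum_{\tau_k \in N_i} t_k^{\mathrm{page}}$. As per Lemma 1, the page fault processing times of all tasks running on $N_i$ are zero (i.e., $\forall \tau_k \in N_i$: $t_k^{\mathrm{page}} = 0$). Therefore, we have $L_i^{\mathrm{mem}} = \sum_{\tau_k \in N_i} t_k^{\mathrm{page}} = 0$.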
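The two lemmas rest on the piecewise page fault model of Theorem 4: as long as the memory in use on a node fits into the available memory, every page fault processing time, and hence the memory load index, is zero. A minimal Python sketch of that model is given below. The function and parameter names are hypothetical, and the units (ms for times, KBytes for the page size, KBytes per ms for the disk bandwidth) are assumptions made for this example only.

```python
from typing import Iterable


def page_fault_time(fault_rate: float, mem_in_use: float, mem_total: float,
                    remaining: float, t_seek: float, t_rot: float,
                    page_size: float, bandwidth: float) -> float:
    """Page fault processing time of one task on one node.

    Returns zero when the memory in use fits into the available memory;
    otherwise returns (number of page faults) * (I/O time per page fault).
    """
    if mem_in_use <= mem_total:
        return 0.0
    faults = fault_rate * (mem_in_use / mem_total) * remaining
    per_fault = t_seek + t_rot + page_size / bandwidth
    return faults * per_fault


def memory_load_index(page_fault_times: Iterable[float]) -> float:
    """Memory load index of a node: sum of its tasks' page fault processing times."""
    return sum(page_fault_times)
```

For instance, with a fault rate of 0.5 No./ms, 600 MBytes in use on a node with 400 MBytes available, and a remaining lifetime of 100 ms, the sketch counts 75 page faults; with the same parameters but only 400 MBytes in use, it returns zero, which is the situation covered by Lemmas 1 and 2.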

Joural of ewor ad Compuer Applcaos, vol. 3, o., pp. 32-46, Jauary 2008. Theorem 6. Suppose here s a as τ (ally submed o ode ) o be allocaed a cluser; he faul processg me of τ o s. The > 0 s a ecessary codo for allocag as τ a way o balace load erms of ory resources. Proof. The proof of Theorem 5 s mmedae from emma 2. To prove he correcess of Theorem 5, we have o show ha allocag τ a way o balace load ory resource mples ha s larger ha 0. Ths ca be proved by coradco ad, hece, le us assume ha equals o 0. Sce = 0, emma 2 shows ha he load dex wh respec o ory resource s 0, mplyg ha here s o way of allocag τ such ha load ory resource s balaced. Hece, we oba he coradco ha complees he proof. 4.2 A qualave comparso ow we qualavely compare he B algorhm wh wo exsg schedulg approaches: he -based load-balacg algorhm (hereafer referred o as CB) [2] ad he orybased load-balacg algorhm (hereafer referred o as MB) [32]. The CB load-balacg polcy srves o mprove global usages of resources by balacg load across all odes a cluser. The MB load-balacg polcy s coducve o balacg worload wh respec o ory resources a cluser whe he cluser expereces a large umber of fauls due o suffce ory space. Boh CB ad MB are respecvely cocered wh effecve usages of global ad ory resources clusers whou addressg he ssue of balacg ds I/O resources. Cosequely, he exsg load-balacg approaches become adequae for mxed worloads wh -, ory-, ad I/O-esve applcaos. Theorem 7. Suppose here s a as τ (ally submed o ode ) o be allocaed a cluser; 8

Joural of ewor ad Compuer Applcaos, vol. 3, o., pp. 32-46, Jauary 2008. ode has he lghes load ds I/O (.e., m( ) = ). If he followg fve codos are sasfed, he he B algorhm ouperforms he CB ad MB algorhms. c λ, (2) IF = max( IF, IF, IF ), (3) max( ) () 0 c,, > ( c λ ) 2, a = λ <, ad (5) R > R e. a =, (4) Proof. Frs, heorem 3 shows ha c λ 0 s a ecessary codo for balacg ds, > I/O load. Secod, IF max( IF, IF, IF ) = dcaes ha balacg ds I/O load ca acheve more performace mproveme ha balacg ory or resources. Thrd, ( ) max a a= = meas ha he local ode s overly loaded erms of ds I/O. Sce boh he CB ad MB algorhms do o ae ds I/O load o accou, leavg ds I/O resources severely mbalaced uder codos (), (2), ad (3). he suffer sgfca performace drop uder I/O-esve worload due o he mbalace of I/O load. Furhermore, codo (4) guaraees ha allocag as τ o ode wh he lghes ds I/O load ca effcely allevae he mbalace problem, whereas codo (5) esures ha he expeced respose me of τ o he caddae remoe ode s less ha he respose me of τ o he local ode. Cosequely, f he above codos hold, he he B algorhm ouperforms he CB ad MB algorhms. Theorem 8. If each as τ rug o a cluser mposes o load o ds resources (.e., a= a c, λ = 0), he he B algorhm degrades o he MB algorhm. Proof. If he value of c, λ for each as τ s 0, he frs codo Sep 3 he B 9

algorithm does not hold, so Steps 4-8, which balance disk I/O load across the disks in the cluster, are not performed by B. In this case the B algorithm strives to balance load with respect to memory resources if the memory load exceeds the amount of available memory space. This means that the behavior of B is the same as that of MB if the disk I/O load of each task τ is 0. Hence, when each task τ running on a cluster imposes no load on disk resources, the performance of B and MB is identical. This completes the proof of Theorem 8.

Theorem 9. If each task τ running on a cluster imposes no load on disk and memory resources (i.e., both its disk I/O load and its page-fault processing time are 0), then the B algorithm degrades to the CB algorithm.

Proof. First, if the disk I/O load of each task τ is 0, the first condition in Step 3 of the B algorithm does not hold, so Steps 4-8, which balance disk I/O load across the disks in the cluster, are not performed by B. Second, if the page-fault processing time of τ is zero, Steps 9-14 are not invoked to improve the global usage of memory resources in the cluster. In this case the B algorithm makes an effort to evenly distribute the CPU load. Therefore, the behavior of B is the same as that of CB. This completes the proof of Theorem 9.

5. Experimental Results

In this section we quantitatively compare the B algorithm with the two existing load-balancing approaches (see Section 4.2): CB and MB. We conducted trace-driven simulations using a simulated cluster with 32 nodes providing a time-sharing environment. A similar simulation environment was delineated in [23]. The traces used in our experiments were

extrapolated from those traces reported in [2][32]. The number of tasks of each parallel job in the traces is selected randomly with a uniform distribution between 2 and 32. CPU times and memory demands of jobs are specified in the traces. We assume that the disk request arrival rate of each job is generated randomly with a uniform distribution. This assumption is reasonable because the mean disk request arrival rate can be manipulated and examined as a system parameter. In our empirical studies, we varied the mean disk request arrival rate from 0.8 to 1.25 No./ms. The data size of disk requests in each job is selected randomly based on a Gamma distribution with a mean size of 256 KByte. The performance metric by which we evaluate system performance is the mean slowdown of all the jobs in a trace.

Table 1. Mean slowdowns of parallel jobs under the CB, MB, and B schemes.
λ (No./ms): 0.80 0.85 0.90 0.95 1.0 1.05 1.10 1.15 1.20 1.25
CB: 62 74 86 99 3 32 49 68 87 20
MB: 62 75 87 00 4 32 49 7 92 28
B: 47 6 73 80 94 0 7 30 5 72
The number of nodes in the simulated cluster is 32; the number of tasks in each job is selected randomly with a uniform distribution between 2 and 32; the disk request arrival rate is varied from 0.8 to 1.25 No./ms; the disk request size is chosen randomly with a Gamma distribution with a mean size of 256 KByte; the page-fault rate is set to 0.5 No./ms; the page size is 4 KByte.

Table 1 shows the impact of the disk request arrival rate on the mean slowdowns of submitted parallel jobs under the three evaluated load-balancing schemes. It is intuitive that, regardless of the load-balancing approach, the slowdowns of the parallel jobs increase as the I/O load goes up. This is because high disk request arrival rates lead to high disk I/O loads, which in turn cause long I/O processing times and long waiting times on disk I/O resources. More importantly, the experimental results reveal that the B algorithm is superior to the CB and MB load-balancing schemes. These results indicate that the existing load-balancing policies are inadequate for I/O-intensive workloads. The performance improvements can be attributed to the fact that

CB and MB do not address the issue of balancing disk I/O load under I/O-intensive workload conditions.

Table 2. Mean slowdowns of parallel jobs under the CB, MB, and B schemes.
Page-fault rate (No./ms): 2.6 2.7 2.8 2.9 3 3.1 3.2 3.3 3.4 3.5
CB: 22 23 23 24 24 24 24 27 27 28
MB: 8.5 8.6 8.8 0.8 0.9 . 3 3.5 4.2 4.3
B: 8.5 8.5 8.7 0.8 0.9 3. 3.5 4.2 4
The disk request arrival rate is fixed to 0.0 No./ms; the page-fault rate varies from 2.6 to 3.5 No./ms; the page size is 4 KByte.

Recall that Theorem 8 proves that if all parallel jobs running on a cluster impose no load on disk I/O resources, then the performance of B is identical to that of MB. Now we experimentally validate Theorem 8 using memory-intensive workloads. To achieve this goal, we varied the page-fault rate from 2.6 to 3.5 No./ms with an increment of 0.1 No./ms. The disk request arrival rate is fixed to a low value of 0.0 No./ms. Table 2 shows the impact of the page-fault rate on the mean slowdowns of parallel jobs running on the simulated cluster. The results in Table 2 indicate that MB and B outperform CB under memory-intensive workloads. These performance improvements are possible because MB and B are concerned with the global memory usage in the cluster and balance memory resources. The improved memory usage in turn significantly reduces the time spent in page-fault processing. This trend becomes more pronounced when the page-fault rate is increased.

6. Conclusions

Most existing load-balancing approaches are inadequate for I/O-intensive workloads due to the imbalance of I/O loads and the low usage of global disk resources. To address this issue, in this paper we proposed a new load-balancing algorithm (referred to as B) for clusters. The

proposed load-balancing algorithm aims to achieve the effective usage of global disk resources in a cluster. This can, in turn, minimize the average slowdown of all parallel jobs running on a cluster and reduce the average response time of the jobs. In addition to balancing loads in disk resources under I/O-intensive workloads, the B algorithm improves the CPU and memory utilization under CPU- and memory-intensive workload conditions. Consequently, B is able to maintain the same level of performance as the two existing CPU- and memory-aware load-balancing schemes.

We conducted trace-driven simulations where traces are composed of parallel applications with a wide variety of I/O demands. Empirical results demonstratively show that, compared with the two existing load-balancing approaches, the B algorithm significantly improves the resource utilization of a cluster under I/O-intensive workloads. When the workloads become CPU-intensive or memory-intensive in nature, B gracefully degrades towards the existing load-balancing schemes.

Future studies can be performed in the following directions. First, we will evaluate the performance of B on a large-scale cluster with more than 1000 nodes. Second, in this study we assume that network communication cost is negligible. Therefore, we intend to further extend our load-balancing algorithm in a way to balance load in network resources. Third, a heterogeneity-aware load-balancing algorithm will be investigated to deal with parallel jobs running on heterogeneous clusters, in which nodes have various processing capabilities.

Acknowledgements

The work reported in this paper was supported by the US National Science Foundation under Grants No. CCF-074287 and No. CS-073895, Auburn University under a startup grant, and the Intel Corporation under Grant No. 2005-04-070.
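The degradation behavior stated in Theorems 8 and 9 can be illustrated with a minimal sketch. It is not the B algorithm itself: the `Task` fields and the `balancing_target` function are hypothetical names introduced here, and the sketch only mirrors the order of checks described in the proofs (disk I/O first, then memory, then CPU). In particular, a positive page-fault processing time is treated as sufficient for memory balancing, whereas Theorem 6 establishes it only as a necessary condition.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical stand-ins for quantities the paper defines symbolically.
    disk_io_load: float     # disk I/O load the task imposes
    page_fault_time: float  # page-fault processing time of the task


def balancing_target(task: Task) -> str:
    """Return the resource this sketch would balance for the given task."""
    if task.disk_io_load > 0:
        # The Step 3 condition holds, so the disk-balancing steps
        # (Steps 4-8) would be performed.
        return "disk_io"
    if task.page_fault_time > 0:
        # No disk load: the disk steps are skipped and the memory steps
        # (Steps 9-14) may run, which is the MB behavior (Theorem 8).
        # Assumption: a positive page-fault time is treated as sufficient.
        return "memory"
    # No disk or memory load: only CPU load is distributed,
    # which is the CB behavior (Theorem 9).
    return "cpu"


# Illustrative inputs only; these values are not taken from the traces.
tasks = [Task(0.9, 0.0), Task(0.0, 2.6), Task(0.0, 0.0)]
print([balancing_target(t) for t in tasks])
```

Ordering the checks this way makes the two degradation cases explicit: removing disk load leaves only the memory and CPU branches, and removing both leaves only the CPU branch.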