Off-line and on-line scheduling on heterogeneous master-slave platforms

Laboratoire de l'Informatique du Parallélisme
École Normale Supérieure de Lyon
Unité Mixte de Recherche CNRS-INRIA-ENS LYON-UCBL no 5668

Jean-François Pineau, Yves Robert, Frédéric Vivien

July 2005

Research Report No 2005-31

École Normale Supérieure de Lyon, 46 Allée d'Italie, 69364 Lyon Cedex 07, France
Telephone: +33(0)4.72.72.80.37
Fax: +33(0)4.72.72.80.80
E-mail: lip@ens-lyon.fr

Abstract

In this work, we deal with the problem of scheduling independent tasks on heterogeneous master-slave platforms. We target both off-line and on-line problems, with several objective functions (makespan, maximum response time, total completion time). On the theoretical side, our results are two-fold: (i) for off-line scheduling, we prove several optimality results for problems with release dates; (ii) for on-line scheduling, we establish lower bounds on the competitive ratio of any deterministic algorithm. On the practical side, we have implemented several heuristics, some classical and some new ones derived in this paper. We studied these heuristics experimentally on a small but fully heterogeneous MPI platform. Our results show the superiority of those heuristics which fully take into account the relative capacity of the communication links.

Keywords: Scheduling, Master-slave platforms, Heterogeneous computing, On-line, Release dates.

Résumé (translated from the French)

We are interested here in the problem of scheduling a set of independent tasks on a heterogeneous master-slave platform. We consider the on-line (on-the-fly) and off-line problems, for different objective functions (total execution time, maximum response time, mean response time). From a theoretical point of view, we obtain two kinds of results: (i) for the off-line problem, we established several optimality results for problems with arrival dates; (ii) for the on-line problem, we established lower bounds on the competitive factor of deterministic algorithms. From a practical point of view, we implemented several heuristics, some classical, others stemming from the present work. We studied these heuristics experimentally on a small, fully heterogeneous MPI platform. The experimental results show the superiority of the heuristics that fully take into account the relative capacities of the different communication links.

Keywords: On-line scheduling, Off-line scheduling, Heterogeneous computing, Master-slave platform.

1 Introduction

In this paper, we deal with the problem of scheduling independent tasks on a heterogeneous master-slave platform. We assume that this platform is operated under the one-port model, where the master can communicate with a single slave at any time-step. This model is much more realistic than the standard model from the literature, where the number of simultaneous messages involving a processor is not bounded. However, very few complexity results are known for this model (see Section 7 for a short survey). The major objective of this paper is to assess the difficulty of off-line and on-line scheduling problems under the one-port model.

We deal with problems where all tasks have the same size. Otherwise, even the simple problem of scheduling with two identical slaves, without paying any cost for the communications from the master, is NP-hard [12]. Assume that the platform is composed of a master and $m$ slaves $P_1, P_2, \ldots, P_m$. Let $c_j$ be the time needed by the master to send a task to $P_j$, and let $p_j$ be the time needed by $P_j$ to execute a task. Our main results are the following:

- When the platform is fully homogeneous ($c_j = c$ and $p_j = p$ for all $j$), we design an algorithm which is optimal for the on-line problem and for three different objective functions (makespan, maximum response time, total completion time).
- When the communications are homogeneous ($c_j = c$ for all $j$, but different values of $p_j$), we design an optimal makespan minimization algorithm for the off-line problem with release dates. This algorithm generalizes, and provides a new proof of, a result of Simons [27].
- When the computations are homogeneous ($p_j = p$ for all $j$, but different values of $c_j$), we failed to derive an optimal makespan minimization algorithm for the off-line problem with release dates, but we provide an efficient heuristic for this problem.
- For these last two scenarios (homogeneous communications and homogeneous computations), we show that there does not exist any optimal on-line algorithm. This holds true for the previous three objective functions (makespan, maximum response time, total completion time). We even establish lower bounds on the competitive ratio of any deterministic algorithm.

The main contributions of this paper are mostly theoretical. However, on the practical side, we have implemented several heuristics, some classical and some new ones derived in this paper, on a small but fully heterogeneous MPI platform. Our (preliminary) results show the superiority of those heuristics which fully take into account the relative capacity of the communication links.

The rest of the paper is organized as follows. In Section 2, we state some notations for the scheduling problems under consideration. Section 3 deals with fully homogeneous platforms. We study communication-homogeneous platforms in Section 4, and computation-homogeneous platforms in Section 5. We provide an experimental comparison of several scheduling heuristics in Section 6. Section 7 is devoted to an overview of related work. Finally, we state some concluding remarks in Section 8.

2 Framework

To be consistent with the literature [16, 9], we use the notation $\alpha \mid \beta \mid \gamma$ where:

- $\alpha$: the platform. As in the standard, we use $P$ for platforms with identical processors, and $Q$ for platforms with different-speed processors. (As we only target sets of same-size tasks, we always fall under the uniform processors framework. In other words, the execution time of a task on a processor depends only on the processor running it and not on the task.) We add $MS$ to this field to indicate that we work with master-slave platforms.
- $\beta$: the constraints. We write $online$ for on-line problems, and $r_i$ when there are release dates. We write $c_j = c$ for communication-homogeneous platforms, and $p_j = p$ for computation-homogeneous platforms.

- $\gamma$: the objective. We let $C_i$ denote the end of the execution of task $i$. We deal with three objective functions:
  - the makespan (total execution time) $\max C_i$;
  - the maximum response time (or maximum flow) $\max (C_i - r_i)$: indeed, $C_i - r_i$ is the time spent by task $i$ in the system;
  - the total completion time $\sum C_i$, which is equivalent to the sum of the response times $\sum (C_i - r_i)$.

3 Fully homogeneous platforms

For fully homogeneous platforms, we are able to prove the optimality of the Round-Robin algorithm, which processes the tasks in the order of their arrival and assigns them in a cyclic fashion to processors:

Theorem 1. The Round-Robin algorithm is optimal for the problem $P, MS \mid online, r_i, p_j = p, c_j = c \mid \sum (C_i - r_i)$, as well as for the minimization of the makespan and of the maximum response time.

We point out that the complexity of the Round-Robin algorithm is linear in the number of tasks and does not depend upon the platform size.

Proof. To prove that the greedy algorithm Round-Robin is optimal for our problem, we show that there is an optimal schedule under which the execution of each task starts at exactly the same date as under Round-Robin. To prove this, we first show two results stating that we can focus on certain particular optimal schedules.

1. There is an optimal schedule such that the master sends the tasks to the slaves in the order of their arrival.

We prove this result with permutation arguments. Let $S$ be an optimal schedule not verifying the desired property. Remember that the master uses its communication links in a sequential fashion. We denote by $r'_i$ the date at which task $i$ arrives on a slave. By hypothesis on $S$, there are two tasks, $i$ and $j$, such that $i$ arrives on the master before $j$, but is sent to a slave processor after $j$. So $r_i < r_j$ and $r'_j < r'_i$. We then define from $S$ a new schedule $S'$ as follows:

- If task $i$ was nevertheless treated earlier than task $j$ (i.e., if $C_i \le C_j$), then we simply reverse the dispatch dates of tasks $i$ and $j$, but do not change the processors where they are computed. This is illustrated in Figure 1. In this case, the remainder of the schedule is left unaffected, and the total flow remains the same (just as the makespan and the maximum flow).
- If task $i$ was processed later than task $j$, i.e., if $C_i > C_j$, then we send task $i$ to the processor that was receiving $j$ under $S$, at the time task $j$ was sent to that processor, and conversely. This is illustrated in Figure 2. Since tasks $i$ and $j$ have the same size, the use of the processors will be the same, and the remainder of the schedule will remain unchanged. One obtains a new schedule $S'$, having as total flow:

$$\sum_{k=1,\; k \ne i,j}^{n} (C_k - r_k) + (C_i - r_j) + (C_j - r_i) = \sum_{k=1}^{n} (C_k - r_k) \qquad (1)$$

Therefore, this is also an optimal schedule. In the same way, the makespan as well as the maximum flow are unchanged.

[Figure 1: Permutation on the optimal schedule $S$ (case $C_i \le C_j$). (a) Before permutation; (b) After permutation.]

[Figure 2: Permutation on the optimal schedule $S$ (case $C_i > C_j$). (a) Before permutation; (b) After permutation.]

By iterating this process, we obtain an optimal schedule where the master sends the tasks according to their arrival dates, i.e., by increasing $r_i$'s. Indeed, if one considers the set of the couples $\{(i, j) \mid r_i < r_j \text{ and } r'_i < r'_j\}$, we notice that each iteration of the process strictly increases the size of this set.

2. There is an optimal schedule such that the master sends the tasks to the slaves in the order of their arrival, and such that the tasks are executed in the order of their arrival.

We will permute tasks to build an optimal schedule satisfying this property from a schedule satisfying the property stated in point 1. Let $S$ be an optimal schedule in which tasks are sent by the master in the order of their arrival. From the above study, we know that such a schedule exists. Let us suppose that $S$ does not satisfy the desired property. Then, there are two tasks $i$ and $j$ such that $r_i \le r_j$, $r'_i < r'_j$, and $C_i > C_j$. We define a new schedule $S'$ by just exchanging the processors to which tasks $i$ and $j$ were sent. Then, task $i$ is computed under $S'$ at the time when $j$ was computed under $S$, and conversely. This way, we obtain the same total flow ($(C_i - r_i) + (C_j - r_j) = (C_j - r_i) + (C_i - r_j)$) and the same makespan (since the working times of the processors remain unchanged), whereas the maximum flow can decrease.

Among the optimal schedules which respect the property stated in point 2, we now look at the subset of the solutions computing the first task as soon as possible. Then, among this subset, we look at the solutions computing the second task as soon as possible. And so on. This way, we define from the set of all optimal schedules an optimal solution, denoted ASAP, which processes the tasks in the order of their arrival, and which processes each of them as soon as possible. We will now compare ASAP with the schedule Round-Robin, formally defined as follows: under Round-Robin, task $i$ is sent to processor $i \bmod m$ as soon as possible, while respecting the order of arrival of the tasks.

3. The computation of any task starts at the same time under the schedules ASAP and Round-Robin.

The demonstration is done by induction on the number of tasks. Round-Robin sends the first task as soon as possible, just as ASAP does. Let us suppose now that the first $j$ tasks satisfy the property, and let us look at the behavior of Round-Robin on the arrival of the $(j+1)$-th task. The computation of the $(j+1)$-th task starts at time:

$$RR(j+1) = \max\{\, r'_{j+1},\; RR(j+1-m) + p \,\}.$$

Indeed, either the processor is available at the time the task arrives on the slave, and the task execution starts as soon as the task arrives, i.e., at time $r'_{j+1}$, or the processor is busy when the task arrives. In the latter case, the processor will be available when the last task it previously received (i.e., the $(j+1-m)$-th task according to the Round-Robin strategy) is completed, at time $RR(j+1-m) + p$.

Therefore, if $RR(j+1) = r'_{j+1}$, Round-Robin remains optimal, since the task is processed as soon as it is available on a slave, and since it was sent as soon as possible. Otherwise, $RR(j+1) = RR(j+1-m) + p$. But, by induction hypothesis, we know that $\forall \lambda,\ 1 \le \lambda \le m,\ RR(j+1-\lambda) = ASAP(j+1-\lambda)$. Furthermore, thanks to the Round-Robin scheduling policy, we know that $\forall j,\ RR(j) \le RR(j+1)$. Therefore:

$$\forall \lambda,\ 1 \le \lambda \le m, \quad RR(j+1-m) \le RR(j+1-\lambda) < RR(j+1-m) + p = RR(j+1).$$

This implies that, between $RR(j+1-m)$ and $RR(j)$, $m$ tasks of size $p$ were started under Round-Robin, and also under ASAP because of the induction hypothesis. Therefore, during that time interval, $m$ slaves were selected. Then, until the date $RR(j+1-m) + p$, all the slaves are used and, thus, task $j+1$ is launched as soon as possible by Round-Robin, knowing that ASAP could not have launched it earlier. Therefore, $ASAP(j+1) = RR(j+1)$.

We can conclude. We have already stated that the demonstrations of points 1 and 2 are valid for schedules minimizing either makespan, total flow, or maximum flow. The reasoning followed in the demonstration of point 3 is independent of the objective function. Therefore, we demonstrated the optimality of Round-Robin for these three objective functions.

4 Communication-homogeneous platforms

In this section, we have $c_j = c$ but different-speed processors. We order them so that $P_1$ is the fastest processor ($p_1$ is the smallest computing time $p_j$), while $P_m$ is the slowest processor.

4.1 On-line scheduling

Theorem 2. There is no scheduling algorithm for the problem $P, MS \mid online, r_i, p_j, c_j = c \mid \max C_i$ with a competitive ratio less than $\frac{5+3\sqrt{5}}{10}$.

Proof. Suppose the existence of an on-line algorithm $A$ with a competitive ratio $\rho = \frac{5+3\sqrt{5}}{10} - \epsilon$, with $\epsilon > 0$. We will build a platform and study the behavior of $A$ opposed to our adversary. The platform consists of two processors, where $p_1 = 2$, $p_2 = \frac{1+3\sqrt{5}}{2}$, and $c = 1$.
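The makespans that the adversary argument relies on can be recomputed directly for this platform. The following is a minimal arithmetic sketch, not part of the original report; all variable names are illustrative, and the scenarios (which task goes to which processor, and when it is sent) are our reading of the case analysis in the proof.

```python
import math

SQRT5 = math.sqrt(5.0)

# Platform of the proof of Theorem 2 (one-port master, two slaves).
c = 1.0                        # time to send one task to either slave
p1 = 2.0                       # compute time on P1
p2 = (1 + 3 * SQRT5) / 2       # compute time on P2
bound = (5 + 3 * SQRT5) / 10   # claimed lower bound on the competitive ratio

# One task released at time 0 and sent immediately.
one_on_p1 = c + p1             # 3
one_on_p2 = c + p2             # (3 + 3*sqrt(5)) / 2
one_delayed = 1 + c + p1       # sending only starts at time 1: 4

# Two tasks (released at 0 and 1), the first one on P1.
second_on_p2 = 2 * c + p2      # second task sent during [1, 2], run on P2
both_on_p1 = c + 2 * p1        # second task queues behind the first: 5

# Three tasks (released at 0, 1, 2), the first two on P1.
third_on_p1 = c + 3 * p1       # third task queues on P1: 7
third_on_p2 = 3 * c + p2       # third task sent during [2, 3], run on P2
best_after_two_on_p1 = min(third_on_p1, third_on_p2)
# Tasks on P1, P2, P1: the value the proof quotes as optimal.
three_reference = max(2 * c + p2, c + 2 * p1)

print(one_delayed / one_on_p1, ">", bound)
print(second_on_p2 / both_on_p1, best_after_two_on_p1 / three_reference)
```

Both ratios printed on the last line coincide with the bound, which is why the two branches of the argument yield the same competitive ratio.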

Initially, the adversary sends a single task $i$ at time 0. $A$ sends the task $i$ either on $P_1$, achieving a makespan at least equal to 3, or on $P_2$, with a makespan at least equal to $\frac{3+3\sqrt{5}}{2}$. At time-step 1, we check if $A$ made a decision concerning the scheduling of $i$, and the adversary reacts consequently:

1. If $A$ did not begin the sending of the task $i$, the adversary does not send other tasks. The best makespan is then 4. As the optimal makespan is 3, we have a competitive ratio of $\frac{4}{3} > \frac{5+3\sqrt{5}}{10}$. This refutes the assumption on $\rho$. Thus the algorithm $A$ must have scheduled the task $i$ at time 1.
2. If $A$ scheduled the task $i$ on $P_2$, the adversary does not send other tasks. The best possible makespan is then equal to $\frac{3+3\sqrt{5}}{2}$, which is even worse than the previous case. Consequently, algorithm $A$ does not have another choice than to schedule the task $i$ on $P_1$.

At time-step 1, the adversary sends another task, $j$. In this case, we look, at time-step 2, at the assignment $A$ made for $j$:

1. If $j$ is sent on $P_2$, the adversary does not send any more tasks. The best achievable makespan is then $\frac{5+3\sqrt{5}}{2}$, whereas the optimal is 5. The competitive ratio is then $\frac{5+3\sqrt{5}}{10} > \rho$.
2. If $j$ is sent on $P_1$, the adversary sends a last task $k$ at time-step 2. The best possible makespan is then $\frac{7+3\sqrt{5}}{2}$, whereas the optimal is $\frac{5+3\sqrt{5}}{2}$. The competitive ratio is still $\frac{5+3\sqrt{5}}{10}$, higher than $\rho$.

Remark 1. Similarly, we can show that there is no on-line scheduling for the problem $P, MS \mid online, r_i, p_j, c_j = c \mid \sum C_i$ whose competitive ratio $\rho$ is strictly lower than $\frac{2+4\sqrt{2}}{7}$, and that there is no on-line scheduling for the problem $P, MS \mid online, r_i, p_j, c_j = c \mid \max (C_i - r_i)$ whose competitive ratio $\rho$ is strictly lower than $\frac{7}{6}$.

4.2 Off-line scheduling

In this section, we aim at designing an optimal algorithm for the off-line version of the problem, with release dates. We target the objective $\max C_i$. Intuitively, to minimize the completion date of the task arriving last, it is necessary to allocate this task to the fastest processor (which will finish it the most rapidly). However, the other tasks should also be assigned so that this fastest processor will be available as soon as possible for the task arriving last. We define the greedy algorithm SLJF (Scheduling Last Jobs First) as follows:

- Initialization: Take the last task which arrives in the system and allocate it to the fastest processor (Figure 3(a)).
- Scheduling backwards: Among the not-yet-allocated tasks, select the one which arrived latest in the system. Assign it, without taking its arrival date into account, to the processor which will begin its execution at the latest, but without exceeding the completion date of the previously scheduled task (Figure 3(b)).
- Memorization: Once all tasks are allocated, record the assignment of the tasks to the processors (Figure 3(c)).
- Assignment: The master sends the tasks according to their arrival dates, as soon as possible, to the processors which they have been assigned to in the previous step (Figure 3(d)).

Theorem 3. SLJF is an optimal algorithm for the problem $Q, MS \mid r_i, p_j, c_j = c \mid \max C_i$.
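The four phases of SLJF can be sketched in a few lines of Python. This is an illustrative reading rather than the authors' implementation: the function names are ours, the backward phase is written as a plain greedy load balancing (each task goes to the processor on which it would begin latest when stacking tasks backwards from a common end date), ties are broken in favor of the slower processor (an assumption), and the additional condition on the completion date of the previously scheduled task is not enforced separately.

```python
def sljf_assignment(n_tasks, p):
    """Backward greedy allocation (initialization, scheduling backwards,
    memorization). Returns, for the tasks in arrival order, the index of
    the processor each one is assigned to. p[j] is the compute time of Pj."""
    loads = [0.0] * len(p)      # work already stacked backwards on each Pj
    backward = []               # processors chosen, last task first
    for _ in range(n_tasks):
        # Latest possible start when stacking backwards = smallest load + p[j].
        # Tie-break (assumption): prefer the slower processor.
        j = min(range(len(p)), key=lambda q: (loads[q] + p[q], -p[q]))
        loads[j] += p[j]
        backward.append(j)
    return backward[::-1]


def forward_makespan(assignment, releases, p, c):
    """Assignment phase under the one-port model: the master sends tasks in
    arrival order, one at a time, each transfer taking c; every slave runs
    the tasks it receives in FIFO order."""
    link_free = 0.0
    proc_free = [0.0] * len(p)
    makespan = 0.0
    for j, release in zip(assignment, sorted(releases)):
        arrival = max(release, link_free) + c
        link_free = arrival
        start = max(arrival, proc_free[j])
        proc_free[j] = start + p[j]
        makespan = max(makespan, proc_free[j])
    return makespan


# Four tasks on three processors with p = 2, 3, 4.
plan = sljf_assignment(4, [2, 3, 4])
print(plan, forward_makespan(plan, [0, 0, 0, 0], [2, 3, 4], c=0))
```

With four tasks and compute times 2, 3, and 4, the sketch assigns the tasks, in arrival order, to $P_1$, $P_3$, $P_2$, $P_1$; without release dates or communication costs the resulting makespan is 4.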

[Figure 3: Different steps of the SLJF algorithm, with four tasks $i$, $j$, $k$, and $l$, on three processors with $p = 2$, $3$, and $4$. (a) Initialization; (b) Scheduling backwards; (c) Memorization; (d) Assignment.]

Proof. The first three phases of the SLJF algorithm are independent of the release dates, and only depend on the number of tasks which will arrive in the system. The proof proceeds in three steps. First we study the problem without communication costs or release dates. Next, we take release dates into account. Finally, we extend the result to the case with communications. The second step is the most difficult.

For the first step, we have to minimize the makespan during the scheduling of identical tasks with heterogeneous processors, without release dates. Without communication costs, this is a well-known load balancing problem, which can be solved by a greedy algorithm [6]. The scheduling backwards phase of SLJF solves this load balancing problem optimally. Since the problem is without release dates, the memorization phase does not increase the makespan, which thus remains optimal.

Next we add the constraints of release dates. To show that SLJF is optimal, we proceed by induction on the number of tasks. For a single task, it is obvious that the addition of a release date does not change anything about the optimality of the solution. Let us suppose the algorithm optimal for $n$ tasks, or less. Then look at the behavior of the algorithm to process $n + 1$ tasks. If the addition of the release dates does not increase the makespan compared to that obtained during the memorization step, then an optimal scheduling is obtained. If not, let us look once again at the problem starting from the end. Compare the completion times of the tasks in the scheduling of the memorization phase (denoted as $(C_n - C_i)_{memo}$), and in the assignment phase (denoted as $(C_n - C_i)_{final}$). If both makespans are equal, we are finished. Otherwise, there are tasks such that $(C_n - C_i)_{memo} < (C_n - C_i)_{final}$. Let $i$ be the last task satisfying this property. In this case, the scheduling of the $(n - 1)$ last tasks corresponds to SLJF in the case of $(n - 1)$ tasks, when the first task arrives at time $r_{i+1}$ (see Figure 4). And since $i$ is the last task satisfying the above property, we are sure that the processors are free at the expected times. Using the induction hypothesis, the scheduling is thus optimal from $r_{i+1}$, and task $i + 1$ cannot begin its computation earlier. The whole scheduling is thus optimal. Finally, SLJF is optimal to minimize the makespan in the presence of release dates.

[Figure 4: Detailing the last two phases of the SLJF algorithm, on three processors with $p = 2$, $3$, and $4$. (a) Scheduling backwards (makespan 4); (b) Assignment.]

Taking communications into account is now easy. Under the one-port model, with a uniform communication time for all tasks and processors, the optimal policy of the master consists in sending the tasks as soon as they arrive. Now, we can consider the dates at which the tasks are available on the slaves, and consider them as release dates for a problem without communications.

Remark 2. It should be stressed that, by setting $c = 0$, our approach provides a new proof of the result of Barbara Simons [27].

5 Computation-homogeneous platforms

In this section, we have $p_j = p$ but processor links with different capacities. We order them so that $P_1$ is the fastest communicating processor ($c_1$ is the smallest communication time $c_j$).

5.1 On-line scheduling

Just as in Section 4, we can bound the competitive ratio of any deterministic algorithm:

Theorem 4. There is no scheduling algorithm for the problem $P, MS \mid online, r_i, p_j = p, c_j \mid \max C_i$ whose competitive ratio $\rho$ is strictly lower than $\frac{6}{5}$.

Proof. Assume that there exists a deterministic on-line algorithm $A$ whose competitive ratio is $\rho = \frac{6}{5} - \epsilon$, with $\epsilon > 0$. We will build a platform and an adversary to derive a contradiction. The platform is made up of two processors $P_1$ and $P_2$ such that $p_1 = p_2 = p = \max\{5, \frac{12}{25\epsilon}\}$, $c_1 = 1$ and $c_2 = \frac{p}{2}$.

Initially, the adversary sends a single task $i$ at time 0. $A$ executes the task $i$, either on $P_1$ with a makespan at least equal to $1 + p$ (at least, because nothing forces $A$ to send the task as soon as possible), or on $P_2$ with a makespan at least equal to $\frac{3p}{2}$. At time-step $\frac{p}{2}$, we check whether $A$ made a decision concerning the scheduling of $i$, and which one:

1. If $A$ scheduled the task $i$ on $P_2$, the adversary does not send other tasks. The best possible makespan is then $\frac{3p}{2}$. The optimal scheduling being of makespan $1 + p$, we have a competitive ratio of
$$\rho \ge \frac{3p/2}{1 + p} = \frac{3}{2} - \frac{3}{2(p + 1)} > \frac{6}{5}$$
because $p \ge 5$ by assumption. This contradicts the hypothesis on $\rho$. Thus the algorithm $A$ cannot schedule task $i$ on $P_2$.
2. If $A$ did not begin to send the task $i$, the adversary does not send other tasks. The best makespan that can be achieved is then equal to $\frac{p}{2} + (1 + p) = 1 + \frac{3p}{2}$, which is even worse than the previous case. Consequently, the algorithm $A$ does not have any other choice than to schedule task $i$ on $P_1$.

At time-step $\frac{p}{2}$, the adversary sends three tasks, $j$, $k$, and $l$. No schedule which executes three of the four tasks on the same processor can have a makespan lower than $1 + 3p$ (minimum duration of a communication and execution without delay of the three tasks). We now consider the schedules which compute two tasks on each processor. Since $i$ is computed on $P_1$, we have three cases to study, depending upon which other task ($j$, $k$, or $l$) is computed on $P_1$:

1. If $j$ is computed on $P_1$:
   (a) Task $i$ is sent to $P_1$ during the interval $[0, 1]$ and is computed during the interval $[1, 1 + p]$.
   (b) Task $j$ is sent to $P_1$ during the interval $[\frac{p}{2}, 1 + \frac{p}{2}]$ and is computed during the interval $[1 + p, 1 + 2p]$.
   (c) Task $k$ is sent to $P_2$ during the interval $[1 + \frac{p}{2}, 1 + p]$ and is computed during the interval $[1 + p, 1 + 2p]$.
   (d) Task $l$ is sent to $P_2$ during the interval $[1 + p, 1 + \frac{3p}{2}]$ and is computed during the interval $[1 + 2p, 1 + 3p]$.
   The makespan of this schedule is then $1 + 3p$.
2. If $k$ is computed on $P_1$:
   (a) Task $i$ is sent to $P_1$ during the interval $[0, 1]$ and is computed during the interval $[1, 1 + p]$.
   (b) Task $j$ is sent to $P_2$ during the interval $[\frac{p}{2}, p]$ and is computed during the interval $[p, 2p]$.
   (c) Task $k$ is sent to $P_1$ during the interval $[p, 1 + p]$ and is computed during the interval $[1 + p, 1 + 2p]$.
   (d) Task $l$ is sent to $P_2$ during the interval $[1 + p, 1 + \frac{3p}{2}]$ and is computed during the interval $[2p, 3p]$.
   The makespan of this schedule is then $3p$.
3. If $l$ is computed on $P_1$:
   (a) Task $i$ is sent to $P_1$ during the interval $[0, 1]$ and is computed during the interval $[1, 1 + p]$.
   (b) Task $j$ is sent to $P_2$ during the interval $[\frac{p}{2}, p]$ and is computed during the interval $[p, 2p]$.
   (c) Task $k$ is sent to $P_2$ during the interval $[p, \frac{3p}{2}]$ and is computed during the interval $[2p, 3p]$.
   (d) Task $l$ is sent to $P_1$ during the interval $[\frac{3p}{2}, 1 + \frac{3p}{2}]$ and is computed during the interval $[1 + \frac{3p}{2}, 1 + \frac{5p}{2}]$.
   The makespan of this schedule is then $3p$.
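The three schedules enumerated above can be replayed with a small one-port simulator. The sketch below is illustrative only: the function name and the choice $p = 6$ (any value with $p \ge 5$ would do) are ours, and the simulator simply sends each task as early as the link and its release date allow, in the order given.

```python
def one_port_makespan(tasks, c, p):
    """tasks: list of (release_date, slave_index), in the order in which the
    master sends them. c[j] is the send time to slave j; p is the common
    compute time. The master performs one transfer at a time (one-port)."""
    link_free = 0.0
    slave_free = [0.0] * len(c)
    makespan = 0.0
    for release, j in tasks:
        send_start = max(release, link_free)
        arrival = send_start + c[j]
        link_free = arrival
        start = max(arrival, slave_free[j])   # wait if the slave is busy
        slave_free[j] = start + p
        makespan = max(makespan, slave_free[j])
    return makespan


p = 6.0                 # illustrative value satisfying p >= 5
c = [1.0, p / 2]        # c1 = 1, c2 = p/2
half = p / 2            # release date of the three later tasks

# Slave index 0 is P1, index 1 is P2; tasks are listed as i, j, k, l.
case1 = one_port_makespan([(0, 0), (half, 0), (half, 1), (half, 1)], c, p)
case2 = one_port_makespan([(0, 0), (half, 1), (half, 0), (half, 1)], c, p)
case3 = one_port_makespan([(0, 0), (half, 1), (half, 1), (half, 0)], c, p)
print(case1, case2, case3)
```

For this value of $p$, the simulated makespans agree with the closed forms of the three cases: $1 + 3p$, $3p$, and $3p$.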

Off-ne and on-ne schedung on heterogeneous master-save patforms 9 Consequenty, the ast two schedues are equvaent and are better than the frst. Atogether, the best achevabe maespan s 3p. But a better schedue s obtaned when computng on P 2, then on P 1, then on P 2, and fnay on P 1. The maespan of the atter schedue s equa to 1 + 5p 2. The compettve rato of agorthm A s necessary arger than the rato of the best reachabe maespan (namey 3p) and the optma maespan, whch s not arger than 1 + 5p 2. Consequenty: ρ 3p 1 + 5p 2 = 6 5 6 5(5p + 2) > 6 5 6 25p 6 5 ɛ 2 whch contradcts the assumpton ρ = 6 5 ɛ wth ɛ > 0. 5.2 Off-ne schedung In the easy case where p c p, and wthout reease dates, Round-Robn s optma for maespan mnmzaton. But n the genera case, not a saves w be enroed n the computaton. Intutvey, the dea s to use the fastest m ns, where m s computed so that the tme p to execute a tas es between the tme necessary to send a tas on each of the fastest m 1 ns and the tme necessary to send a tas on each of the fastest m ns. Formay, m 1 c < p and c p. m Wth ony m ns seected n the patform, we am at dervng an agorthm smar to Round- Robn. But we dd not succeed n provng the optmaty of our approach. Hence the agorthm beow shoud rather be seen as a heurstc. The dffcuty es n decdng when to use the m -th processor. In addton to be the one havng the sowest communcaton n, ts use can cause a moment of nactvty on another processor, snce m 1 c + c m p. Our greedy agorthm w smpy compare the performances of two strateges, the one sendng tass ony on the m 1 frst processors, and the other usng the m -th processor at the best possbe moment. Let RRA be the agorthm sendng the tass to the m 1 fastest processors n a cycc way, startng wth the fastest processor, and schedung the tass n the reverse order, from the ast one to the frst one. Let RRB be the agorthm sendng the ast tas to processor m, then foowng the RRA pocy. 
We see that RRA seeks to continuously use the processors, even though idle time may occur on the communication link, and on the processor $P_{m'}$. On the contrary, RRB tries to continuously use the communication link, despite leaving some processors idle. The global behavior of the greedy algorithm, SLJFWC (Scheduling the Last Job First With Communication), is as follows:

Initialization: Allocate the $m'-1$ last tasks to the fastest $m'-1$ processors, from the fastest to the slowest.

Comparison: Compare the schedules RRA and RRB. If there are not enough tasks to enforce the following stop and save condition, then keep the fastest policy (see Figure 5).

Stop and save: After $k(m'-1) + 1$ allocated tasks ($k \ge 2$), if (see Figure 6)

$$\begin{cases} \sum_{i=1}^{m'-1} c_i + c_{m'} > p \\ (k+1) \sum_{i=1}^{m'-1} c_i + c_{m'} \le (k+1)\,p \end{cases}$$

then keep the task assignment of RRB for the last $k(m'-1) + 1$ tasks, and start again the comparison phase for the remaining tasks. If not, proceed with the comparison step.

End: When the last task is treated, keep the fastest policy.
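As a rough illustration of the stop and save test, the following Python sketch looks for the first round count $k \ge 2$ at which the test succeeds. It rests on an assumed reading of the condition, namely that $S + c_{m'} > p$ and $(k+1)S + c_{m'} \le (k+1)p$, where $S$ is the time to send one task to each of the $m'-1$ fastest slaves. The function name, the `max_rounds` cut-off, and the choice to return `None` when no such $k$ is found are all hypothetical.

```python
def stop_and_save_round(c, p, m_prime, max_rounds=1000):
    """First k >= 2 satisfying the (assumed) stop and save condition, else None.

    c       -- communication costs, sorted in increasing order
    p       -- computation time of one task (identical on every slave)
    m_prime -- number of selected links, as defined in the text
    """
    fast = sum(c[:m_prime - 1])   # S: one send to each of the m'-1 fastest slaves
    slow = c[m_prime - 1]         # c_{m'}: one send to the m'-th slave
    if fast + slow <= p:
        # First inequality fails: the m'-th send never delays anything.
        return None
    for k in range(2, max_rounds + 1):
        # The advance accumulated over k + 1 rounds must absorb the slow send.
        if (k + 1) * fast + slow <= (k + 1) * p:
            return k
    return None


# Platform of Figure 5: costs 1, 2, 4, 5 and p = 8, hence m' = 4.
k = stop_and_save_round([1, 2, 4, 5], 8, 4)
print(k, k * (4 - 1) + 1)
```

On that platform each round of three sends gains one time unit over the computation, so the sketch returns $k = 4$, that is after $k(m'-1) + 1 = 13$ allocated tasks. When $S = p$ no advance accumulates and the function returns `None`.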

Figure 5: Algorithms RRA and RRB with 9 tasks, on a platform with $c_1 = 1$, $c_2 = 2$, $c_3 = 4$, $c_4 = 5$ and $p = 8$. (a) Algorithm RRA. (b) Algorithm RRB.

The intuition behind this algorithm is simple. We know that if we only have the $m'-1$ fastest processors, then RRA is optimal to minimize the makespan. However, the time necessary for sending a task on each of the $m'-1$ processors is lower than $p$. This means that the sending of the tasks gains an advance over their execution. This advance, which accumulates for all the $m'-1$ tasks, can become sufficiently large to allow the sending of a task on another $m'$-th processor, for free, i.e. without delaying the treatment of the next tasks to come on the other processors.

Figure 6: The stop and save condition.

6 MPI experiments

6.1 The experimental platform

We build a small heterogeneous master-slave platform with five different computers, connected to each other by a fast Ethernet switch (100 Mbit/s). The five machines are all different, both in terms of CPU speed and in the amount of available memory. The heterogeneity of the communication links is mainly due to the differences between the network cards. Each task will be a matrix, and each slave will have to calculate the determinant of the matrices that it will receive. Whenever needed, we play with matrix sizes so as to achieve more heterogeneity in the CPU speeds or communication bandwidths. Below we report experiments for the following configuration (in an arbitrary unit):

$c_1 = 0.011423$ and $p_1 = 0.052190$
$c_2 = 0.012052$ and $p_2 = 0.019685$
$c_3 = 0.016808$ and $p_3 = 0.101777$
$c_4 = 0.043482$ and $p_4 = 0.288397$

6.2 Results

Figure 7 shows the makespan obtained with classical scheduling algorithms, such as SRPT (Shortest Remaining Processing Time), List Scheduling, and several variants of Round-Robin, as well as with SLJF and SLJFWC. In this experiment, all the tasks to be scheduled arrived at time 0 (off-line framework without release dates). Each point on the figure, representing the makespan of a schedule, is in fact an average over several runs of the experiment. We see that SLJFWC obtains good results. SLJF remains competitive, even though it was not designed for a platform with different communication links.

Figure 8 also represents the average makespan of various algorithms, but on a different platform. This time, the parameters were modified by software in order to render the processors homogeneous. In this case, SLJFWC is still better, and SLJF obtains poor performances. Finally, Figure 9 represents the average makespan in the presence of release dates. Again, SLJFWC performs well, even though it was not designed for problems with release dates.

7 Related work

We classify several related papers along the following main lines:

Models for heterogeneous platforms. In the literature, one-port models come in two variants. In the unidirectional variant, a processor cannot be involved in more than one communication at a given time-step, either a send or a receive. In the bidirectional model, a processor can send and receive in parallel, but at most to a given neighbor in each direction. In both variants, if $P_u$ sends a message to $P_v$, both $P_u$ and $P_v$ are blocked throughout the communication. The bidirectional one-port model is used by Bhat et al. [7, 8] for fixed-size messages.
They advocate its use because current hardware and software do not easily enable multiple messages to be transmitted simultaneously. Even if non-blocking multi-threaded communication libraries allow for initiating multiple send and receive operations, they claim that all these operations are eventually serialized by the single hardware port to the network. Experimental evidence of this fact has recently been reported by Saif and Parashar [24], who report that asynchronous MPI sends get serialized as soon as message sizes exceed a few megabytes. Their results hold for two popular MPI implementations, MPICH on Linux clusters and IBM MPI on the SP2.

Figure 7: Comparing the makespan of several algorithms (makespan versus number of tasks, from 10 to 20, for SLJFWC, SLJF, SRPT, List Scheduling, and three Round-Robin variants).

The one-port model fully accounts for the heterogeneity of the platform, as each link has a different bandwidth. It generalizes a simpler model studied by Banikazemi et al. [1], Liu [19], and Khuller and Kim [15]. In this simpler model, the communication time only depends on the sender, not on the receiver: in other words, the communication speed from a processor to all its neighbors is the same. Finally, we note that some papers [2, 3] depart from the one-port model as they allow a sending processor to initiate another communication while a previous one is still on-going on the network. However, such models insist that there is an overhead time to pay before being engaged in another operation, so they do not allow for fully simultaneous communications.

Task graph scheduling. Task graph scheduling is usually studied using the so-called macro-dataflow model [20, 26, 10, 11], whose major flaw is that communication resources are not limited. In this model, a processor can send (or receive) any number of messages in parallel, hence an unlimited number of communication ports is assumed (this explains the name macro-dataflow for the model). Also, the number of messages that can simultaneously circulate between processors is not bounded, hence an unlimited number of communications can simultaneously occur on a given link. In other words, the communication network is assumed to be contention-free, which of course is not realistic as soon as the processor number exceeds a few units. More recent papers [29, 21, 23, 4, 5, 28] take communication resources into account.

Hollermann et al. [13] and Hsu et al. [14] introduce the following model for task graph scheduling: each processor can either send or receive a message at a given time-step (bidirectional communication is not possible); also, there is a fixed latency between the initiation of the communication by the sender and the beginning of the reception by the receiver. Still, the model is rather close to the one-port model discussed in this paper.

Figure 8: Makespan on a platform with homogeneous slaves (makespan versus number of tasks, from 10 to 20, for SLJFWC, SLJF, SRPT, List Scheduling, and three Round-Robin variants).

On-line scheduling. A good survey of on-line scheduling can be found in [25, 22]. Two papers focus on the problem of on-line scheduling for master-slave platforms. In [17], Leung and Zhao proposed several competitive algorithms minimizing the total completion time on a master-slave platform, with or without pre- and post-processing. In [18], the same authors studied the complexity of minimizing the makespan or the total response time, and proposed some heuristics. However, none of these works takes communication costs into consideration.

8 Conclusion

In this paper, we have dealt with the problem of scheduling independent, same-size tasks on master-slave platforms. We enforce the one-port model, and we study the impact of the communications on the design and analysis of the proposed algorithms. On the theoretical side, we have derived several new results, either for on-line scheduling, or for off-line scheduling with release dates. There are two important directions for future work. First, the bounds on the competitive ratio that we have established for on-line scheduling on communication-homogeneous and computation-homogeneous platforms are lower bounds: it would be very interesting to see whether these bounds can be met, and to design the corresponding approximation algorithms. Second, there remains to derive an optimal algorithm for off-line scheduling with release dates on computation-homogeneous platforms.

On the practical side, we have to widen the scope of the MPI experiments.
A detailed comparison of all the heuristics that we have implemented needs to be conducted on significantly larger platforms (with several tens of slaves). Such a comparison would, we believe, further demonstrate the superiority of those heuristics which fully take into account the relative capacity of the communication links.

Figure 9: Makespan with release dates (makespan versus number of tasks, from 10 to 20, for SLJFWC, SLJF, SRPT, List Scheduling, and three Round-Robin variants).

References

[1] M. Banikazemi, V. Moorthy, and D. K. Panda. Efficient collective communication on heterogeneous networks of workstations. In Proceedings of the 27th International Conference on Parallel Processing (ICPP'98). IEEE Computer Society Press, 1998.
[2] M. Banikazemi, J. Sampathkumar, S. Prabhu, D. K. Panda, and P. Sadayappan. Communication modeling of heterogeneous networks of workstations for performance characterization of collective operations. In HCW'99, the 8th Heterogeneous Computing Workshop, pages 125-133. IEEE Computer Society Press, 1999.
[3] Amotz Bar-Noy, Sudipto Guha, Joseph (Seffi) Naor, and Baruch Schieber. Message multicasting in heterogeneous networks. SIAM Journal on Computing, 30(2):347-358, 2000.
[4] Olivier Beaumont, Vincent Boudet, and Yves Robert. A realistic model and an efficient heuristic for scheduling with heterogeneous processors. In HCW'2002, the 11th Heterogeneous Computing Workshop. IEEE Computer Society Press, 2002.
[5] Olivier Beaumont, Arnaud Legrand, and Yves Robert. A polynomial-time algorithm for allocating independent tasks on heterogeneous fork-graphs. In ISCIS XVII, Seventeenth International Symposium On Computer and Information Sciences, pages 115-119. CRC Press, 2002.
[6] Olivier Beaumont, Arnaud Legrand, and Yves Robert. The master-slave paradigm with heterogeneous processors. IEEE Trans. Parallel Distributed Systems, 14(9):897-908, 2003.

[7] P. B. Bhat, C. S. Raghavendra, and V. K. Prasanna. Efficient collective communication in distributed heterogeneous systems. In ICDCS'99, 19th International Conference on Distributed Computing Systems, pages 15-24. IEEE Computer Society Press, 1999.
[8] P. B. Bhat, C. S. Raghavendra, and V. K. Prasanna. Efficient collective communication in distributed heterogeneous systems. Journal of Parallel and Distributed Computing, 63:251-263, 2003.
[9] J. Blazewicz, J. K. Lenstra, and A. H. Kan. Scheduling subject to resource constraints. Discrete Applied Mathematics, 5:11-23, 1983.
[10] P. Chrétienne, E. G. Coffman Jr., J. K. Lenstra, and Z. Liu, editors. Scheduling Theory and its Applications. John Wiley and Sons, 1995.
[11] H. El-Rewini, H. H. Ali, and T. G. Lewis. Task scheduling in multiprocessing systems. Computer, 28(12):27-37, 1995.
[12] M. R. Garey and D. S. Johnson. Computers and Intractability, a Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
[13] L. Hollermann, T. S. Hsu, D. R. Lopez, and K. Vertanen. Scheduling problems in a practical allocation model. J. Combinatorial Optimization, 1(2):129-149, 1997.
[14] T. S. Hsu, J. C. Lee, D. R. Lopez, and W. A. Royce. Task allocation on a network of processors. IEEE Trans. Computers, 49(12):1339-1353, 2000.
[15] S. Khuller and Y. A. Kim. On broadcasting in heterogenous networks. In Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, pages 1011-1020. Society for Industrial and Applied Mathematics, 2004.
[16] J. K. Lenstra, R. Graham, E. Lawler, and A. H. Kan. Optimization and approximation in deterministic sequencing and scheduling: a survey. Annals of Discrete Mathematics, 5:287-326, 1979.
[17] Joseph Y-T. Leung and Hairong Zhao. Minimizing total completion time in master-slave systems, 2004. Available at http://web.njit.edu/~hz2/papers/masterslave-ieee.pdf.
[18] Joseph Y-T. Leung and Hairong Zhao. Minimizing mean flowtime and makespan on master-slave systems. J. Parallel and Distributed Computing, 65(7):843-856, 2005.
[19] P. Liu. Broadcast scheduling optimization for heterogeneous cluster systems. Journal of Algorithms, 42(1):135-152, 2002.
[20] M. G. Norman and P. Thanisch. Models of machines and computation for mapping in multicomputers. ACM Computing Surveys, 25(3):103-117, 1993.
[21] J. M. Orduna, F. Silla, and J. Duato. A new task mapping technique for communication-aware scheduling strategies. In T. M. Pinkston, editor, Workshop for Scheduling and Resource Management for Cluster Computing (ICPP'01), pages 349-354. IEEE Computer Society Press, 2001.
[22] Kirk Pruhs, Jiri Sgall, and Eric Torng. On-line scheduling. In J. Leung, editor, Handbook of Scheduling: Algorithms, Models, and Performance Analysis, pages 15.1-15.43. CRC Press, 2004.
[23] C. Roig, A. Ripoll, M. A. Senar, F. Guirado, and E. Luque. Improving static scheduling using inter-task concurrency measures. In T. M. Pinkston, editor, Workshop for Scheduling and Resource Management for Cluster Computing (ICPP'01), pages 375-381. IEEE Computer Society Press, 2001.

[24] T. Saif and M. Parashar. Understanding the behavior and performance of non-blocking communications in MPI. In Proceedings of Euro-Par 2004: Parallel Processing, LNCS 3149, pages 173-182. Springer, 2004.
[25] J. Sgall. On line scheduling-a survey. In On-Line Algorithms, Lecture Notes in Computer Science 1442, pages 196-231. Springer-Verlag, Berlin, 1998.
[26] B. A. Shirazi, A. R. Hurson, and K. M. Kavi. Scheduling and load balancing in parallel and distributed systems. IEEE Computer Science Press, 1995.
[27] Barbara Simons. Multiprocessor scheduling of unit-time jobs with arbitrary release times and deadlines. SIAM Journal on Computing, 12(2):294-299, 1983.
[28] Oliver Sinnen and Leonel Sousa. Communication contention in task scheduling. IEEE Trans. Parallel Distributed Systems, 16(6):503-515, 2004.
[29] M. Tan, H. J. Siegel, J. K. Antonio, and Y. A. Li. Minimizing the application execution time through scheduling of subtasks and communication traffic in a heterogeneous computing system. IEEE Transactions on Parallel and Distributed Systems, 8(8):857-871, 1997.