develop a column generation technique for solving the linear programming problem.

C. Column Generation

In a column generation procedure, the linear program is first written in terms of a convex combination of an exponential number of columns. This is called the master problem (MP). The master problem is then solved by successive approximation: each approximation restricts the set of columns in the linear program, which gives the restricted master problem (RMP). The dual solution to the restricted master problem is used to verify whether the current solution is optimal. If it is not, a new column is generated and included in the restricted master problem. This process is repeated until optimality is reached. The practicality of the column generation procedure depends on whether optimality verification and new column generation can be done efficiently. In our case, both steps are very easy since they reduce to a simple sorting procedure.

We now give a more detailed view of the column generation procedure for the scheduling problem at hand. The master problem (MP) in terms of the extreme points of the polyhedra $P_p$ is the following:

\[ Z_{MP} = \min \sum_{u \in T} w_u C_u \tag{2} \]

subject to

\[ C_u \ge \sum_{k=1}^{n_{p(u)}!} \lambda^k_{p(u)} E^k_{p(u)}(u) \quad \forall u \tag{3a} \]

\[ \sum_{k=1}^{n_p!} \lambda^k_p = 1 \quad \forall p \tag{3b} \]

\[ C_v \ge C_u + t_v \quad \forall (u, v) \in L \tag{3c} \]

Instead of using all the extreme points of the polyhedron, the restricted master problem is formulated over a subset of extreme points. Let $Z(p)$ denote a subset of extreme points for processor $p$. Initially we generate one extreme point for each processor, as follows: for processor $p$, we randomly order the tasks in $J_p$ and compute the completion time of every task under that order. This vector of completion times is an extreme point of $P_p$ and is included in the restricted master problem. Therefore, initially $Z(p)$ contains only one element for each processor $p$. The restricted master problem is similar to the master problem except that it is formulated over the extreme points in $Z(p)$.
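The initialization described above (one random task order per processor, followed by cumulative completion times) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `initial_extreme_point`, the dictionary `proc_time` (processing time of each task) and the `rng` parameter are assumptions introduced here.

```python
import random
from itertools import accumulate


def initial_extreme_point(tasks, proc_time, rng=random):
    """Completion times for one random ordering of a processor's tasks.

    tasks     -- iterable of task ids assigned to the processor (J_p)
    proc_time -- dict mapping task id -> processing time (hypothetical name)
    rng       -- object with a shuffle() method, e.g. random.Random(seed)
    """
    order = list(tasks)
    rng.shuffle(order)  # random permutation of the tasks
    # With no idle time, each task finishes at the running sum of
    # processing times along the chosen order.
    finish = accumulate(proc_time[u] for u in order)
    return dict(zip(order, finish))
```

Calling this once per processor would give the single starting column in each $Z(p)$; passing a seeded `random.Random` makes the starting point reproducible.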
In full, the restricted master problem reads

\[ Z_{RMP} = \min \sum_{u \in T} w_u C_u \tag{4} \]

subject to

\[ C_u \ge \sum_{k \in Z(p(u))} \lambda^k_{p(u)} E^k_{p(u)}(u) \quad \forall u \tag{5a} \]

\[ \sum_{k \in Z(p)} \lambda^k_p = 1 \quad \forall p \tag{5b} \]

\[ C_v \ge C_u + t_v \quad \forall (u, v) \in L \tag{5c} \]

Note that $Z_{RMP} \ge Z_{MP}$. We now write the dual to the master problem, which we call the column generator (CG):

\[ Z_{CG} = \max \sum_p \delta_p + \sum_v t_v \sum_{u:(u,v) \in L} \pi_{uv} \tag{6} \]

subject to

\[ \theta_u - \sum_{v:(u,v) \in L} \pi_{uv} + \sum_{v:(v,u) \in L} \pi_{vu} \le w_u \quad \forall u \tag{7a} \]

\[ \delta_p - \sum_{u \in J_p} E^k_p(u)\, \theta_u \le 0 \quad \forall k, \ \forall p \tag{7b} \]

By linear programming duality we know that $Z_{CG} = Z_{MP}$. Moreover, any feasible solution to the dual is a lower bound on $Z_{MP}$. If we solve the restricted master problem and obtain the optimal dual variables, then we can check whether that solution is feasible to CG. If it is, then the solution is optimal to the master problem. The dual optimal solution to the restricted master problem always satisfies Equation (7a), because the RMP contains every variable $C_u$. However, it may violate Equation (7b), since the RMP uses only a subset of the extreme points. Given a current dual optimal solution $(\delta^*_p, \theta^*_u, \pi^*_{uv})$ to the RMP, we have to check whether

\[ \delta^*_p \le \sum_{u \in J_p} E^k_p(u)\, \theta^*_u \quad \forall k, \ \forall p. \]

This is equivalent to solving the following linear programming problem for each processor $p$:

\[ Z_p(\theta^*) = \min_{C \in P_p} \sum_{u \in J_p} \theta^*_u C_u \tag{8} \]

This is just the classical problem of minimizing the weighted completion time of the tasks on processor $p$, where the weight of task $u$ is $\theta^*_u$. The solution procedure is called Smith's Rule [9] and is the following: sort the tasks such that

\[ \frac{\theta^*_1}{t_1} \ge \frac{\theta^*_2}{t_2} \ge \dots \ge \frac{\theta^*_{n_p}}{t_{n_p}} \]

and schedule the tasks in order of increasing index. In this case, the completion time of task $u$ is $C_u = \sum_{v=1}^{u} t_v$ and

\[ Z_p(\theta^*) = \sum_{u=1}^{n_p} \theta^*_u C_u. \tag{9} \]

Note that this vector of completion times defines an extreme point of the polyhedron $P_p$. If $\delta^*_p \le Z_p(\theta^*)$ for all $p$, then the current value of $Z_{RMP}$ equals $Z_{MP}$. If there is a $p$ such that $\delta^*_p > Z_p(\theta^*)$, then the extreme point corresponding to this processor is added to the restricted master problem and the restricted master problem is solved again.

Given the current dual variable values $\theta^*$ and $\pi^*$, we can derive a feasible dual solution as follows. Equation (7a) is satisfied by any dual solution to the RMP, and setting $\delta_p = Z_p(\theta^*)$ satisfies Equation (7b), so the result is a feasible dual solution.
Therefore,

\[ \sum_p Z_p(\theta^*) + \sum_v t_v \sum_{u:(u,v) \in L} \pi^*_{uv} \tag{10} \]

is a lower bound on $Z_{MP}$. The lower bound may not increase monotonically from one iteration to the next. Therefore, we keep track of the current best (highest) lower bound as LB. Instead of solving the MP to optimality, given some threshold ε we can terminate the computation once the ratio of the solution of the restricted master problem to the best lower bound is less than (1 + ε). The algorithm listing summarizes the column generation procedure.
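The pricing step, the lower bound of Equation (10), and the (1 + ε) stopping test can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's algorithm listing: the function names, the dictionary layout (`theta_by_proc[p][u]` for the duals θ, `pi[(u, v)]` for the duals π, `proc_time[u]` for the processing times) and the numerical tolerance `tol` are all introduced here, and processing times are assumed to be strictly positive. Solving the RMP itself would require an LP solver and is left out.

```python
def smith_pricing(theta, proc_time):
    """Minimize sum(theta[u] * C[u]) on one processor via Smith's Rule.

    Returns (Z_p(theta), completion_times). Assumes proc_time[u] > 0.
    """
    # Largest weight-to-processing-time ratio goes first.
    order = sorted(theta, key=lambda u: theta[u] / proc_time[u], reverse=True)
    completion, clock, value = {}, 0.0, 0.0
    for u in order:
        clock += proc_time[u]
        completion[u] = clock
        value += theta[u] * clock
    return value, completion


def price_out(delta, theta_by_proc, proc_time, tol=1e-9):
    """Find processors whose dual value violates (7b).

    Returns (new_columns, total), where new_columns maps each violating
    processor to the extreme point to add, and total = sum_p Z_p(theta).
    """
    new_columns, total = {}, 0.0
    for p, theta in theta_by_proc.items():
        z_p, column = smith_pricing(theta, proc_time)
        total += z_p
        if delta[p] > z_p + tol:  # delta_p > Z_p(theta): add a column
            new_columns[p] = column
    return new_columns, total


def lower_bound(total, pi, proc_time):
    """Equation (10): sum_p Z_p(theta) + sum over (u, v) in L of t_v * pi_uv."""
    return total + sum(proc_time[v] * val for (_, v), val in pi.items())


def should_stop(z_rmp, best_lb, eps):
    """True once Z_RMP / LB < 1 + eps (LB must be positive)."""
    return best_lb > 0 and z_rmp / best_lb < 1 + eps
```

In an outer loop one would solve the RMP, call `price_out` with its duals, update the best bound with `max(best_lb, lower_bound(...))`, and stop when either no column is returned or `should_stop` holds.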

[Fig. 4: Performance test for two-phase Map-Reduce. (a) Results against the optimal: competitive ratio per problem instance for Lower Bound, H-MARES and MARES. (b) Large problem instances: H-MARES and MARES per problem instance.]

[Fig. 5: Performance test for Map-Shuffle-Reduce. (a) Competitive ratios per problem instance for H-MASHERS and D-MASHERS. (b) Average competitive ratio against number of jobs for H-MASHERS and D-MASHERS.]

on these guaranteed performance algorithms we developed heuristics that were tested experimentally and performed significantly better than the worst-case guaranteed performance. We are currently working on improving the worst-case performance guarantees as well as testing the algorithm on larger data sets.

REFERENCES

[1] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," in USENIX OSDI, 2004.
[2] H. Chang, M. S. Kodialam, R. R. Kompella, T. V. Lakshman, M. Lee, and S. Mukherjee, "Scheduling in mapreduce-like systems for fast completion time," in IEEE INFOCOM, 2011.
[3] M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, "Managing data transfers in computer clusters with orchestra," in ACM SIGCOMM, 2011.
[4] M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar, and A. Goldberg, "Quincy: Fair Scheduling for Distributed Computing Clusters," in SOSP, 2009.
[5] M. Zaharia, D. Borthakur, J. S. Sarma, K. Elmeleegy, S. Shenker, and I. Stoica, "Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling," in EuroSys '10: Proceedings of the 5th European Conference on Computer Systems, 2010.
[6] G. Ananthanarayanan, S. Kandula, A. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris, "Reining in the outliers in map-reduce clusters using mantri," in USENIX OSDI, 2010.
[7] J. Lenstra, A. Rinnooy Kan, and P. Brucker, "Complexity of machine scheduling problems," Annals of Discrete Mathematics, vol. 1, 1977.
[8] Y. Kim, "Data migration to minimize the total completion time," Journal of Algorithms, vol. 55, no. 1, pp. 42–57, 2005.
[9] P. Brucker, Scheduling Algorithms. Springer, 2004.
[10] R. Graham, "Bounds on multiprocessing timing anomalies," SIAM Journal on Applied Mathematics, vol. 17, no. 2, 1969.
[11] J. Lenstra and A. Kan, "Complexity of scheduling under precedence constraints," Operations Research, pp. 22–35, 1978.
[12] S. Chakrabarti, C. Phillips, A. Schulz, D. Shmoys, C. Stein, and J. Wein, "Improved scheduling algorithms for minsum criteria," Automata, Languages and Programming, 1996.
[13] M. Queyranne and A. Schulz, "Approximation bounds for a general class of precedence constrained parallel machine scheduling problems," SIAM Journal on Computing, vol. 35, no. 5, pp. 1241–1253, 2006.
[14] R. Gandhi and J. Mestre, "Combinatorial algorithms for data migration to minimize average completion time," Algorithmica, vol. 54, no. 1, pp. 54–71, 2009.
[15] M. Hajiaghayi, R. Khandekar, G. Kortsarz, and V. Liaghat, "On a local protocol for concurrent file transfers," in Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 2011.
[16] A. Bar-Noy, M. Halldórsson, G. Kortsarz, R. Salman, and H. Shachnai, "Sum multicoloring of graphs," Journal of Algorithms, vol. 37, no. 2, 2000.
[17] R. Gandhi, M. Halldórsson, G. Kortsarz, and H. Shachnai, "Improved bounds for scheduling conflicting jobs with minsum criteria," ACM Transactions on Algorithms (TALG), vol. 4, no. 1, p. 11, 2008.

