Online Algorithms for Uploading Deferrable Big Data to The Cloud




Linquan Zhang, Zongpeng Li, Chuan Wu, Minghua Chen
University of Calgary, {linqzhan, zongpeng}@ucalgary.ca
The University of Hong Kong, cwu@cs.hku.hk
The Chinese University of Hong Kong, minghua@ie.cuhk.edu.hk

Abstract: This work studies how to minimize the bandwidth cost for uploading deferrable big data to a cloud computing platform, for processing by a MapReduce framework, assuming the Internet service provider (ISP) adopts the MAX contract pricing scheme. We first analyze the single ISP case and then generalize to the MapReduce framework over a cloud platform. In the former, we design a Heuristic Smoothing algorithm whose worst-case competitive ratio is proved to fall between 2(1 - 1/e) and 2 - 1/(D+1), where D is the maximum tolerable delay. In the latter, we employ the Heuristic Smoothing algorithm as a building block, and design an efficient distributed randomized online algorithm, achieving a constant expected competitive ratio. The Heuristic Smoothing algorithm is shown to outperform the best known algorithm in the literature through both theoretical analysis and empirical studies. The efficacy of the randomized online algorithm is also verified through simulation studies.

(This work is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), and by grants from Hong Kong RGC under the contracts HKU 778 and HKU 7853.)

I. INTRODUCTION

Cloud computing is emerging as a new computing paradigm that enables prompt and on-demand access to computing resources. As exemplified by Amazon EC2 [1] and Linode [2], cloud providers invest substantially in their data centre infrastructure, providing a virtually unlimited sea of CPU, RAM and bandwidth resources to cloud users, often assisted by virtualization technologies. The elastic and on-demand nature of cloud computing helps cloud users meet their dynamic and fluctuating demands with minimal management overhead, while the cloud ecosystem as a whole achieves economies of scale through cost amortization.

Typical computing jobs hosted in the cloud include large-scale web applications [3] and big data analytics [4]. In such data-intensive applications, a large volume of information (up to terabytes or even petabytes) is periodically transmitted between the user location and the cloud, through the public Internet. Parallel to utility bill reduction in data centres (computation cost control), bandwidth charge minimization (communication cost control) now represents a major challenge in the cloud computing paradigm [5], [6], [7], where a small fraction of improvement in efficiency translates into millions of dollars in annual savings across the world [8].

Commercial Internet access, particularly the transfer of big data, is nowadays routinely priced by Internet service providers (ISPs) through a percentile charge model, a dramatic departure from the more intuitive total-volume based charge model used in residential utility billing, or the flat-rate charge model used in personal Internet and telephone billing [5], [9], [7], [10]. Specifically, in a θ-th percentile charge scheme, the ISP divides the charge period, e.g., 30 days, into small intervals of equal fixed length, e.g., 5 minutes. Statistical logs summarize the traffic volumes witnessed in the different time intervals, sorted in ascending order. The traffic volume of the θ-th percentile interval is chosen as the charge volume. For example, under the 95th-percentile charge scheme, the cost is proportional to the traffic volume sent in the 8208-th (95% × 30 × 24 × 60/5 = 8208) interval in the sorted list [9], [7], [10]. The MAX contract model is simply the 100-th percentile charge scheme.
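To make the percentile rule concrete, the following minimal Python sketch (illustrative only, not from the paper; the interval data and function name are assumptions) reads the θ-th percentile charge volume off a sorted traffic log; the MAX contract corresponds to θ = 100.

import random

def percentile_charge_volume(interval_volumes, theta):
    # interval_volumes: traffic volume recorded in each fixed-length interval,
    #                   e.g., one value per 5-minute slot over a 30-day month.
    # theta:            percentile in (0, 100]; theta = 100 is the MAX contract.
    ranked = sorted(interval_volumes)              # ascending order, as in the charge model
    k = max(1, int(len(ranked) * theta / 100.0))   # index of the theta-th percentile interval
    return ranked[k - 1]

# Toy usage: 8640 five-minute intervals in a 30-day month.
volumes = [random.random() for _ in range(8640)]
print(percentile_charge_volume(volumes, 95))    # 95th-percentile charge volume
print(percentile_charge_volume(volumes, 100))   # MAX contract: the peak interval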
Such percentile charge models are perhaps less surprising when one considers the fact that infrastructure provisioning cost is more closely related to peak demand than to average demand. Due to both its new algorithmic implications and its economic significance in practice, this interesting percentile charge model has spawned a series of studies. Most of these endeavours examine cost saving strategies and opportunities through careful traffic scheduling, multihoming (subscribing to multiple ISPs), and inter-ISP traffic shifting. However, they model the cost minimization problem with a critical, although sometimes implicit, assumption that all data generated at the user location have to be uploaded to the cloud immediately, without any delay [9], [10]. Consequently, the solution space is restricted to traffic smoothing in the spatial domain only.

Real-world big data applications reveal a different picture, in which a reasonable amount of uploading delay (often specified in a service level agreement, or SLA) is tolerable by the cloud user, providing a golden time window for traffic smoothing in the temporal domain, which can substantially slash peak traffic volumes and hence communication cost. One example lies in astronomical data from observatories, which are periodically generated at huge volumes but require no urgent attention. Another well-known example is human genome analysis [4], where data are also big but not time-sensitive. The main challenge of effective temporal-domain smoothing lies in the uncertainty in future data arrivals. Therefore a practical cost minimization solution is inherently an online algorithm, making periodical optimization decisions based on hitherto input. It is, again, surprising to discover that online cost minimization for deferrable upload under percentile charging, even when defined over a single link from one source to one receiver only, is still highly non-trivial, exhibiting a rich combinatorial structure, yet never studied before in the literature of either computer networking or theoretical computer science (with the only exception below) [5].

The only study of the online cost minimization problem under percentile charges that we are aware of is a recent work of Golubchik et al. [5], which focuses exclusively on the single point-to-point link case. The online algorithm they present, referred to as Simple Smoothing here, is extremely simple, and involves evenly smoothing every input across its window of tolerable delay for upload. Nonetheless, this seemingly straightforward algorithm is proven to approach the offline optimum within a small constant under the MAX model. In this work, we first design our own online algorithm for a single link, also adopting the MAX model, in preparation for the MapReduce data processing case. Based on the insight that Simple Smoothing ignores valuable information, including the maximum volume recorded so far and the current amount of backlogged data and their deadlines, we tailor a more sophisticated solution, which incorporates a few heuristic smoothing ideas and is hence referred to as Heuristic Smoothing. We prove that Heuristic Smoothing always guarantees a competitive ratio no worse than that of Simple Smoothing, under any possible data arrival pattern. Theoretical analysis shows that Heuristic Smoothing can achieve a worst-case competitive ratio between 2(1 - 1/e) and 2 - 1/(D+1), where D is the tolerable delay.

We further extend the single link case to a cloud scenario where multiple ISPs are employed to transfer big data dynamically for processing using a MapReduce-like framework. Data are routed from the cloud user to mappers and then reducers, both residing in potentially different data centres of the cloud [6]. We apply Heuristic Smoothing as a plug-in module for designing a distributed and randomized online algorithm with very low computational complexity. The competitive ratio guaranteed by the randomized online algorithm increases from that of Heuristic Smoothing by only a small constant factor.

Extensive evaluations are conducted to investigate the performance of the proposed online algorithms. The results show that Heuristic Smoothing performs much better than Immediate Transfer (IT), a straightforward algorithm that ignores temporal smoothing. Meanwhile, Heuristic Smoothing also achieves smaller competitive ratios than Simple Smoothing does. In most cases tested, the observed competitive ratio of Heuristic Smoothing is smaller than 1.5, better than the theoretical upper bound, and relatively close to the offline optimum. Such superior performance is attributed to less abrupt responses to highly volatile traffic demand. Empirical studies for the cloud scenario further verify the efficacy of the randomized cost reduction algorithm, in terms of both scalability and competitive ratio.

In the rest of this paper, we discuss related work in Sec. II, and introduce the system model in Sec. III. Heuristic Smoothing and the randomized algorithm for the cloud scenario are designed and analyzed in Sec. IV and Sec. V, respectively. Evaluation results are presented in Sec. VI. Sec. VII concludes the paper.

II. RELATED WORK

Similar to deferring data upload to minimize the peak bandwidth demand, there have been studies on scheduling CPU tasks to minimize the maximum CPU speed, which is closely related to power consumption. Yao et al. [11] initially provide an optimal offline algorithm, the YDS algorithm, to optimally minimize power consumption by scaling CPU speed, under the assumption that the former is a convex function of the latter. Bansal et al. [12] further propose the BKP algorithm, with a competitive ratio of e, for minimizing the maximum speed when facing arbitrary inputs with different delay requirements and arbitrary workload patterns. Towards the new challenges brought by the proliferation of multi-core processors, Albers et al. [13] design an online algorithm for multi-processor job scheduling without inter-processor job migration. Bingham et al. [14] and Angel et al. [15] further propose polynomial-time offline optimal algorithms, with migration of jobs considered. Greiner et al. [16] generalize a c-competitive online algorithm for a single processor into a randomized cB_α-competitive online algorithm for multiple processors, where B_α is the α-th Bell number. Different from the MAX traffic charge model in this work, they focus on total-volume based energy charges computed by integrating instantaneous power consumption over time.

In recent years, data centre workload scheduling with deadline constraints has been extensively studied in the cloud computing literature. Gupta et al. [17] analyze the energy minimization problem in a data center when available deadline information of the workload may be used to defer job execution for reduced energy consumption. Yao et al. [18] tackle the power reduction problem with deferrable workloads in data centers using the Lyapunov optimization approach, for approximate time-averaged optimization.

A few studies exist on the transfer of big data to the cloud. Cho et al. [19] design a static cost-aware planning system for transferring large amounts of data to the cloud provider via both the Internet and courier services. Considering a dynamic transfer scheme where data is produced dynamically, Zhang et al. [6] propose two online algorithms to minimize the total transfer cost. Different from this work, they assume mandatory immediate data upload, and adopt a total-volume based charge model instead of the percentile charge model. Goldenberg et al. [9] study the multihoming problem under 95-percentile traffic charges. Grothey et al. [10] investigate a similar problem through a stochastic dynamic programming approach. They both leverage ISP subscription switching for traffic engineering, so that the charge volume is minimized. However, data traffic in their studies cannot be deferred. Adler et al. [20] focus on careful routing of data traffic between two types of ISPs (Average contract, Maximum contract) to pursue the optimal online solution, leading to an online optimization problem similar to the classic ski-rental problem. Golubchik et al. [5] study the minimization of transmission cost by exploiting a small tolerable delay when ISPs adopt a 95-percentile or MAX charge model, focusing on a single link only, and proposing the Simple Smoothing algorithm.

III. SYSTEM MODEL

We consider a cloud user who generates large amounts of data dynamically over time, required for transfer into a cloud or a federation of clouds for processing using a MapReduce-like framework. The mappers and reducers may reside in geographically dispersed data centres. The big data in question can tolerate bounded upload delays specified in their SLA.

A. The MapReduce Framework

MapReduce, initially unveiled by Dean and Ghemawat [21], is a programming model targeting efficient parallel processing of large datasets. A typical MapReduce application includes two functions, map and reduce, both written by the users. Map processes input key/value pairs, and produces a set of intermediate key/value pairs. The MapReduce library combines all intermediate values associated with the same intermediate key I and then passes them to the reduce function. Reduce then merges these values associated with the intermediate key I to produce smaller sets of values.

There are four stages in the MapReduce framework: pushing, mapping, shuffling, and reducing. The user transfers workloads to the mappers during the pushing stage. The mappers process them during the mapping stage, and deliver the processed data to the reducers during the shuffling stage. Finally the reducers produce the results in the reducing stage. In a distributed system, the mapping and reducing stages can happen at different locations. The system delivers all intermediate data from mappers to reducers during the shuffling stage, and the cloud provider may charge for inter-datacentre traffic during the shuffling stage. Recent studies [22], [23] suggest that the relation between intermediate data size and original data size depends closely on the specific application. For applications such as n-gram models, the intermediate data size is much bigger, and the bandwidth cost charged by the cloud provider cannot be neglected. We use β to denote the ratio of intermediate data size to original data size.

B. Cost Minimization for MapReduce Applications

We model a cloud user producing a large volume of data every hour, as exemplified by astronomical observatories [6]. As shown in Fig. 1, the data location is multi-homed with multiple ISPs, for communicating with the data centers. Through the infrastructure provided by ISP m, data can be uploaded to a corresponding data centre DC m. Each ISP has its own traffic charge model and pricing function. After arrival at the data centers, the uploaded data are processed using a MapReduce-like framework. Intermediate data need to be transferred among data centers in the shuffling stage. Towards a general model, we again assume that multiple ISPs are employed by the cloud to communicate among its distributed data centers, e.g., ISP A for communicating between DC1 and DC2, and ISP B for communicating between DC2 and DC3. If two inter-DC connections are covered by the same ISP, it can be equivalently viewed as two ISPs with identical traffic charge models.

Fig. 1. An illustration of the network for deferrable data upload under the MapReduce framework: the user location (data source) uploads to the mappers (DC1, DC2, DC3), which shuffle intermediate data to the reducers (DC1', DC2', DC3').

The system runs in a time-slotted fashion. Each time slot is 5 minutes. The charge period is a month (30 days). M and R denote the set of mappers and the set of reducers, respectively. Since each mapper is associated with a unique ISP in the first stage, we use m ∈ M to index both mapper m and the ISP that connects the user to mapper m. All mappers use the same hash function to map the intermediate keys to reducers [23].
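As a small illustration of this hash-based partitioning, and of the key-space fractions z_r used in the model below, consider the following sketch (the hash choice, key names and reducer count are illustrative assumptions, not the paper's implementation):

import hashlib

def reducer_for(key, num_reducers):
    # A fixed hash function: every mapper sends the values of a given
    # intermediate key to the same reducer, regardless of which mapper emits it.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_reducers

# Estimate z_r, the portion of the key space mapped into reducer r, on sample keys.
keys = ["key-%d" % i for i in range(10000)]
R = 5
counts = [0] * R
for k in keys:
    counts[reducer_for(k, R)] += 1
z = [c / len(keys) for c in counts]
print(z)   # roughly uniform fractions that sum to 1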
The upload delay is defined as the duration between when data are generated and when they are transmitted to the mappers. We focus on uniform delays, i.e., all jobs have the same maximum tolerable delay D, which is reasonable assuming data generated at the same user location are of similar nature and importance. We use W_t to represent the workload released at the user location in time slot t. Let x^m_{d,t} be a decision variable indicating the portion of W_t assigned to mapper m at time slot t + d. The cost of ISP m is f_m(V_m), where V^t_m is the traffic that goes through ISP m at time slot t. To ensure that all workload is uploaded into the cloud, we have:

x^m_{d,t} ≥ 0, ∀m ∈ M.   (1)

Σ_{m∈M} Σ_{d=0}^{D} x^m_{d,t} = 1, ∀t.   (2)

Given the maximum tolerable uploading delay D, the traffic V^t_m between the user and mapper m is:

V^t_m = Σ_{d=0}^{D} W_{t-d} x^m_{d,t-d}, ∀m ∈ M.   (3)

Let V_m be the maximum traffic volume of ISP m, which will be used in the calculation of the bandwidth cost. V_m satisfies:

V_m ≥ V^t_m, ∀t.   (4)

We assume that the ISPs in the first stage, connecting the user to the mappers, employ the same charging function f_m, and that the ISPs in the second stage, from mappers to reducers, use the same charging function f_{m,r}. Both charging functions f_m and f_{m,r} are non-decreasing and convex. We further assume that the first stage is non-splittable, i.e., each workload is uploaded through one ISP only. Suppose the user decides to deliver the workload to mapper m in time slot t, and assume it takes a unit time to transmit data via the ISPs. Let M^{t+1}_m denote the total data size at mapper m in time slot t + 1. M^{t+1}_m can be calculated as the summation of all transmitted workloads at time slot t:

M^{t+1}_m = Σ_{d=0}^{D} W_{t-d} x^m_{d,t-d}, ∀m ∈ M.

Assume the mappers take one time slot to process a received workload. Therefore the mappers transfer data to the reducers in time slot t + 2. Let V^{t+2}_{m,r} be the traffic from mapper m to reducer r in time slot t + 2:

V^{t+2}_{m,r} = β M^{t+1}_m y^{t+2}_{m,r}, ∀m ∈ M, r ∈ R.   (5)

The maximum traffic volume of the ISP (m, r), V_{m,r}, satisfies:

V_{m,r} ≥ V^{t+2}_{m,r}, ∀t.   (6)

Notice that the MapReduce framework partitions the output (key/value) pairs of the mappers to the reducers using hash functions. All values for the same key are always reduced at the same reducer, no matter which mapper they come from. Furthermore, we assume that data generated at the data locations are uniformly mixed, therefore we have:

y^{t+2}_{m,r} = z_r, ∀m ∈ M, r ∈ R.   (7)

This equation also implies that the superscript of y^{t+2}_{m,r} can be ignored. Now we can formulate the overall traffic cost minimization problem for the cloud user, under the MAX contract charge model:

minimize Σ_{m} f_m(V_m) + Σ_{m,r} f_{m,r}(V_{m,r})   (8)

subject to:
V_m ≥ V^t_m, ∀t, ∀m,   (8a)
V_{m,r} ≥ V^t_{m,r}, ∀t, ∀m, ∀r,   (8b)
Σ_{d=0}^{D} x^m_{d,t} = n_m, ∀t, ∀m,   (8c)
Σ_{m} n_m = 1,   (8d)
x^m_{d,t} ≥ 0, n_m ∈ {0, 1}, ∀d, ∀t, ∀m,   (8e)

where V^t_m and V^t_{m,r} are defined in Eqn. (3) and Eqn. (5), respectively. n_m is a binary variable indicating whether ISP m is employed or not. For ease of reference, the notation is summarized in Tab. I.

TABLE I: NOTATION
D: the maximum delay from the time data are generated to the time the data location begins to transmit them to the mappers.
M: the set of mappers.
R: the set of reducers. Some mapper and reducer may be in the same location, i.e., M ∩ R ≠ ∅.
W_t: the workload released at the user location in time slot t.
x^m_{d,t}: the portion of the workload W_t that is assigned to mapper m at time slot t + d.
β: the ratio of the size of the output of a mapper to the size of its input.
y^t_{m,r}: the portion of the output of mapper m that is transmitted to reducer r at time slot t.
z_r: the portion of the key space mapped into reducer r.
V^t_m: the total traffic that goes through ISP m at time slot t.
f_m(y): the cost of ISP m for the input y.

IV. THE SINGLE ISP CASE

We first investigate the basic case that includes one mapper and one reducer only, co-located in the same data center, with no bandwidth cost between the pair. Given a MAX charge model at the ISP, the algorithm tries to exploit the allowable delay by scheduling the traffic to the best time slot within the allowed time window, for reducing the charge volume. This can be illustrated through a toy example: in t = 1, a job (100 MB, max delay = 9 time slots) is released; in the following time slots, no jobs are released. If the algorithm smooths the traffic across the 10 time slots, the charge volume can be reduced to 10 MB/5 min, from 100 MB/5 min if immediate transmission is adopted.

A. The Primal & Dual Cost Minimization LPs

We can drop the location index (m, r) in this basic scenario of one mapper and one reducer located in the same data centre. Note that the charging function f is a non-decreasing function of the maximum traffic volume. Minimizing the maximum traffic volume therefore implies minimizing the bandwidth cost. Consequently, the cost minimization problem in our basic single ISP scenario can be formulated into the following (primal) linear program (LP):

minimize V   (9)
subject to:
Σ_{d=0}^{min{D, t-1}} W_{t-d} x_{d,t-d} ≤ V, ∀t ∈ T,   (9a)
Σ_{d=0}^{D} x_{d,t} = 1, ∀t ∈ T_D,   (9b)
x_{d,t} ≥ 0, V ≥ 0, ∀d ∈ D, ∀t ∈ T_D,   (9c)

where T = [1, T], T_D = [1, T - D], D = [0, D], and x_{d,t} = 0, ∀t > T - D, ∀d ∈ D.

Introducing dual variables y and z for constraints (9a) and (9b) respectively, we formulate the corresponding dual LP:

maximize Σ_{t=1}^{T-D} z_t   (10)
subject to:
Σ_{t=1}^{T} y_t ≤ 1,   (10a)
z_t ≤ W_t y_{t+d}, ∀t ∈ T_D, ∀d ∈ D,   (10b)
y_t ≥ 0, ∀t ∈ T,   (10c)
z_t unconstrained, ∀t ∈ T_D.   (10d)
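The offline benchmark used later in the evaluation can be computed directly from LP (9). Below is a minimal sketch using scipy (a sketch only; the helper name offline_optimum and the solver choice are assumptions, not part of the paper):

import numpy as np
from scipy.optimize import linprog

def offline_optimum(W, D):
    # Offline optimum of LP (9): minimize the peak upload volume V when the
    # workload W[t] released in slot t may be sent in slots t .. t + D.
    n = len(W)                       # number of release slots
    T = n + D                        # horizon: the last release may finish D slots later
    nvar = n * (D + 1) + 1           # x[t][d] flattened, plus V as the last variable

    def idx(t, d):
        return t * (D + 1) + d

    c = np.zeros(nvar)
    c[-1] = 1.0                      # objective: minimize V

    # Peak constraints (9a): traffic sent in each slot s is at most V.
    A_ub = np.zeros((T, nvar))
    for s in range(T):
        for t in range(n):
            d = s - t
            if 0 <= d <= D:
                A_ub[s, idx(t, d)] = W[t]
        A_ub[s, -1] = -1.0
    b_ub = np.zeros(T)

    # Coverage constraints (9b): each workload is fully uploaded.
    A_eq = np.zeros((n, nvar))
    for t in range(n):
        for d in range(D + 1):
            A_eq[t, idx(t, d)] = 1.0
    b_eq = np.ones(n)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * nvar, method="highs")
    return res.fun

print(offline_optimum([1.0, 0.0, 0.0, 0.0], D=3))   # single job smoothed: peak 0.25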

The input begins with W_1 and ends with W_{T-D}, and W_{T-D+1} = 0, ..., W_T = 0 is padded to the tail of the input. We use P and D to denote the objective values of feasible solutions to the primal and dual LPs, respectively. The optimization in (9) is a standard linear program. For an offline optimal solution, one can simply solve (9) using a standard LP algorithm such as the simplex method or the interior-point method.

B. Online Algorithms

The simplest online solution in the basic one-ISP scenario is the Immediate Transfer algorithm (IT). Once a new job arrives, IT transfers it to the mapper immediately, without any delay. Next we analyze the competitive ratio of IT, as compared to the offline optimum.

Theorem 1. IT is (D + 1)-competitive.

Proof: Consider the following input: (W, 0, 0, 0, 0, ...). IT processes it immediately, with charge volume W. However, the offline optimal algorithm divides the workload into small pieces, (W/(D+1), W/(D+1), ..., W/(D+1), 0, 0, ...), feasible within the deadline D, with maximum traffic volume W/(D+1). Hence

Competitive ratio λ ≥ W / (W/(D+1)) = D + 1.

We hence obtain a lower bound of D + 1 on the competitive ratio of IT. Next we prove that D + 1 is also an upper bound. Without exploiting any delays, IT provides a feasible solution to the primal problem, whose value is denoted P_IT:

P_IT = max_t W_t.

Now we design a feasible solution to the dual problem as follows (assume τ = arg max_t W_t):

y_t = 1/(D+1) if t = τ, ..., τ + D; 0 otherwise.
z_t = W_t/(D+1) if t = τ; 0 otherwise.

D = W_τ/(D+1).

So the competitive ratio is

λ = P_IT / OPT ≤ P_IT / D = D + 1.

Remark: if D = 0, i.e., jobs are not deferrable, the offline optimal algorithm degrades into IT, agreeing with the theorem, which claims IT is 1-competitive (D + 1 = 1).

IT is apparently not ideal, and may lead to high peak traffic and high bandwidth cost as compared with the offline optimum. Golubchik et al. [5] design a cost-aware algorithm that strives to spread out bandwidth demand by utilizing all possible delays, referred to as the Simple Smoothing Algorithm. Upon receiving a new workload, Simple Smoothing evenly divides it into D + 1 parts, and processes them one by one in the current time slot and the following D time slots, as shown in Algorithm 1.

Algorithm 1 The Simple Smoothing Algorithm [5]
1: for τ = 1 to T - D do
2:   for d = 0 to D do
3:     x_{d,τ} = 1/(D + 1)
4:   end for
5: end for

Theorem 2. [5] The competitive ratio of Simple Smoothing is 2 - 1/(D+1).

Theorem 2 can be proven through weak LP duality, i.e., using a feasible dual solution as a lower bound of the offline optimum. Simple Smoothing is very simple, but guarantees a worst-case competitive ratio smaller than 2. Nonetheless, there is still room for further improvement, since Simple Smoothing ignores available information such as the hitherto maximum traffic volume transmitted, and the current pressure from backlogged traffic and its deadlines. Such an observation motivated our design of the more sophisticated Heuristic Smoothing algorithm for the case D ≥ 1, shown in Algorithm 2. Here T is the charge period, τ is the current time slot, and H_d is the total volume of data that has been buffered for d time slots.

Algorithm 2 The Heuristic Smoothing Algorithm
1: V_max = 0
2: W_τ = 0, τ = T - D + 1, ..., T;
3: H_d = 0, d = 1, ..., D;
4: for τ = 1 to T do
5:   V_τ = min{ W_τ + Σ_{d=1}^{D} H_d, max{ V_max, W_τ/(D+1) + Σ_{d=1}^{D} H_d / D } }
6:   if V_max < V_τ then
7:     V_max = V_τ;
8:   end if
9:   Transfer traffic of volume V_τ following the Earliest Deadline First (EDF) strategy;
10:  Update H_d, d = 1, ..., D;
11: end for
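A direct Python transcription of Algorithm 2, usable as a simulation sketch (illustrative only; the EDF bookkeeping and the explicit deadline safeguard in the rate rule are assumptions made for a self-contained example):

def heuristic_smoothing(W, D):
    # Simulate Algorithm 2 on a workload sequence W (one arrival per slot, D >= 1).
    # Returns the per-slot upload volumes; their maximum is the MAX-contract
    # charge volume. H[d] holds the volume that has already waited d slots.
    T = len(W) + D                        # let the tail drain
    arrivals = list(W) + [0.0] * D
    H = [0.0] * (D + 1)                   # H[1..D] used
    v_max = 0.0
    sent = []
    for tau in range(T):
        w = arrivals[tau]
        backlog = sum(H[1:])
        due = H[D]                        # volume whose deadline is the current slot
        target = max(v_max, w / (D + 1) + backlog / D)
        v = min(w + backlog, max(target, due))   # never exceed what is available
        v_max = max(v_max, v)
        sent.append(v)
        # Serve Earliest-Deadline-First: oldest buffered data, then the new arrival.
        remaining = v
        for d in range(D, 0, -1):
            take = min(H[d], remaining)
            H[d] -= take
            remaining -= take
        w -= min(w, remaining)
        # Age the buffers by one slot.
        for d in range(D, 1, -1):
            H[d] = H[d - 1]
        H[1] = w
    return sent

print(max(heuristic_smoothing([1.0, 1.0, 1.0], D=2)))   # charge volume 5/6 on this input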
Theorem 3. The competitive ratio of Heuristic Smoothing is lower bounded by 2(1 - 1/e).

Proof: Consider the following input: (W, W, ..., W, 0, ..., 0), whose first D + 1 time slots carry workload W. The traffic demand of Heuristic Smoothing increases until time slot D + 1:

V_{D+1} = W/(D+1) + W/(D+1) + (D-1)W/((D+1)D) + ... + (D-1)^{D-1}W/((D+1)D^{D-1}) = (W/(D+1)) (1 + D(1 - ((D-1)/D)^D)).

We can find a feasible primal solution which yields the charge volume (D+1)W/(2D+1). This primal solution is an upper bound of the offline optimum. Therefore, for the lower bound of the competitive ratio,

λ ≥ V_{D+1} (2D+1)/((D+1)W) = ((2D+1)/(D+1)^2)(1 + D(1 - ((D-1)/D)^D)) → 2(1 - 1/e) as D → +∞.

Noticing that ((2D+1)/(D+1)^2)(1 + D(1 - ((D-1)/D)^D)) is a decreasing function of D on [1, +∞), we further have λ ≥ 2(1 - 1/e).

Theorem 4. The competitive ratio of Heuristic Smoothing is upper bounded by 2 - 1/(D+1).

Proof: We take the Simple Smoothing algorithm (Algorithm 1) as a benchmark, and prove that P_smooth ≥ P_heuristic, where P_heuristic is the charge volume produced by Algorithm 2. Algorithm 2 only increases its traffic demand when W_τ/(D+1) + Σ_{d=1}^{D} H_d/D exceeds V_max. Therefore, we rearrange H_d to compute the maximum traffic demand. Let

V_{t+D} = W_{t+D}/(D+1) + W_{t+D-1}/(D+1) + (D-1)W_{t+D-2}/((D+1)D) + ... + (D-1)^{D-1}W_t/((D+1)D^{D-1}).

Then P_heuristic = max_t V_{t+D}. Let τ = arg max_t V_{t+D}; we have

P_smooth = max_t Σ_{i=t}^{t+D} W_i/(D+1) ≥ Σ_{i=τ}^{τ+D} W_i/(D+1) ≥ W_{τ+D}/(D+1) + W_{τ+D-1}/(D+1) + (D-1)W_{τ+D-2}/((D+1)D) + ... + (D-1)^{D-1}W_τ/((D+1)D^{D-1}) = P_heuristic,

since each coefficient in the last expression is at most 1/(D+1). Since the Simple Smoothing algorithm is (2 - 1/(D+1))-competitive, the competitive ratio of Algorithm 2 cannot be worse than 2 - 1/(D+1).

From the proof above, we have the following corollary.

Corollary 1. For any given input, the charge volume resulting from Heuristic Smoothing is always equal to or smaller than that of Simple Smoothing.

Algorithm Complexity. All three online algorithms discussed have moderate time complexity, making them light-weight for practical applications. More specifically, IT, Simple Smoothing and Heuristic Smoothing have time complexities of O(T - D), O((T - D)D), and O(TD), respectively.

V. CLOUD SCENARIO

In this section, we apply the algorithms designed for the single ISP case to the cloud scenario, which utilizes a MapReduce-like framework for processing big data. Define Cost_1 = Σ_m f_m(V_m) and Cost_2 = Σ_{m,r} f_{m,r}(V_{m,r}), and adopt power charge functions by letting f_m(x) = f_{m,r}(x) = x^α, α ≥ 1.

A. Algorithm Design

The two-phase MapReduce cost optimization problem is defined in (8), and is a discrete optimization with integer variables. Consequently, an offline solution that solves such an integer program has a high computational complexity, further motivating the design of an efficient online solution. A naive online algorithm selects a fixed mapper and schedules the traffic on the corresponding ISP using the Simple Smoothing Algorithm.

Theorem 5. The competitive ratio of the naive online algorithm is lower bounded by |M|^{α-1}, where |M| is the number of mappers.

Proof: Consider the input (W, W, ..., W, 0, ...) whose first D + 1 time slots carry workload W. One can verify that the charge volume of the naive algorithm is (D+1)W/(D+1) = W, and the corresponding cost is W^α + Σ_r (βz_r W)^α. Next we consider a more intelligent algorithm that assigns the j-th workload to mapper (j mod |M|). This algorithm acts as an upper bound of the offline optimum. Its charge volume on each employed ISP is (D+1)W/((D+1)|M|) = W/|M| (assuming |M| divides D + 1), and the corresponding cost is |M|(W/|M|)^α + |M| Σ_r (βz_r W/|M|)^α. Therefore,

Competitive ratio ≥ [ W^α + Σ_r (βz_r W)^α ] / [ |M|(W/|M|)^α + |M| Σ_r (βz_r W/|M|)^α ] = |M|^{α-1}.

We next present a distributed randomized online algorithm for (8). For each workload, the user chooses an ISP uniformly at random to transfer the data to a randomly selected mapper. Formally, let W be the randomized workload assignment allocating each workload to a mapper. For each selected ISP, the user runs Heuristic Smoothing to guide one-stage traffic deferral and transmission, as shown in Algorithm 3.

Algorithm 3 Randomized Uploading Scheme
1: Generate a randomized workload assignment W which allocates each workload to a uniformly randomly selected mapper.
2: For each ISP, apply the single-ISP algorithm, e.g., Algorithm 2, to schedule the traffic.
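A compact sketch of Algorithm 3, reusing the heuristic_smoothing function from the earlier sketch (the uniform random mapper choice follows the algorithm; the cost evaluation with α = 2 and the helper names are illustrative assumptions):

import random

def randomized_upload(workloads, num_mappers, D):
    # Assign each workload to a mapper/ISP chosen uniformly at random, then run
    # the single-ISP smoothing independently on each ISP (first stage only).
    per_isp = [[0.0] * len(workloads) for _ in range(num_mappers)]
    for t, w in enumerate(workloads):
        m = random.randrange(num_mappers)
        per_isp[m][t] = w
    # Charge volume of each ISP under the MAX contract.
    return [max(heuristic_smoothing(per_isp[m], D)) for m in range(num_mappers)]

charges = randomized_upload([1.0, 0.5, 2.0, 0.0, 1.5], num_mappers=3, D=2)
stage1_cost = sum(v ** 2 for v in charges)      # f_m(x) = x^alpha with alpha = 2
print(charges, stage1_cost)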
Due to the MX charge odel, the transfer speed for workload w n [t, t + ) s a sngle speed, say v,w. If workload w s not processed n [t, t + ), we set v,w =. Therefore, for any gven, there are at ost M values of v,w. ssue there are n workloads, forng a set W. Let Ω = {w all workloads assgned to ISP } W. In schee π W, the user transfers data at speed of w Ω v,w n te nterval [t, t + ). Let φ n (Ω ) be the probablty that exactly the workloads Ω are allocated to ISP. φ n(ω ) = ( M ) Ω ( M )n Ω

We analyze Algorithm 3 by building a connection between an uploading scheme π and the randomized workload assignment W. We combine π and W into a new uploading scheme π_W. Let t_0 = 0 < t_1 < ... < t_e = T, such that during each interval [t_i, t_{i+1}), each ISP is employed to transfer at most one workload in the uploading scheme π; if a workload is processed in [t_i, t_{i+1}), then it is not finished before t_{i+1}. Due to the MAX charge model, the transfer speed for workload w in [t_i, t_{i+1}) is a single speed, say v_{i,w}. If workload w is not processed in [t_i, t_{i+1}), we set v_{i,w} = 0. Therefore, for any given i, there are at most |M| non-zero values of v_{i,w}. Assume there are n workloads, forming a set W. Let Ω_m ⊆ W be the set of workloads assigned to ISP m. In scheme π_W, the user transfers data on ISP m at speed Σ_{w∈Ω_m} v_{i,w} in time interval [t_i, t_{i+1}). Let φ_n(Ω_m) be the probability that exactly the workloads in Ω_m are allocated to ISP m:

φ_n(Ω_m) = (1/|M|)^{|Ω_m|} (1 - 1/|M|)^{n - |Ω_m|}.

We next define the function Λ_n(x), where x ∈ R^n \ {0}:

Λ_n(x) = |M| Σ_{Ω⊆W} φ_n(Ω) (Σ_{w∈Ω} x_w)^α / Σ_{w=1}^{n} x_w^α.

Lemma 1. Given any uploading scheme π and a randomized workload assignment W, we have a randomized uploading scheme π_W which satisfies:

E(Cost_1(π_W) + Cost_2(π_W)) ≤ max_x Λ_{|M|}(x) (Cost_1(π) + Cost_2(π)).

Proof: Since the traffic pattern on ISP (m, r), ∀r, is exactly the same as on ISP m (up to the factor βz_r), we only need to consider one stage at a time. Let us consider scheme π first. In the first stage, the cost is

Cost_1(π) = Σ_m max_{i,w} (v_{i,w})^α ≥ max_i Σ^{|M|}(v_{i,w}^α),

where v_{i,w} indicates the transfer speed on ISP m during [t_i, t_{i+1}) for workload w, and Σ^{|M|}(v_{i,w}^α) is the sum of the largest |M| values of v_{i,w}^α for the given i. The inequality holds because there are at most |M| non-zero speeds in any given duration [t_i, t_{i+1}). We next have the cost of the second stage:

Cost_2(π) = Σ_r Σ_m max_{i,w} (βz_r v_{i,w})^α = β^α Σ_r z_r^α Σ_m max_{i,w} (v_{i,w})^α ≥ β^α Σ_r z_r^α max_i Σ^{|M|}(v_{i,w}^α).

The cost of the first stage in π_W is

E(Cost_1(π_W)) = Σ_m Σ_{Ω_m⊆W} φ_n(Ω_m) max_i (Σ_{w∈Ω_m} v_{i,w})^α = |M| Σ_{Ω⊆W} φ_n(Ω) max_i (Σ_{w∈Ω} v_{i,w})^α.

The second equality holds because the assignment is uniformly random. Similarly, the cost of the second stage in π_W is

E(Cost_2(π_W)) = |M| Σ_{Ω⊆W} φ_n(Ω) Σ_r max_i (βz_r Σ_{w∈Ω} v_{i,w})^α = |M| β^α Σ_r z_r^α Σ_{Ω⊆W} φ_n(Ω) max_i (Σ_{w∈Ω} v_{i,w})^α.

Again because for any [t_i, t_{i+1}) there are at most |M| non-zero values of v_{i,w}, we have

|M| Σ_{Ω⊆W} φ_n(Ω) (Σ_{w∈Ω} v_{i,w})^α / Σ^{|M|}(v_{i,w}^α) = |M| Σ_{Ω⊆W} φ_n(Ω) (Σ_{w∈Ω} v_{i,w})^α / Σ_{w=1}^{n} v_{i,w}^α = Λ_n(v_i) = Λ_{|M|}(v'_i),

where v'_i is an |M|-dimensional subvector of v_i ∈ R^n \ {0} which contains all non-zero transfer speeds in [t_i, t_{i+1}). Therefore, the ratio for the first stage is

E(Cost_1(π_W)) / Cost_1(π) ≤ |M| Σ_{Ω⊆W} φ_n(Ω) max_i (Σ_{w∈Ω} v_{i,w})^α / max_i Σ^{|M|}(v_{i,w}^α) ≤ |M| Σ_{Ω⊆W} φ_n(Ω) (Σ_{w∈Ω} v_{i*,w})^α / Σ^{|M|}(v_{i*,w}^α) ≤ max_x Λ_{|M|}(x),

where i* = arg max_i (Σ_{w∈Ω} v_{i,w})^α. Similarly, the ratio for the second stage is also bounded by max_x Λ_{|M|}(x), i.e., E(Cost_2(π_W))/Cost_2(π) ≤ max_x Λ_{|M|}(x). This proves Lemma 1.

Let S(α, j) be the Stirling number of the second kind, defined as the number of partitions of a set of size α into j subsets [24]. Let B_α be the α-th Bell number, defined as the number of partitions of a set of size α [24]. The Bell number is relatively small when α is small: B_1 = 1, B_2 = 2, B_3 = 5, B_4 = 15. The definitions also imply

Σ_{j=1}^{α} S(α, j) = B_α.

The following lemma is proven by Greiner et al. [16].

Lemma 2. [16] ∀α ∈ N with α ≤ |M|,

max_x Λ_{|M|}(x) = Σ_{j=1}^{α} S(α, j) |M|! / (|M|^j (|M| - j)!).

Theorem 6. Given a λ-competitive algorithm with respect to cost for the single ISP case, the randomized online algorithm is λB_⌈α⌉-competitive in expectation.

Proof: Let π* be the optimal uploading scheme; the corresponding randomized uploading scheme is π*_W. The algorithm we use is π_W. Since the workloads in π*_W and π_W are the same, we have

E(Cost_1(π_W)) ≤ λ E(Cost_1(π*_W)),   (11)

since the single-ISP algorithm is λ-competitive. Similarly,

E(Cost_2(π_W)) ≤ λ E(Cost_2(π*_W)),   (12)

since the traffic pattern on ISP (m, r), ∀r, is exactly the same as on ISP m. Lemma 1 implies

E(Cost_1(π*_W) + Cost_2(π*_W)) ≤ max_x Λ_{|M|}(x) (Cost_1(π*) + Cost_2(π*)).   (13)

Since Λ_{|M|}(x) is a monotonically increasing function of α, we use ⌈α⌉ as an upper bound of α ≥ 1, obtaining a corresponding upper bound of Λ_{|M|}(x). Combining Eqn. (11), (12) and (13) as well as Lemma 2, we have the following expected cost of the randomized online algorithm:

E(Cost_1(π_W) + Cost_2(π_W)) ≤ λ E(Cost_1(π*_W) + Cost_2(π*_W))
  ≤ λ max_x Λ_{|M|}(x) (Cost_1(π*) + Cost_2(π*))
  = λ Σ_{j=1}^{⌈α⌉} S(⌈α⌉, j) |M|! / (|M|^j (|M| - j)!) (Cost_1(π*) + Cost_2(π*))
  ≤ λ Σ_{j=1}^{⌈α⌉} S(⌈α⌉, j) (Cost_1(π*) + Cost_2(π*))
  ≤ λ B_⌈α⌉ OPT.

Remark: For a single link, we can employ Heuristic Smoothing, whose competitive ratio is smaller than 2 with respect to the maximum traffic volume. The competitive ratio of Algorithm 2 is then 2^α in terms of cost, and thus Algorithm 3 is 2^α B_⌈α⌉-competitive in expectation. When α = 2, the competitive ratio is 8, a constant regardless of the number of mappers.

VI. PERFORMANCE EVALUATION

We have implemented Simple Smoothing, Heuristic Smoothing, as well as the randomized online algorithm, for performance evaluation through simulation studies. The default input W_t is generated uniformly at random, as shown in Fig. 2, where all data are normalized, i.e., scaled down by max_t W_t. We assume there are 5 mappers at different locations, and 5 reducers at different locations. We choose α = 2, thus the charge function is f_m(x) = f_{m,r}(x) = x^2.

A. The Single ISP Case

First we compare Heuristic Smoothing with Simple Smoothing. The two algorithms are executed under a delay requirement D = 5. Fig. 3 illustrates the traffic volume scheduled at each time slot. Compared with Simple Smoothing, Heuristic Smoothing results in a maximum traffic volume that is about 8% smaller. Heuristic Smoothing tries to exploit the available delay to average the traffic, and is less sensitive to the fluctuation of traffic demand, as compared with Simple Smoothing. For example, where the input exhibits a burst of high traffic demand, the traffic of Simple Smoothing increases abruptly; where the demand is low, it drops accordingly. In comparison, Heuristic Smoothing results in more even traffic distributions over both regions.

Fig. 2. Uniformly random input.
Fig. 3. Simple Smoothing vs. Heuristic Smoothing.

Next we examine how the tolerable delay affects the performance of the proposed online algorithms. We execute Simple Smoothing, Heuristic Smoothing and IT against a variety of delay windows. We also compute the offline optimum as a benchmark. The observed competitive ratios are shown in Fig. 4. The results suggest that both Simple Smoothing and Heuristic Smoothing perform much better than IT. Heuristic Smoothing also beats Simple Smoothing, by a smaller margin. Heuristic Smoothing approaches the offline optimum rather closely; the observed competitive ratios are always below 1.5 and usually considerably lower, much better than the theoretically proven upper bound in Theorem 4. Heuristic Smoothing is further evaluated under other random inputs, including a Poisson distribution in Fig. 5, a Gaussian distribution in Fig. 6 and a specifically designed random input in Fig. 7. All results verify that Heuristic Smoothing works best among the three online cost minimization algorithms.

Fig. 4. Competitive ratio over various delay window sizes under input of uniform distribution.
Fig. 5. Competitive ratio over various delay window sizes under input of Poisson distribution.
Fig. 6. Competitive ratio over various delay window sizes under input of Gaussian distribution.
Fig. 7. Competitive ratio over various delay window sizes under a specifically designed input.

B. The Cloud Scenario

We implemented the randomized algorithm in Algorithm 3 and the naive algorithm in Sec. V-A. They are evaluated under three types of inputs: uniform distribution, Poisson distribution and Gaussian distribution. We compare the costs of the two algorithms under these inputs, as shown in Fig. 8, Fig. 9 and Fig. 10, respectively. We observe that the randomized algorithm achieves a much lower cost than the naive algorithm, in particular with longer tolerable delays. For example, Fig. 8 shows that the randomized algorithm saves approximately 45% of the cost as compared with the naive algorithm when D = 5, and it saves more than 68% with a larger D. This suggests that longer tolerable delays provide the randomized algorithm more room for manoeuvre, leading to more evident cost reduction. We further investigate the influence of β, the ratio of intermediate data size to original data size. Results are shown in Fig. 11. When D is small, a large β causes a rather high cost. However, when a large D is used, even a large β only produces a relatively small cost.

Fig. 8. Comparison between the proposed randomized algorithm and the naive algorithm under input of uniform distribution.
Fig. 9. Comparison between the proposed randomized algorithm and the naive algorithm under input of Poisson distribution.
Fig. 10. Comparison between the proposed randomized algorithm and the naive algorithm under input of Gaussian distribution.
Fig. 11. Relationship between traffic cost and the parameters D and β.
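The single-ISP ratios can be reproduced in spirit by combining the earlier sketches (heuristic_smoothing and offline_optimum); the horizon length and delay values below are illustrative assumptions, not the exact experimental settings:

import random

def observed_competitive_ratio(W, D):
    # Empirical ratio: Heuristic Smoothing charge volume over the offline optimum of LP (9).
    online = max(heuristic_smoothing(W, D))
    offline = offline_optimum(W, D)
    return online / offline

W = [random.random() for _ in range(200)]        # normalized uniform input, as in Fig. 2
for D in (1, 5, 10, 20):
    print(D, round(observed_competitive_ratio(W, D), 3))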
VII. CONCLUSION

ISPs now charge big data applications under a new, interesting percentile-based model, leading to new online algorithm design problems for minimizing the traffic cost paid for uploading big data to the cloud. We studied two scenarios of such online algorithm design in this work. The Heuristic Smoothing algorithm is proposed for the single link case, with proven better performance than the best alternative in the literature, and a competitive ratio below 2. A randomized online algorithm is designed for the MapReduce framework, achieving a constant competitive ratio by employing Heuristic Smoothing as a building module. We have focused on MAX charge rules, and leave similar online algorithm design for 95-percentile charge rules as future work.

REFERENCES
[1] Amazon Elastic Compute Cloud, http://aws.amazon.com/ec2/.
[2] Linode, https://www.linode.com/speedtest/.
[3] Amazon EC2 Case Studies, http://aws.amazon.com/solutions/casestudies.

[4] E. E. Schadt, M. D. Linderman, J. Sorenson, L. Lee, and G. P. Nolan, "Computational Solutions to Large-scale Data Management and Analysis," Nature Reviews Genetics, vol. 11, no. 9, pp. 647-657, Sep. 2010.
[5] L. Golubchik, S. Khuller, K. Mukherjee, and Y. Yao, "To Send or not to Send: Reducing the Cost of Data Transmission," in Proc. of IEEE INFOCOM, 2013.
[6] L. Zhang, C. Wu, Z. Li, C. Guo, M. Chen, and F. Lau, "Moving Big Data to The Cloud: An Online Cost-Minimizing Approach," IEEE Journal on Selected Areas in Communications, vol. 31, no. 12, pp. 2710-2721, 2013.
[7] H. Wang, H. Xie, L. Qiu, A. Silberschatz, and Y. R. Yang, "Optimal ISP Subscription for Internet Multihoming: Algorithm Design and Implication Analysis," in Proc. of IEEE INFOCOM, 2005.
[8] S. Peak, "Beyond Bandwidth: The Business Case For Data Acceleration," White Paper, 2013.
[9] D. K. Goldenberg, L. Qiu, H. Xie, Y. R. Yang, and Y. Zhang, "Optimizing Cost and Performance for Multihoming," in Proc. of ACM SIGCOMM, 2004.
[10] A. Grothey and X. Yang, "Top-percentile Traffic Routing Problem by Dynamic Programming," Optimization and Engineering, vol. 12, pp. 631-655, 2011.
[11] F. Yao, A. Demers, and S. Shenker, "A Scheduling Model for Reduced CPU Energy," in Proc. of IEEE FOCS, 1995.
[12] N. Bansal, T. Kimbrel, and K. Pruhs, "Speed Scaling to Manage Energy and Temperature," Journal of the ACM, vol. 54, no. 1, pp. 3:1-3:39, Mar. 2007.
[13] S. Albers, F. Müller, and S. Schmelzer, "Speed Scaling on Parallel Processors," in Proc. of ACM SPAA, 2007.
[14] B. Bingham and M. Greenstreet, "Energy Optimal Scheduling on Multiprocessors with Migration," in Proc. of IEEE ISPA, 2008.
[15] E. Angel, E. Bampis, F. Kacem, and D. Letsios, "Speed Scaling on Parallel Processors with Migration," in Euro-Par Parallel Processing, ser. Lecture Notes in Computer Science, C. Kaklamanis, T. Papatheodorou, and P. Spirakis, Eds. Springer Berlin Heidelberg, 2012, vol. 7484, pp. 128-140.
[16] G. Greiner, T. Nonner, and A. Souza, "The Bell is Ringing in Speed-scaled Multiprocessor Scheduling," in Proc. of ACM SPAA, 2009.
[17] M. A. Adnan, Y. Ma, R. Sugihara, and R. Gupta, "Dynamic Deferral of Workload for Capacity Provisioning in Data Centers," http://arxiv.org/abs/1109.3839.
[18] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, "Data Centers Power Reduction: A Two Time Scale Approach for Delay Tolerant Workloads," in Proc. of IEEE INFOCOM, 2012.
[19] B. Cho and I. Gupta, "New Algorithms for Planning Bulk Transfer via Internet and Shipping Networks," in Proc. of IEEE ICDCS, 2010.
[20] M. Adler, R. K. Sitaraman, and H. Venkataramani, "Algorithms for Optimizing the Bandwidth Cost of Content Delivery," Computer Networks, vol. 55, no. 18, pp. 4007-4020, Dec. 2011.
[21] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, Jan. 2008.
[22] S. Rao, R. Ramakrishnan, A. Silberstein, M. Ovsiannikov, and D. Reeves, "Sailfish: A Framework for Large Scale Data Processing," Yahoo! Labs, Tech. Rep., 2012.
[23] B. Heintz, A. Chandra, and R. K. Sitaraman, "Optimizing MapReduce for Highly Distributed Environments," Department of Computer Science and Engineering, University of Minnesota, Tech. Rep., 2012.
[24] H. Becker and J. Riordan, "The Arithmetic of Bell and Stirling Numbers," American Journal of Mathematics, vol. 70, no. 2, pp. 385-394, 1948.