Cost Efficient Datacenter Selection for Cloud Services




Hong Xu, Baochun Li
{henryxu, bli}@eecg.toronto.edu
Department of Electrical and Computer Engineering, University of Toronto

Abstract

Many cloud services nowadays run on top of geographically distributed infrastructures for better reliability and performance. They need an effective way to direct user requests to a suitable datacenter in a cost efficient manner. Previous work focused mostly on the electricity cost of datacenters, and the resulting approaches favor datacenters at locations with cheaper electricity prices. In this paper, we augment the picture by considering another significant cost contributor: network bandwidth. We propose to utilize statistical multiplexing to strategically bundle demands at different locations. The anti-correlation between demands effectively smooths out the aggregate bandwidth usage, thereby saving the bandwidth cost calculated by burstable billing methods that charge the peak bandwidth usage. We present an optimization framework that models the realistic environment and practical constraints a cloud faces. We develop an efficient distributed algorithm based on dual decomposition and the subgradient method, and evaluate its effectiveness and practicality using real-world traffic traces and electricity costs.

I. INTRODUCTION

Internet-scale services are becoming essential to our everyday lives, with important applications including web search, video-on-demand, and file hosting. The emergence of cloud computing platforms, such as Amazon AWS [1], further enables rapid deployment of new services at scale. Almost all of these services are built atop geographically distributed infrastructures, i.e. datacenters located in different regions to provide better reliability and performance. They need an effective way to direct clients across the wide area to an appropriate datacenter. Usually, Internet-scale services handle datacenter selection by deploying mapping nodes, which are typically DNS servers as shown in Fig. 1, to customize the IP address(es) returned to different clients. Alternatively, they can also outsource datacenter selection to third parties [2], [3] or the cloud provider [4].

[Fig. 1. An example of a cloud service running atop a geographically distributed cloud infrastructure. Diagram labels: clients, mapping nodes (DNS), mapping, requests, datacenters.]

An efficient datacenter selection algorithm is imperative to the operation of cloud services. Many previous works exist in this area. The problem can be cast as an optimization that maximizes the system-wide performance subject to certain cost constraints. Existing works usually consider the electricity cost of running datacenters. By taking advantage of the geographic diversity of electricity prices, requests are directed in favor of datacenters with lower electricity prices, and costs can be reduced [5], [6].

In this paper, we consider another significant cost contributor to datacenters: wide-area network bandwidth [7]. We propose to utilize statistical multiplexing to strategically bundle demands from different mapping nodes. The intuition is that demands from different mapping nodes correlate with each other in different ways: some are positively correlated, i.e. they tend to peak at the same time, while some are negatively correlated, i.e. when demand from one node rises, demand from another node tends to decrease. By combining demands from negatively correlated nodes, the aggregate bandwidth required from a particular datacenter is smoothed out across time, thereby reducing the bandwidth cost, which is determined by burstable billing methods such as the 95-percentile billing that charges peak bandwidth usage [8]-[10].

To better illustrate the idea, Figs. 2 and 3 plot some sample demand data we collected from UUSee [11], a major online multimedia company in China. The red line corresponds to the 95-percentile bandwidth consumption, which amounts to 45 Mbps and 33 Mbps for node 1 and 2 respectively. If these two nodes were served by the same datacenter, the aggregate bandwidth consumption as shown in Fig. 4 is smoother than the individual curves. The 95-percentile of the aggregate bandwidth is around 7 Mbps, which is smaller than the sum of the individual 95-percentile values. This demonstrates the potential of multiplexing in terms of saving bandwidth consumption and cost.

[Fig. 2. Demand at node 1. Fig. 3. Demand at node 2. Fig. 4. Aggregate demand (node 1 + node 2). Axes in all three plots: time (epochs) vs. demand (Mbps).]

Our main contribution in this paper is a general optimization framework for cost efficient datacenter selection that takes into account both electricity and bandwidth costs. Our framework is general in the sense that it models practical environments that a cloud operates in. The utility abstraction encompasses many performance considerations, including throughput, latency, as well as possible fairness criteria. The electricity and bandwidth cost constraints capture the two most important ongoing costs associated with the operation of datacenters, i.e. operation expense (OPEX) [7]. Both the utility and the electricity price are location dependent in order to realistically model the geographic diversity.

By using dual decomposition, our optimization formulation can be decentralized to the datacenter level. Specifically, the problem can be decomposed into subproblems, each solvable by an individual datacenter itself. This enables us to develop efficient distributed implementations of our datacenter selection algorithm to find the optimal node-datacenter assignment based on the subgradient method. Our algorithms are also applicable to other request direction scenarios, such as a content distribution network (CDN) [12].

We evaluate the effectiveness of our decentralized implementation using real-world traffic traces collected from UUSee [11], as well as real-world electricity prices [13]. Results demonstrate that our algorithm saves the overall operating cost of datacenters while offering performance comparable to the vanilla bandwidth-agnostic solution.

II. AN OPTIMIZATION FRAMEWORK FOR COST EFFICIENT DATACENTER SELECTION

In this section, we present our optimization framework for cost-effective datacenter selection.

A. System Model

We start by introducing the system model. We consider a cloud infrastructure with M datacenters geographically distributed across the wide area. The cloud deploys N mapping nodes (e.g. DNS servers) at different locations to serve client requests. We use the terms mapping nodes and nodes interchangeably in the sequel. The requests at a particular node are directed to a subset of all the datacenters determined by the datacenter selection algorithm.
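
The multiplexing effect that motivates this framework can be checked numerically. The following Python sketch is illustrative only: the two demand series are synthetic (hypothetical sinusoids with opposite phase plus noise, not the UUSee traces), the 144-epoch daily period assumes ten-minute epochs, and `percentile_95` uses a simple nearest-rank rule rather than any particular provider's billing formula.

```python
import math
import random


def percentile_95(samples):
    """Return the 95th percentile using the nearest-rank rule."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def synthetic_demands(epochs=1000, seed=0):
    """Two hypothetical anti-correlated demand series in Mbps.

    Node 1 peaks when node 2 is in a trough (opposite phase).
    """
    rng = random.Random(seed)
    node1, node2 = [], []
    for t in range(epochs):
        phase = 2 * math.pi * t / 144  # assumed: 144 ten-minute epochs per day
        node1.append(300 + 150 * math.sin(phase) + rng.gauss(0, 10))
        node2.append(250 - 100 * math.sin(phase) + rng.gauss(0, 10))
    return node1, node2


node1, node2 = synthetic_demands()
aggregate = [a + b for a, b in zip(node1, node2)]

# Billing each node separately vs. billing the bundled demand.
separate = percentile_95(node1) + percentile_95(node2)
bundled = percentile_95(aggregate)
print(f"sum of individual 95th percentiles: {separate:.1f} Mbps")
print(f"95th percentile of the aggregate:   {bundled:.1f} Mbps")
```

For negatively correlated series such as these, the 95th percentile of the aggregate comes out below the sum of the individual 95th percentiles; as the correlation becomes more positive, the gap shrinks.
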
Since the request traffic fluctuates dynamically, the datacenter selection algorithm needs to be run periodically to optimize performance.

Let us introduce some notation. We consider an individual time epoch without loss of generality, and thus we drop the time subscript t. The bandwidth demand of node $i \in \{1, \dots, N\}$ is a random variable $D_i$ with mean $\mu_i$ and variance $\sigma_i^2$. The random demands $D = [D_1, \dots, D_N]^T$ may be correlated due to time difference and the natural correlation between viewer preferences, human behavior, etc. Let $\mu$ denote the $N \times 1$ mean demand matrix, or demand matrix in short, and let $\Sigma$ be the $N \times N$ covariance matrix. We assume that the cloud operator employs techniques such as those in our previous work [14], [15] to predict the demand matrix $\mu$ with satisfactory accuracy. The covariance matrix $\Sigma$ between demands at different nodes can also be predicted for the short-term future by using time series forecasting methods [14], [15].

We also assume that the electricity price $p_j$ at each datacenter $j$ is available at the beginning of an epoch, and remains static throughout the entire epoch. This is a practical assumption in today's electricity market in the U.S. If the local electricity market of datacenter $j$ is in a regulated utility region, the electricity price is fixed. If on the other hand the datacenter is in a deregulated market region, such as California and Texas, there is a forward market with settlements of various kinds, such as day-ahead and hour-ahead, for customers to lock in the price [6], [13]. The $M \times 1$ matrix $p = [p_j]$ is referred to as the price matrix.

We use an abstract utility notion $u_{ij}$ to capture the performance of the cloud service when a request from node $i$ is directed to datacenter $j$. This notion allows us a considerable amount of expressiveness. For example, if the cloud service is an interactive application and seeks minimal latency, $u_{ij}$ can be a decreasing function of the round trip time (RTT), directly measured or estimated by various means.
If the cloud service is a bulk transfer application and seeks good throughput, $u_{ij}$ can be a decreasing function of the network congestion level or the link utilization. It can incorporate fairness considerations by making use of the canonical alpha-fair utility functions [16]. For more discussion of the generality of the utility notion, one can refer to [3]. The $N \times M$ matrix $u = [u_{ij}]$ is the performance matrix, and the column vector $u_j = [u_{1j}, \dots, u_{Nj}]^T$ is the performance vector of datacenter $j$.

Finally, we use $w_{ij} \in [0, 1]$ to denote the proportion of traffic directed from node $i$ to datacenter $j$, and $w$ is the $N \times M$ datacenter selection matrix. $w_{ij}$ is the optimizing variable of our problem. Given $w$, we observe that the vector $w_j = [w_{1j}, \dots, w_{Nj}]^T$ represents the workload portfolio of datacenter $j$.

B. Modeling the 95-percentile Bandwidth

Note that in the model, we choose not to take into account the bandwidth price, since in reality this is often a fixed price across different regions. This is equivalent to considering only the aggregate bandwidth usage at individual datacenters. Given the datacenter selection matrix $w$, the aggregate bandwidth consumption of datacenter $j$ becomes a random variable

$$L_j = w_j^T D,$$

whose mean and variance are $w_j^T \mu$ and $w_j^T \Sigma w_j$, respectively. Suppose the peak bandwidth usage of datacenter $j$ is $A_j$. This implies that the bandwidth consumption of $j$ exceeds $A_j$ for a fraction $\epsilon$ of the time, where $\epsilon$ is a small positive constant. That is,

$$\Pr(L_j > A_j) = \epsilon, \quad \forall j.$$

For the 95-percentile charging model, $\epsilon = 0.05$. Note that this can also be interpreted as a QoS constraint, where the probability of bandwidth under-provisioning is bounded by $\epsilon$.

Through reasonable aggregation, $L_j$ approximately follows a Gaussian distribution by the central limit theorem. This has also been empirically verified using trace studies in previous work [14], [15]. Thus, the above constraint is equivalent to

$$w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} = A_j, \quad \forall j,$$

where $\theta = F^{-1}(1 - \epsilon)$ and $F(\cdot)$ is the CDF of the Gaussian distribution $\mathcal{N}(0, 1)$. For example, when $\epsilon = 0.05$, $\theta = 1.96$. The total billable bandwidth usage of the cloud is

$$\sum_j \Big( w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} \Big).$$

C. An Optimization Framework

Now we formally introduce our optimization framework. The datacenter selection problem at a particular epoch can be succinctly expressed as follows:

$$
\begin{aligned}
\textbf{DC-OPT:} \quad \max_{w} \quad & \sum_j w_j^T u_j && (1) \\
\text{s.t.} \quad & \sum_j \Big( w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} \Big) \le A, && (2) \\
& \beta \sum_j p_j \, w_j^T \mu \le B, && (3) \\
& C_{\min} \le w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} \le C_j, \quad \forall j, && (4) \\
& \sum_j w_{ij} = 1, \quad \forall i. && (5)
\end{aligned}
$$

The decision variables are $w_{ij}$, i.e. the proportion of requests directed to datacenter $j$ from node $i$. The objective (1) calculates the system-wide utility given by the datacenter selection matrix $w$ and the performance matrix $u$. Constraint (2) is the total bandwidth usage constraint, where $A$ is the bandwidth cap. Constraint (3) is the total electricity cost constraint. It enforces that the total electricity cost of serving all the requests should not exceed the budget $B$. $\beta$ is a conversion factor that converts workload in Gbps into electricity consumption in kWh.
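
The Gaussian estimate of the 95-percentile bandwidth that appears in constraints (2) and (4) can be evaluated directly for a candidate workload portfolio. The sketch below is a minimal illustration under stated assumptions: the function name `billable_bandwidth`, the two-node demand means, standard deviations and correlation values are all hypothetical and not taken from the paper. It computes the quantile multiplier with the standard library's `NormalDist().inv_cdf`; for a one-sided tail probability of 0.05 that call evaluates to roughly 1.645, whereas the value 1.96 quoted in the text is the 0.975 quantile, so the multiplier is left as a parameter here.

```python
import math
from statistics import NormalDist


def billable_bandwidth(w, mu, sigma, eps=0.05):
    """Gaussian estimate of the (1 - eps) quantile of L = w^T D.

    w     : traffic fractions sent from each node to this datacenter
    mu    : mean demand of each node (Mbps)
    sigma : covariance matrix of the demands, as a list of rows
    """
    theta = NormalDist().inv_cdf(1.0 - eps)  # one-sided standard normal quantile
    n = len(w)
    mean = sum(w[i] * mu[i] for i in range(n))
    var = sum(w[i] * sigma[i][k] * w[k] for i in range(n) for k in range(n))
    return mean + theta * math.sqrt(var)


# Hypothetical two-node instance with correlation coefficient rho.
mu = [300.0, 250.0]
std = [100.0, 80.0]


def covariance(rho):
    """Build the 2 x 2 covariance matrix for a given correlation rho."""
    cross = rho * std[0] * std[1]
    return [[std[0] ** 2, cross], [cross, std[1] ** 2]]


w_all = [1.0, 1.0]  # both nodes fully served by the same datacenter
for rho in (0.8, 0.0, -0.8):
    a = billable_bandwidth(w_all, mu, covariance(rho))
    print(f"rho = {rho:+.1f}: estimated 95-percentile = {a:.1f} Mbps")
```

The mean term is the same in all three cases; only the square-root term changes, which is why bundling negatively correlated demands lowers the estimated peak.
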
Constraint (4) represents both the load balancing and capacity constraints at individual datacenters, where $C_{\min}$ is the minimum load that each datacenter must achieve, and $C_j$ is the capacity of datacenter $j$. Constraint (5) corresponds to the simple fact that all the requests arriving at node $i$ should be served, i.e. our algorithm is work-conserving.

III. A DECENTRALIZED IMPLEMENTATION

The optimization problem DC-OPT is essentially a second-order cone program, and can be solved in polynomial time. However, this requires a central coordinator, which introduces a single point of failure and is vulnerable to attacks. Further, the computational complexity of solving the cone program increases significantly when the problem size scales up. A centralized solution is also less adaptive to sudden changes in traffic demand in a flash crowd scenario. Thus, for reasons of reliability, security, scalability, and performance, we are motivated to develop distributed solutions in which the datacenters iteratively solve the optimization problem.

A. Dual Decomposition

Relaxing constraints (2), (3), and (5), we obtain the Lagrangian of DC-OPT:

$$
L(w, \lambda, \nu, \eta) = \sum_j w_j^T u_j + \lambda \Big( A - \sum_j \big( w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} \big) \Big) + \nu \Big( B - \beta \sum_j p_j \, w_j^T \mu \Big) + \sum_i \eta_i \Big( 1 - \sum_j w_{ij} \Big),
$$

where $\lambda$, $\nu$, and $\eta = [\eta_1, \dots, \eta_N]^T$ are the Lagrange multipliers associated with the bandwidth usage, electricity cost, and work conservation constraints, respectively. The dual function is then

$$
g(\lambda, \nu, \eta) = \left\{ \begin{array}{ll} \max_{w} & L(w, \lambda, \nu, \eta) \\ \text{s.t.} & \text{constraint (4)}. \end{array} \right. \qquad (6)
$$

Evaluating $g(\lambda, \nu, \eta)$ is equivalent to maximizing the following objective,

$$
\sum_j \Big[ w_j^T \big( u_j - \lambda \mu - \nu \beta p_j \mu - \eta \big) - \lambda \theta \sqrt{w_j^T \Sigma w_j} \Big],
$$

where the constant terms in $L(w, \lambda, \nu, \eta)$ can be safely removed. The key observation here is that it can be decomposed into M per-datacenter maximization sub-problems:

$$
\begin{aligned}
\max_{w_j} \quad & w_j^T \big( u_j - \lambda \mu - \nu \beta p_j \mu - \eta \big) - \lambda \theta \sqrt{w_j^T \Sigma w_j} \\
\text{s.t.} \quad & \text{constraint (4)}.
\end{aligned} \qquad (7)
$$

The per-datacenter sub-problem naturally embodies an economic interpretation. Each datacenter strives to maximize the total utility of serving the requests, discounted by the costs of violating the bandwidth, electricity cost, and work conservation constraints, as priced by the Lagrange multipliers.
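
To see why the maximization separates across datacenters, the Lagrangian can be regrouped term by term. The derivation below is a sketch that writes $\lambda$, $\nu$ and $\eta_i$ for the multipliers of constraints (2), (3) and (5); these symbol names are a notational assumption. It uses only the identity $\sum_i \eta_i \sum_j w_{ij} = \sum_j \eta^T w_j$.

```latex
\begin{aligned}
L(w,\lambda,\nu,\eta)
 &= \sum_j w_j^{T}u_j
  + \lambda\Big(A-\sum_j\big(w_j^{T}\mu+\theta\sqrt{w_j^{T}\Sigma w_j}\big)\Big)
  + \nu\Big(B-\beta\sum_j p_j\,w_j^{T}\mu\Big)
  + \sum_i \eta_i\Big(1-\sum_j w_{ij}\Big) \\
 &= \sum_j\Big[\,w_j^{T}\big(u_j-\lambda\mu-\nu\beta p_j\mu-\eta\big)
  -\lambda\theta\sqrt{w_j^{T}\Sigma w_j}\,\Big]
  + \lambda A+\nu B+\sum_i\eta_i .
\end{aligned}
```

The trailing terms $\lambda A + \nu B + \sum_i \eta_i$ do not depend on $w$, and each bracketed term involves only the portfolio $w_j$ of a single datacenter. Since constraint (4) is also stated per datacenter, the maximization over $w$ splits into M independent problems.
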
The sub-problem is still a second-order cone program, but the problem size has been reduced: the per-datacenter sub-problem has only N variables and 2 constraints. In a typical production cloud, the number of mapping nodes N is on the order of hundreds, so the sub-problem can be solved efficiently by standard optimization solvers with the computing power of a datacenter.

B. A Distributed Algorithm

We have shown that the dual function of DC-OPT can be decomposed into M per-datacenter maximization problems, each of which is a smaller second-order cone program. Now we need to solve the dual problem

$$
\min \ g(\lambda, \nu, \eta) \quad \text{s.t.} \quad \lambda \ge 0, \ \nu \ge 0. \qquad (8)
$$

The subgradient method [17] can be used to solve the dual problem. The updating rules for the dual variables are as follows:

$$
\lambda^{(l+1)} = \Big[ \lambda^{(l)} + \alpha^{(l)} \Big( \sum_j \big( w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} \big) - A \Big) \Big]^+, \qquad (9)
$$

$$
\nu^{(l+1)} = \Big[ \nu^{(l)} + \gamma^{(l)} \Big( \beta \sum_j p_j \, w_j^T \mu - B \Big) \Big]^+, \qquad (10)
$$

$$
\eta_i^{(l+1)} = \eta_i^{(l)} + \delta^{(l)} \Big( \sum_j w_{ij} - 1 \Big), \quad \forall i, \qquad (11)
$$

where $[x]^+$ represents $\max\{0, x\}$, and $\alpha^{(l)}$, $\gamma^{(l)}$, $\delta^{(l)}$ are the step sizes. According to [17], the above procedure is guaranteed to converge as long as the following condition is satisfied.

Proposition 1: The subgradient updates as in (9)-(11) converge to the optimal dual variables if a diminishing step size rule is followed for choosing $\alpha^{(l)}$, $\gamma^{(l)}$, $\delta^{(l)}$ [17].

The dual variables $\lambda$, $\nu$, $\eta$ serve as price signals to coordinate the resource consumption and workload conservation. For example, when the 95-percentile bandwidth of all datacenters exceeds the bandwidth cap, i.e. $\sum_j \big( w_j^T \mu + \theta \sqrt{w_j^T \Sigma w_j} \big) > A$, the cloud increases its price $\lambda$ for the next iteration to suppress the excessive traffic. The process continues until it converges to the optimal resource allocation.

Dual optimization by the subgradient method can be done in a distributed fashion because of dual decomposition. First, in each iteration, the per-datacenter sub-problems (7) can be solved concurrently by individual datacenters. Second, subgradient updates can also be performed in a distributed manner by each datacenter and mapping node. Here $\lambda$ and $\nu$ need to be updated with global information from all datacenters. This can be done in a distributed way as follows, using $\nu$ as an example. Initially, the previous $\nu^{(l)}$ is made common knowledge among the datacenters. First, a datacenter is randomly chosen and given a token with the total budget $B$.
It calculates its own electricity cost of serving the requests, $\beta p_j w_j^T \mu$, and deducts this amount from $B$. It puts a mark in the token and passes it on to the next datacenter, which also updates the remaining budget, marks the token, and passes it further down. A datacenter determines that it is the last one in the loop by checking that everyone except itself has marked the token. It then updates the remaining budget, calculates the updated budget price $\nu^{(l+1)}$, and broadcasts it to each datacenter. Finally, $\eta_i$ can be updated by each mapping node, with $w_{ij}$ received from each datacenter.

Algorithm 1: Optimal Distributed DC-OPT Algorithm
1. Initialize $\lambda^{(0)}$ and $\nu^{(0)}$ to 0. Each node initializes $\eta_i^{(0)}$.
2. Each datacenter collects $\lambda^{(l)}$, $\nu^{(l)}$, and independently solves the per-datacenter sub-problem (7) using standard optimization solvers to obtain $w_j$, which is broadcast to each node.
3. Each node performs a subgradient update for $\eta_i^{(l)}$ as in (11). The updated $\eta_i^{(l+1)}$ is broadcast to the datacenters.
4. A datacenter is randomly chosen and given a token with the bandwidth cap $A$ and budget $B$.
5. The datacenter deducts its bandwidth usage and electricity cost from the remaining bandwidth cap and budget in the token, respectively, marks the token, and passes it down.
6. Repeat step 5 until the last datacenter calculates the final remaining bandwidth cap and budget, updates $\lambda^{(l)}$ and $\nu^{(l)}$ as in (9) and (10), and broadcasts them to every datacenter.
7. Return to step 2 until convergence.

The complete distributed algorithm is shown in Algorithm 1. Since it optimally solves the dual problem (8), it also optimally solves the primal problem DC-OPT, because the duality gap is zero for convex optimization problems that satisfy standard constraint qualifications such as Slater's condition.

Theorem 1: The distributed algorithm shown in Algorithm 1 always converges, and when it converges its solution optimally solves the datacenter selection problem DC-OPT.

IV. EVALUATION

We present our simulation studies in this section.

A. Setup

1) Demand matrix: To represent the request traffic for a cloud service, we use real-world traces collected from UUSee Inc. [11], a major online multimedia provider with servers deployed in different geographical regions in China. The dataset contains, among other information, the bandwidth demands for UUSee video programs sampled every 10 minutes, in a 12-day period during the 2008 Beijing Olympics. Although the scale of the UUSee infrastructure may not be as large as that of a cloud provider, we believe the traces faithfully reflect the demand distribution for a cloud service, and it is appropriate to use them for the purpose of benchmarking the performance of our datacenter selection algorithm. We assume that the prediction of the mean and covariance of traffic demands can be done accurately [14], [15], and in the simulation we simply adopt the predicted values for $\mu$ and $\Sigma$.

We use the traffic demands of distinct video channels to represent demands of distinct mapping nodes. We simulate a cloud with 1 mapping nodes. Since the data is collected every 10 minutes, the optimization epoch is also set to 10 minutes. The bandwidth cap $A$ is set to 2 Mbps. The minimum load of a datacenter $C_{\min}$ is 1 Mbps, and the capacity of datacenters $C_j$ is randomly drawn.

2) Datacenter placement and price matrix: To capture the location diversity of the cloud infrastructure and electricity market, we assume the datacenters are deployed across the continental U.S. For ease of exploration, we assume that there is one datacenter in a randomly chosen hub in each regional electricity market as shown in Fig. 5 [13]. We use the 2011 annual average day-ahead on-peak price ($/MWh) at these regions provided by the Federal Energy Regulatory Commission (FERC) as the electricity price for each datacenter, i.e. $p_j$, as summarized in Table I [13].

[Fig. 5. The U.S. electricity market and our cloud datacenter map. Source: FERC [13].]

TABLE I. 2011 ANNUAL AVERAGE DAY-AHEAD ON-PEAK PRICE ($/MWh) IN DIFFERENT REGIONAL MARKETS. SOURCE: FERC [13].

Region        Hub                              Price ($/MWh)
California    NP15                             35.83
Midwest       Michigan Hub                     42.73
New England   Mass Hub                         52.64
New York      NY Zone J                        62.71
Northwest     California-Oregon Border (COB)   32.57
PJM           PJM West                         51.99
Southeast     VACAR                            44.44
Southwest     Four Corners                     36.36
SPP           SPP North                        36.41
Texas         ERCOT North                      61.55

3) Performance matrix: We consider a utility function defined by the negative Euclidean distance between the mapping nodes and the datacenters. This definition instructs the algorithm to direct requests to datacenters in the geographical vicinity of a mapping node whenever possible, in an effort to minimize the transmission delay and optimize viewer experience. To calculate the performance matrix, we first obtain the longitude and latitude of ten counties, one near each of the ten hubs, as the exact locations of our datacenters in the U.S. We then randomly choose another 1 counties as the locations of the 1 mapping nodes. All the location information is obtained from [18]. The Euclidean distance between any given pair of mapping node and datacenter can then be readily calculated, which constitutes the performance matrix $u$.

Without loss of generality, we assume that $\beta$ = .1, i.e. serving 1 Mbps per epoch of 10 minutes consumes .1 kWh of electricity. The budget $B$ is set to $4 per epoch.

4) Benchmark: Finally, as the benchmark for the performance of DC-OPT, we use a bandwidth-agnostic datacenter selection scheme that shares the same objective function (1) and constraints (3)-(5), but does not consider the bandwidth usage, i.e. constraint (2). This problem is also a second-order cone program and can be efficiently solved. It is referred to as Benchmark in the following.

B. Effectiveness

We evaluate the effectiveness of our distributed datacenter selection algorithm. Fig. 6 shows the 95-percentile bandwidth consumption of each datacenter for a 1-epoch period of time. We observe that, compared to the bandwidth-agnostic benchmark, DC-OPT reduces the bandwidth usage of most datacenters by 15%-20% by intelligently mixing negatively correlated demands. One may notice that datacenters 4, 6, 8, and 10 have the same bandwidth usage under both algorithms. This is due to the unattractive electricity price and performance at these locations. From Table I, we observe that datacenters 4, 6, and 10 have the highest electricity prices among all locations. Also, from our performance matrix we observe that datacenter 8 is far away from many of the nodes. This prevents both DC-OPT and Benchmark from directing requests to these locations beyond the minimum load of 1 Mbps required by the load balancing constraint (4).

Fig. 7 shows the average utility comparison between DC-OPT and Benchmark. We observe that DC-OPT has a slightly worse average utility across time. The reason for the inferior performance is that, in order to reduce the bandwidth usage, DC-OPT sometimes needs to direct requests to locations that are not necessarily the closest, but are more bandwidth efficient because these demands effectively smooth out the aggregate traffic. By the same token, DC-OPT has to sacrifice the electricity cost in order to satisfy the bandwidth usage constraint. This is illustrated in Fig. 8.
The electricity cost is on average around 5%-10% higher than Benchmark. Note that both DC-OPT and Benchmark violate the cost constraint during peak hours, when demand rises to the point that this constraint becomes infeasible, during epochs 7-85. The average performance in this period is also relatively worse, as seen in Fig. 7.

[Fig. 6. 95-percentile bandwidth comparison (Mbps per datacenter, DC-OPT vs. Benchmark). Fig. 7. Average utility comparison (km over time in epochs, DC-OPT vs. Benchmark). Fig. 8. Total cost comparison ($ over time in epochs, DC-OPT vs. Benchmark).]

The results show that there is an inherent trade-off between bandwidth usage and performance/electricity cost. Essentially, DC-OPT strives to be more bandwidth efficient, and achieves a different operating point on the trade-off curve. According to [7], electricity and wide-area bandwidth each account for around 15% of the datacenter costs. Thus the overall cost of DC-OPT is reduced, while the performance is comparable to when bandwidth usage is not considered. DC-OPT represents a more favorable solution than bandwidth-agnostic datacenter selection schemes for cloud operators.

V. RELATED WORK

The topic of datacenter selection and load direction for a geo-distributed cloud has started to gain attention in the research community. Qureshi et al. [5] introduced the intuitive idea of utilizing the location diversity of electricity spot prices to intelligently direct requests to datacenters with lower prices. Wendell et al. [3] developed a decentralized datacenter selection algorithm for cloud services, and evaluated its performance using a prototype and realistic traffic traces. Rao et al. [6] considered a joint load balancing and power control problem for Internet datacenters to exploit the time and location diversity of electricity prices. Liu et al. [19] specifically considered the effect of geographical load balancing on providing environmental gains by encouraging the use of green energy. Agarwal et al. [20] studied a complementary problem of data placement in a geo-distributed cloud, considering data locality. These works, however, do not consider bandwidth usage in their problem formulations.

Our work relies on the idea of multiplexing demands with different degrees of correlation to reduce the peak aggregate demand of datacenters. A similar idea has been proposed in some recent works [14], [15], where a bandwidth reservation service in the cloud is envisioned for VoD applications, and multiplexing is utilized to reduce the total bandwidth reservation for a given level of QoS. Here we consider a more general setting where multiplexing is used for reducing the operating cost of the cloud. Another recent work [21] discusses correlation-aware power optimization in datacenters. Its focus is on local-area traffic in a datacenter network whose correlation statistics change frequently, while our approach deals with wide-area egress traffic of the datacenter. We also take into account the geographical diversity of electricity cost, which is not considered in these works.

VI. CONCLUDING REMARKS

In this paper, we presented a general optimization framework that considers bandwidth usage and electricity costs to solve the datacenter selection problem for cloud services. Our idea is to exploit the different degrees of correlation between demands at different locations to reduce the peak demand of aggregate traffic at datacenters, thereby reducing the billing amount of wide-area bandwidth. We adopted a dual decomposition approach to solve the second-order cone program, and developed a distributed algorithm based on the subgradient method to iteratively achieve the optimal datacenter selection solution in a decentralized fashion. Simulation results with real-world traces and electricity prices show that our algorithm reduces the 95-percentile bandwidth usage by 20%, and offers performance and electricity cost comparable to bandwidth-agnostic solutions.

Our work can be extended in many directions. One possible direction is to consider online datacenter selection that makes decisions on the fly with sequentially arriving requests, which is more difficult than the offline problem we solve in this paper.

REFERENCES

[1] Amazon Web Services (AWS), http://aws.amazon.com/.
[2] DynDNS, http://dyn.com/dns/.
[3] P. Wendell, J. W. Jiang, M. J. Freedman, and J. Rexford, "DONAR: Decentralized server selection for cloud services," in Proc. ACM SIGCOMM, 2010.
[4] Amazon AWS Elastic Load Balancing, http://aws.amazon.com/elasticloadbalancing/.
[5] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, "Cutting the electricity bill for Internet-scale systems," in Proc. ACM SIGCOMM, 2009.
[6] L. Rao, X. Liu, L. Xie, and W. Liu, "Minimizing electricity cost: Optimization of distributed Internet data centers in a multi-electricity-market environment," in Proc. IEEE INFOCOM, 2010.
[7] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, "The cost of a cloud: Research problems in data center networks," SIGCOMM Comput. Commun. Rev., vol. 39, no. 1, pp. 68-73, 2009.
[8] X. Dimitropoulos, P. Hurley, A. Kind, and M. P. Stoecklin, "On the 95-percentile billing method," in Proc. PAM, 2009.
[9] N. Laoutaris, M. Sirivianos, X. Yang, and P. Rodriguez, "Inter-datacenter bulk transfers with NetStitcher," in Proc. ACM SIGCOMM, 2011.
[10] D. Xu and X. Liu, "Geographic trough filling for Internet datacenters," in Proc. IEEE INFOCOM, 2012.
[11] UUSee Inc., http://www.uusee.com/.
[12] H. Alzoubi, S. Lee, M. Rabinovich, O. Spatscheck, and J. van der Merwe, "A practical architecture for an anycast CDN," ACM Trans. Web, vol. 5, no. 4, October 2011.
[13] Federal Energy Regulatory Commission, "U.S. electric power markets," http://www.ferc.gov/market-oversight/mkt-electric/overview.asp, 2011.
[14] D. Niu, H. Xu, B. Li, and S. Zhao, "Risk management for video-on-demand servers leveraging demand forecast," in Proc. ACM Multimedia, 2011.
[15] D. Niu, H. Xu, B. Li, and S. Zhao, "Quality-assured cloud bandwidth auto-scaling for video-on-demand applications," in Proc. IEEE INFOCOM, 2012.
[16] J. Mo and J. Walrand, "Fair end-to-end window-based congestion control," IEEE/ACM Trans. Netw., vol. 8, no. 5, pp. 556-567, October 2000.
[17] S. Boyd and A. Mutapcic, "Subgradient methods," Lecture notes of EE364b, Stanford University, Winter Quarter 2006-2007. http://www.stanford.edu/class/ee364b/notes/subgrad_method_notes.pdf.
[18] WebGIS, http://www.webgis.com/.
[19] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. Andrew, "Greening geographical load balancing," in Proc. ACM Sigmetrics, 2011.
[20] S. Agarwal, J. Dunagan, N. Jain, S. Saroiu, A. Wolman, and H. Bhogan, "Volley: Automated data placement for geo-distributed cloud services," in Proc. USENIX NSDI, 2010.
[21] X. Wang, Y. Yao, X. Wang, K. Lu, and Q. Cao, "CARPO: Correlation-aware power optimization in data center networks," in Proc. IEEE INFOCOM, 2012.