Data Center Energy Cost Minimization: a Spatio-Temporal Scheduling Approach

2013 Proceedings IEEE INFOCOM

Jianying Luo
Dept. of Electrical Engineering, Stanford University
jyluo@stanford.edu

Lei Rao, Xue Liu
School of Computer Science, McGill University
{leirao,xueliu}@cs.mcgill.ca

Abstract

Cloud computing is supported by an infrastructure known as the Internet data center (IDC). As cloud computing thrives, the energy consumption and energy cost of IDCs are growing rapidly. There is growing interest in energy cost minimization for IDCs in deregulated electricity markets. In this paper we study how to leverage both geographic and temporal variation of energy price to minimize energy cost for distributed IDCs. To this end, we propose a novel spatio-temporal load balancing approach. Using real-life electricity price and workload traces, extensive evaluations demonstrate that the proposed spatio-temporal load balancing approach significantly reduces energy cost for distributed IDCs.

I. INTRODUCTION

Recent years have witnessed the vast expansion of cloud computing [1]. Internet data centers (IDCs), the infrastructure that supports cloud-computing services, are developing rapidly. As cloud computing thrives, the energy cost and energy consumption of IDCs are growing with it. It is estimated that energy-related costs may amount to 4.6% of operational cost of large-scale IDCs [2]. Therefore, curbing energy cost has become very important for IDC operators, e.g., Amazon, Facebook, Google, and Microsoft.

Emerging concurrently with IDCs are smart grids, the modernized electric grids. Smart grids have facilitated the transition of electricity markets into deregulated markets with dynamic pricing [3, 4]. There is growing interest in how to operate distributed IDCs and manage energy cost in emerging deregulated electricity markets. Most existing works tackle the energy cost minimization problem for IDCs by spatial load balancing [5, 6, 7, 8], temporal load balancing [9, 10], or energy storage [11, 12].
This paper studies how to leverage both geographic and temporal variation of electricity price to minimize energy cost for IDCs while guaranteeing a service completion time for cloud-computing user requests. We propose a novel spatio-temporal load balancing approach for distributed IDCs. In Fig. 1 we illustrate an architecture that supports load scheduling in both the spatial and the temporal domain. Fig. 1(a) shows the portion for load balancing in the space domain: from a portal server to an IDC site. Fig. 1(b) shows the portion for load balancing in the time domain: queueing a user request for some time and dispatching it to execution at a later time. A central workload scheduler manages the queueing of user requests and their dispatching to execution on IDC sites.

(This work was done when Dr. Lei Rao was at McGill University. Dr. Rao is now with General Motors Research Labs. E-mail: lei.rao@gm.com.)

The main contribution of this paper is twofold. First, we study an important research problem of energy cost minimization for distributed IDCs in deregulated electricity markets. To address this problem, we propose a novel spatio-temporal load balancing approach that exploits both geographic and temporal variation of electricity price. Second, extensive evaluations based on real-life data demonstrate that our proposed approach can significantly reduce energy cost and guarantee a service completion time for user requests.

The rest of the paper is organized as follows. Section II models the dispatching and execution of user requests for cloud-computing service, and formulates the energy cost minimization problem for distributed IDCs. Section III uses real-life electricity price and workload traces to evaluate the efficacy of our approach. Section IV discusses the relevant work in the literature. Finally, Section V concludes the paper.

II. PROBLEM FORMULATION

In this section, we first describe the portal servers and the distributed IDC system.
Next we explain the workload of user requests, the electricity price, the workload queue, and the workload scheduler. Then we give the queueing delay constraint for workload, and the capacity constraint for IDC sites. We further formulate the energy cost minimization problem for the distributed IDC system. We introduce the notations used throughout the paper in Table I.

[Fig. 1: Architecture for Spatio-Temporal Load Balancing. (a) Load Balancing in Space Domain; (b) Load Balancing in Time Domain.]

TABLE I: Notations

Notation                        Definition
$J$                             total number of front-end portal servers
$S_j$                           front-end portal server $j$
$Q_j$                           workload scheduler queue at $S_j$
$D$                             the time-slot equivalent of the service completion time bound $D_{sct}$
$l_j[k]$                        user request workload arriving at $S_j$ in time slot $k$
$l[k]$                          total user request workload arriving at all front-end portal servers in time slot $k$
$L[k]$                          total user request workload arriving at all front-end portal servers in time slots 0 through $k$
$I$                             total number of IDC sites
$\mathrm{IDC}_i$                IDC site $i$
$C_i$                           capacity of $\mathrm{IDC}_i$
$C$                             total capacity of all IDC sites
$p_i[k][d]$                     in time slot $k$, real-life electricity price (or estimated electricity price) on $\mathrm{IDC}_i$ in time slot $k$ for $d = 0$ (or in future time slot $k+d$ for $1 \le d \le D$)
$\lambda_{j,i}[k][d]$           dispatch from $Q_j$ to $\mathrm{IDC}_i$ in time slot $k$ for $d = 0$ (or in future time slot $k+d$ for $1 \le d \le D$)
$\lambda_{\cdot,i}[k][d]$       dispatch from all workload queues to $\mathrm{IDC}_i$ in time slot $k$ for $d = 0$ (or in future time slot $k+d$ for $1 \le d \le D$)
$\lambda_{\cdot,\cdot}[k][d]$   dispatch from all workload queues to all IDC sites in time slot $k$ for $d = 0$ (or in future time slot $k+d$ for $1 \le d \le D$)
$\Lambda_{\cdot,\cdot}[k][d]$   in time slot $k$, total workload dispatched from all workload queues to all IDC sites in time slots 0 through $k$, plus total estimated workload to dispatch from all workload queues to all IDC sites in all future time slots $k+t$, $1 \le t \le d$
$\eta$                          amount of energy to execute one unit of workload

A. Front-end portal server, back-end distributed IDC system

A cloud-computing service provider usually operates a group of front-end portal servers and a back-end distributed IDC system. Each portal server aggregates user requests originating from a service area. The distributed IDC system comprises a number of geographically separate IDC sites. In our modeling, there are $J$ portal servers $S_j$, $\forall j$ ($0 \le j < J$ unless stated otherwise), and $I$ geographically separate IDC sites $\mathrm{IDC}_i$, $\forall i$ ($0 \le i < I$ unless stated otherwise). We model the portal servers and IDC sites as a discrete-time system evolving over a sequence of equal-length time slots.

B. Workload of user requests, electricity price, workload queue, and workload scheduler

In time slot $k$, $l_j[k]$ is the workload of user requests arriving at portal server $S_j$, and $l[k]$ is the total workload arriving at all portal servers: $l[k] = \sum_{j=0}^{J-1} l_j[k]$. In a time slot, $\mathrm{IDC}_i$ cannot serve more workload than its capacity, denoted as $C_i$. $C$ denotes the total capacity of the distributed IDC system: $C = \sum_{i=0}^{I-1} C_i$. In time slot $k$, the total workload arriving at all portal servers is assumed to be no more than the total capacity of the distributed IDC system: $l[k] \le C$, $\forall k$.
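As a small illustration of the bookkeeping above, the following Python sketch aggregates per-portal arrivals into the total workload per time slot, accumulates them over time, and checks the assumption that the total arrival in any slot does not exceed the total capacity. The function names and the list-of-lists layout (`l[j][k]` for portal server j in slot k) are choices made for this example only and do not come from the paper.

```python
from itertools import accumulate
from typing import List, Sequence


def total_arrivals(l: Sequence[Sequence[float]]) -> List[float]:
    """Total workload per slot: sum of l[j][k] over all portal servers j."""
    return [sum(per_slot) for per_slot in zip(*l)]


def cumulative_arrivals(total: Sequence[float]) -> List[float]:
    """Running sum of the total workload over slots 0..k (Table I's L[k])."""
    return list(accumulate(total))


def within_total_capacity(l: Sequence[Sequence[float]],
                          capacities: Sequence[float]) -> bool:
    """Check the modeling assumption: total arrival in every slot <= sum of C_i."""
    c_total = sum(capacities)
    return all(x <= c_total for x in total_arrivals(l))


if __name__ == "__main__":
    # Hypothetical toy data: 3 portal servers, 2 time slots.
    l = [[3, 1], [2, 2], [1, 0]]
    print(total_arrivals(l))
    print(cumulative_arrivals(total_arrivals(l)))
    print(within_total_capacity(l, [2, 3, 1]))
```

The check only concerns the aggregate; per-site capacity limits enter later through the scheduler's constraints.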
In time slot $k$, we collect the electricity price $p_i[k][d]$, $\forall d$ ($0 \le d \le D$ unless stated otherwise), on the $\mathrm{IDC}_i$ site: $p_i[k][0]$ is the real-life price on the $\mathrm{IDC}_i$ site in time slot $k$, and $p_i[k][d]$, $1 \le d \le D$, is the estimated price on the $\mathrm{IDC}_i$ site in future time slot $k+d$. $\eta$ is the amount of energy consumed to execute one unit of workload.

As shown in Fig. 1(b), at each $S_j$ a FIFO queue $Q_j$ (referred to as the workload queue) stores arriving user requests at its tail. A central scheduler performs the spatio-temporal load balancing. The scheduler decides the workload, denoted as $\lambda_{j,i}[k][0]$, to dispatch from $Q_j$ to $\mathrm{IDC}_i$, $\forall j, \forall i$, in time slot $k$, and the estimated workload, denoted as $\lambda_{j,i}[k][d]$, to dispatch from $Q_j$ to $\mathrm{IDC}_i$, $\forall j, \forall i$, in future time slot $k+d$, $\forall d$, $1 \le d \le D$. Fig. 1(a) shows that in time slot $k$, the scheduler dispatches the amount $\lambda_{j,i}[k][0]$ of workload from $Q_j$ to $\mathrm{IDC}_i$, $\forall j, \forall i$.

A user request has a service completion time bound $D_{sct}$. Due to space limitations, we discuss our design for the case in which all user requests require the same service completion time bound. To accommodate more than one service delay bound, for reasons such as priority or different levels of tolerance to delay, the design can be instantiated as one unit per service delay bound: the capacity of the IDC sites is partitioned into slices, one slice per service delay bound, and the workload dispatched from each design unit is entitled to its own slice of the capacity.

The large-scale deployment of commodity computers in data centers has catalyzed application parallelization: monolithic jobs are replaced by functionally equivalent small tasks that are mapped onto worker computers and executed in a shorter amount of time [13]. Therefore, we assume that a user request can be decomposed into small user tasks; each such task can be executed on an IDC site in a time slot; and the tasks forming a user request may be executed on multiple IDC sites and in multiple time slots.
We assume that user requests are received at the beginning of a time slot; the scheduler then dispatches user tasks at the head of $Q_j$ to execution on $\mathrm{IDC}_i$, $\forall j, \forall i$. The service completion time for user tasks comprises the queueing time at $Q_j$, the transport time from $Q_j$ to $\mathrm{IDC}_i$, and the execution time on $\mathrm{IDC}_i$. The scheduler may queue user tasks until a future time slot with a cheaper electricity price. As the interval over which the electricity price changes is much greater than the transport time and the execution time of user tasks, the queueing time dominates the service completion time. If user tasks are queued for no more than $D$ time slots, the time-slot equivalent of $D_{sct}$, their requirement for service completion time is deemed satisfied.

C. Queueing delay constraint for workload, and capacity constraint for IDC sites

The spatio-temporal load balancing is subject to a queueing delay constraint for workload and a capacity constraint for each IDC site. A cumulative arrival function $L[k]$ counts the total workload that has been received by all portal servers in time slots 0 through $k$:

$$L[k] = \sum_{t=0}^{k} l[t].$$

A cumulative departure function $\Lambda_{\cdot,\cdot}[k][d]$ counts the total workload either dispatched by the workload scheduler to execution in time slots 0 through $k-1$, or to be dispatched to execution in time slots $k$ through $k+d$, $\forall d$:

$$\lambda_{\cdot,i}[k][d] = \sum_{j=0}^{J-1} \lambda_{j,i}[k][d], \qquad \lambda_{\cdot,\cdot}[k][d] = \sum_{i=0}^{I-1} \lambda_{\cdot,i}[k][d],$$

$$\Lambda_{\cdot,\cdot}[k][d] = \sum_{t=0}^{k-1} \lambda_{\cdot,\cdot}[t][0] + \sum_{t=0}^{d} \lambda_{\cdot,\cdot}[k][t], \qquad \forall i, \forall d.$$

The queueing delay for user tasks is bounded by $D$ time slots:

$$L[k - D + d] \le \Lambda_{\cdot,\cdot}[k][d], \qquad \forall d,\ 0 \le d \le D, \qquad (1)$$

$$L[k] = \Lambda_{\cdot,\cdot}[k][D]. \qquad (2)$$

Due to the capacity constraint for each IDC site, the workload and the estimated workload to dispatch from all workload queues to $\mathrm{IDC}_i$ must not exceed the capacity of $\mathrm{IDC}_i$ in time slots $k$ and $k+d$, $\forall d$, $1 \le d \le D$, respectively:

$$\lambda_{\cdot,i}[k][d] \le C_i, \qquad \forall i, \forall d.$$

D. Energy cost minimization

The workload scheduler aims at minimizing the estimated total energy cost to execute all user tasks currently in the workload queues throughout the immediate $(D+1)$ time slots starting from the current time slot. The energy cost to execute the amount $\lambda_{\cdot,\cdot}[k][0]$ of workload on all IDC sites in time slot $k$ is

$$EC[k][0] = \sum_{i=0}^{I-1} \lambda_{\cdot,i}[k][0]\, \eta\, p_i[k][0]. \qquad (3)$$

The energy cost to execute the estimated amount $\lambda_{\cdot,\cdot}[k][d]$ of workload on all IDC sites in time slot $k+d$ is

$$EC[k][d] = \sum_{i=0}^{I-1} \lambda_{\cdot,i}[k][d]\, \eta\, p_i[k][d], \qquad \forall d,\ 1 \le d \le D. \qquad (4)$$

Therefore, the estimated total energy cost in time slots $k$ through $k+D$ is

$$EC[k] = \sum_{d=0}^{D} EC[k][d]. \qquad (5)$$

We formulate the following optimization problem, named OPT, to obtain the minimum value of $EC[k]$:

$$\begin{aligned}
\min_{\lambda_{j,i}[k][d]} \quad & EC[k] \\
\text{s.t.} \quad & L[k - D + d] \le \Lambda_{\cdot,\cdot}[k][d], \quad \forall d,\ 0 \le d \le D, \\
& L[k] = \Lambda_{\cdot,\cdot}[k][D], \\
& \lambda_{\cdot,i}[k][d] \le C_i, \quad \forall i, \forall d, \\
& \lambda_{j,i}[k][d] \ge 0, \quad \forall i, \forall j, \forall d.
\end{aligned}$$

In time slot $k$, the scheduler solves the OPT problem, and dispatches the amount $\lambda_{j,i}[k][0]$ of workload from $Q_j$ to execution on $\mathrm{IDC}_i$, $\forall j, \forall i$.

III. PERFORMANCE EVALUATION

In this section, we first describe the experimental setup. Next we assess the energy cost saving achieved by the proposed spatio-temporal load balancing approach. Then we study the energy cost per time slot, and the queueing delay for user tasks.

A. Experimental setup

Our experiment settings include portal servers and workload of user requests, a distributed IDC system and real-life electricity price, load balancing schemes for comparison, and electricity price estimation.
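Before turning to the individual settings, it may help to see how a candidate schedule for the OPT problem of Section II-D could be priced and checked in such an experiment. The Python sketch below evaluates the objective of Eqs. (3)-(5) and tests the capacity and queueing delay constraints for the decisions made in one time slot. It does not solve OPT; it only evaluates a given schedule. The function names, the aggregate per-site layout `lam_site[i][d]`, the argument `dispatched_before` (total workload dispatched in slots before k), and the convention that the cumulative arrival is zero before slot 0 are all assumptions made for this example.

```python
from typing import Sequence


def energy_cost(lam_site: Sequence[Sequence[float]],
                price: Sequence[Sequence[float]],
                eta: float) -> float:
    """Estimated total cost EC[k]: sum over sites i and offsets d of
    lam_site[i][d] * eta * price[i][d], as in Eqs. (3)-(5)."""
    return sum(lam_site[i][d] * eta * price[i][d]
               for i in range(len(lam_site))
               for d in range(len(lam_site[i])))


def is_feasible(lam_site: Sequence[Sequence[float]],
                capacities: Sequence[float],
                cum_arrivals: Sequence[float],
                k: int,
                dispatched_before: float,
                tol: float = 1e-9) -> bool:
    """Check nonnegativity, per-site capacity, and delay constraints (1)-(2)
    for the schedule chosen in slot k over offsets d = 0..D."""
    D = len(lam_site[0]) - 1
    # Nonnegativity and capacity: 0 <= lam_site[i][d] <= C_i.
    for i, row in enumerate(lam_site):
        if any(x < -tol or x > capacities[i] + tol for x in row):
            return False
    # Cumulative departure through slot k+d must cover arrivals through k-D+d.
    cum = dispatched_before
    for d in range(D + 1):
        cum += sum(row[d] for row in lam_site)
        t = k - D + d
        arrived = cum_arrivals[t] if t >= 0 else 0.0  # assumption: nothing before slot 0
        if cum + tol < arrived:
            return False
    # Everything received through slot k is scheduled by slot k+D.
    return abs(cum - cum_arrivals[k]) <= tol


if __name__ == "__main__":
    # Hypothetical toy instance: 2 sites, D = 1, current slot k = 1.
    lam = [[2.0, 0.0], [0.0, 1.0]]
    price = [[30.0, 50.0], [40.0, 20.0]]
    print(energy_cost(lam, price, eta=1.0))
    print(is_feasible(lam, [2.0, 2.0], [4.0, 6.0], k=1, dispatched_before=3.0))
```

Because the objective and all constraints are linear in the dispatch variables, a schedule that minimizes this cost could in principle be obtained with any linear programming solver; the sketch deliberately leaves the solver out.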
[Fig. 2: Real-Life Enterprise IDC Workload, and Generated Workload of User Requests Originating from Three Areas (Normalized to $C_{ref}$). (a) Real-Life Enterprise IDC Workload; (b) Generated Workload of User Requests from Three Areas (Eastern US, Central US, Western US). Vertical axis: Normalized Workload.]

1) Portal servers, and workload of user requests: In the experiments, three portal servers $S_j$, $0 \le j \le 2$, respectively receive user requests submitted from three areas: the eastern US, the central US, and the western US. We obtain a 24-hour real-life workload, denoted as $l_{ref}[k]$, at an enterprise production data center [14]. A time-slot interval is 15 minutes, a common divisor of the electricity price intervals on all IDC sites. Fig. 2(a) shows $l_{ref}[k]$ normalized to $C_{ref}$, the capacity of the reference data center. To model the workload arriving at these portal servers, we generate three user-request workload sequences $l_j[k]$, $0 \le j \le 2$, by two transforms on the reference data center workload: first, we use the time in the Eastern Time Zone for time-keeping, and therefore delay $l_{ref}[k]$ by 0, 1, and 3 hours for the eastern, the central, and the western US areas respectively; second, we scale the magnitude of the three intermediate workload functions by factors of 3, 2, and 1, which are approximately in line with the population in these three areas. Fig. 2(b) shows the resultant three user-request workload sequences normalized to $C_{ref}$.

2) A distributed IDC system, and real-life electricity price: In the experiments, a distributed IDC system consists of three geographically separate IDC sites $\mathrm{IDC}_i$, $0 \le i \le 2$. These sites model three Internet data centers that Google operates in Atlanta, GA, Houston, TX, and Mountain View, CA. The capacity of each site is as follows: $C_0 = 2\,C_{ref}$, $C_1 = 3\,C_{ref}$, and $C_2 = C_{ref}$. We retrieve the real-life electricity price data for Atlanta, Houston, and Mountain View [3, 4].
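The two workload transforms described in item 1) above (a time delay per area, then a magnitude scaling) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the delayed 24-hour trace is handled at the day boundary, so the sketch wraps it around circularly; the delays are expressed in time slots assuming four slots per hour; and the function name and toy reference trace are hypothetical.

```python
from typing import List, Sequence


def generate_workloads(l_ref: Sequence[float],
                       delays_in_slots: Sequence[int] = (0, 4, 12),
                       scales: Sequence[float] = (3, 2, 1)) -> List[List[float]]:
    """Build one workload sequence per area from a reference trace.

    Each sequence is the reference trace delayed by the given number of
    slots (circular wrap-around is an assumption of this sketch) and
    scaled by the given factor.
    """
    n = len(l_ref)
    sequences = []
    for delay, scale in zip(delays_in_slots, scales):
        sequences.append([scale * l_ref[(k - delay) % n] for k in range(n)])
    return sequences


if __name__ == "__main__":
    # Hypothetical 4-slot reference trace, with delays of 0, 1, and 3 slots.
    toy = generate_workloads([0, 1, 2, 3], delays_in_slots=(0, 1, 3))
    for seq in toy:
        print(seq)
```

With the default arguments, delays of 0, 4, and 12 slots correspond to 0, 1, and 3 hours, and the scale factors 3, 2, and 1 follow the text.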
Atlanta has a regulated electricity market, where the electricity price for industrial customers does not change within a month. Houston and Mountain View have deregulated electricity markets, where the real-time price changes on 15-minute and 1-hour interval bases respectively. We adjust the price data so that they refer to the time in the Eastern Time Zone. Fig. 3 illustrates the 24-hour real-life electricity price sequences $p_i[k][0]$, $0 \le i \le 2$, on these IDC sites on May 2, 2009.

[Fig. 3: Real-Life Electricity Price in Atlanta, Houston, and Mountain View on May 2, 2009. Vertical axis: Electricity Price ($/MWh).]

3) Load balancing schemes for comparison: We compare the following three load balancing schemes.
(a) Spatial load balancing (Spatial LB): it exploits only the geographic variation of electricity price. A user request received by a portal server is dispatched to execution on one, some, or all IDC sites in the same time slot in which it is received.
(b) Temporal load balancing (Temporal LB): it exploits only the temporal variation of electricity price and schedules workload received by a portal server to execute on a pre-determined IDC site with a service completion time bound of $D$ time slots. A portal server is paired with the IDC site whose capacity is the closest to the magnitude of the received workload. Specifically, a user request received by $S_0$ is dispatched to $\mathrm{IDC}_1$; that received by $S_1$ is dispatched to $\mathrm{IDC}_0$; and that received by $S_2$ is dispatched to $\mathrm{IDC}_2$.
(c) Spatio-temporal load balancing (Spatio-Temporal LB): it exploits both geographic and temporal variation of electricity price. A user request received by a portal server in time slot $k$ is executed on one, some, or all IDC sites during the immediate $(D+1)$ time slots.

4) Electricity price estimation: For each site $\mathrm{IDC}_i$, $0 \le i \le 2$, we generate an estimated electricity price sequence $p_i[k][d]$ such that the rank of each term in the sequence $p_i[k][d]$ matches that of the corresponding term in the sequence $p_i[k+d][0]$, $\forall d, \forall k$. Using a proof similar to Theorem 4 in [10], we can show that the scheduler produces the same load scheduling result with the estimated electricity price sequence $p_i[k][d]$ as with the real-life electricity price sequence $p_i[k+d][0]$.

B. Energy cost minimization

Using the aforementioned electricity price, we assess the total energy cost to schedule the 24-hour workload received by the portal servers to execute on the IDC sites. The service completion time bound $D$ varies from 1 (15-min) to 80 (20-hour). Fig. 4 plots the total energy cost normalized to $(C_{ref}\,\eta)$ in the three load balancing schemes. We make the following three observations:
(a) Spatial LB is superior to Temporal LB for all service completion time bounds $D \le 80$ (20-hour). With $D = 20$ (5-hour) and 80 (20-hour), Temporal LB respectively costs 25.2% and 4.2% more than Spatial LB for energy.
Thus Spatial LB is more effective in reducing energy cost than Temporal LB.
(b) Spatio-Temporal LB is superior to Spatial LB by a greater margin as $D$ gets larger. With $D = 20$ (5-hour) and 80 (20-hour), Spatio-Temporal LB respectively costs 2.% and 43.2% less than Spatial LB for energy. The temporal scheduling part in Spatio-Temporal LB contributes to this cost saving.
(c) The electricity price at a location does not change very frequently or dramatically between two contiguous time slots. As a result, exploiting the temporal variation of price does not significantly reduce energy cost if the service completion time bound is short, e.g., $D = 4$ (1-hour). Therefore, both Spatio-Temporal LB and Temporal LB are particularly suitable for delay-tolerant user requests, such as MapReduce batch jobs.

[Fig. 4: Total Energy Cost vs. Service Completion Time Bound D in Three Load Balancing Schemes (note that D = 0 in Spatial LB). Horizontal axis: Service Completion Time Bound D (15 min); vertical axis: Normalized Total Energy Cost ($).]

[Fig. 5: Energy Cost Per Time Slot in Three Load Balancing Schemes. Vertical axis: Normalized Energy Cost Per Time Slot ($).]

Next we assess the energy cost per time slot. Fig. 5 shows the energy cost, normalized to $(C_{ref}\,\eta)$, on all IDC sites per time slot in the three schemes with $D = 20$ (5-hour): Spatial LB features a smooth energy cost per time slot; Temporal LB shows abrupt rises and falls; Spatio-Temporal LB exhibits a volatile energy cost per time slot, although its magnitude is mostly less than that in Temporal LB. The area under each curve is the total energy cost to execute the 24-hour workload in each scheme. It is clear that the energy cost incurred by Temporal LB is the highest, and that incurred by Spatio-Temporal LB is the lowest.

C. Queueing delay for user tasks

Fig. 6 presents the cumulative distribution function (CDF) of queueing delay for user tasks on all IDC sites in Temporal LB and Spatio-Temporal LB with $D = 20$ (5-hour).
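As a rough sketch of how such a delay distribution could be computed from a schedule, the Python function below replays per-slot arrivals and per-slot dispatches through a single FIFO queue and reports, for each delay value, the fraction of dispatched workload that waited at most that many slots. It is an illustration only: the paper keeps one queue per portal server, whereas this sketch aggregates everything into one queue, treats workload as a divisible fluid, and uses a function name and data layout invented for the example.

```python
from collections import deque
from typing import Dict, Sequence


def fifo_delay_cdf(arrivals: Sequence[float],
                   dispatched: Sequence[float],
                   eps: float = 1e-12) -> Dict[int, float]:
    """Map each observed delay (in slots) to the fraction of dispatched
    workload whose queueing delay is at most that value, under FIFO service."""
    queue = deque()          # entries are [arrival_slot, remaining_workload]
    served_by_delay: Dict[int, float] = {}
    for k, (a, s) in enumerate(zip(arrivals, dispatched)):
        if a > eps:
            queue.append([k, a])
        # Serve the oldest workload first until this slot's dispatch is used up.
        while s > eps and queue:
            slot, remaining = queue[0]
            take = min(remaining, s)
            served_by_delay[k - slot] = served_by_delay.get(k - slot, 0.0) + take
            s -= take
            if remaining - take <= eps:
                queue.popleft()
            else:
                queue[0][1] = remaining - take
    total = sum(served_by_delay.values())
    if total <= eps:
        return {}
    cdf: Dict[int, float] = {}
    running = 0.0
    for delay in sorted(served_by_delay):
        running += served_by_delay[delay]
        cdf[delay] = running / total
    return cdf


if __name__ == "__main__":
    # Hypothetical toy trace: 4 units arrive in slot 0 and are dispatched over 3 slots.
    print(fifo_delay_cdf([4, 0, 0], [1, 1, 2]))
```

In the toy trace, one unit is served immediately, one after one slot, and two after two slots, so half of the workload waits at most one slot.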
We make the following three observations:
(a) Both schemes satisfy the queueing delay constraint.
(b) Due to the experiment setup, Spatio-Temporal LB has a larger average queueing delay than Temporal LB: as the electricity price does not vary over time slots on $\mathrm{IDC}_0$, Temporal LB dispatches the workload received by portal server $S_1$, which accounts for one third of the total workload, to execution on $\mathrm{IDC}_0$ with little queueing delay.
(c) As the queueing delay approaches 20 time slots (5-hour), the CDF for Spatio-Temporal LB rises rapidly. This is a result of temporal scheduling: the heavy workload received by all portal servers after time slot 54 is mostly dispatched to execution on IDC in time slots as distant as possible from time slot 72, which has the highest electricity price.

[Fig. 6: Queueing Delay Distribution in Temporal LB and Spatio-Temporal LB. Vertical axis: CDF of Queueing Delay.]

IV. RELATED WORK

Energy management for IDCs is an active research topic. Existing works mostly aim at reducing energy consumption in IDCs. In [15] Liu et al. presented an overview of challenges toward power management in IDCs. In [16] Andrews et al. studied the trade-off between server energy usage and network queueing. In [17] Xu et al. proposed trough filling for distributed IDCs to achieve energy efficiency.

Recently, energy cost minimization for IDCs has attracted much attention. To tackle this problem, most existing works utilize spatial load balancing, temporal load balancing, or energy storage. In [6] Qureshi et al. focused on the evaluation of electricity price data. In [5] Rao et al. proposed a spatial load balancing scheme for distributed IDCs. In [7, 8] Liu et al. and Le et al. proposed spatial scheduling approaches and promoted renewable energy. In [9] Yao et al. used a stochastic optimization approach. In [10] Luo et al. proposed a temporal load balancing scheme for an IDC. In [11, 12] Urgaonkar et al. and Guo et al. used energy storage to save energy cost. Different from these works, we propose a spatio-temporal load balancing approach in this paper.

A key performance requirement in IDCs is service delay. Most existing works use average queueing delay to evaluate the IDC service delay performance [5, 9]. Recent studies on real-life production systems showed that there is a long tail in the processing delay for user requests; the user requests experiencing the longest delay significantly degrade the user experience [18, 19]. Thus it is necessary to provide a service delay bound for all user requests.
We incorporate a queueing delay constraint into the formulated problem, and guarantee a service completion time for user requests in distributed IDCs.

V. CONCLUSION

Internet data centers (IDCs) incur huge energy cost. Minimizing energy cost for IDC operations has recently attracted much attention. In this paper, we study how to leverage both geographic and temporal variation of electricity price to minimize energy cost for distributed IDCs. We propose a novel spatio-temporal load balancing approach. Extensive evaluations demonstrate that the proposed spatio-temporal load balancing approach achieves significant energy cost saving compared to the schemes using either spatial load balancing or temporal load balancing alone.

ACKNOWLEDGMENTS

This work was supported in part by NSERC Discovery Grant 34823.

REFERENCES

[1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "A view of cloud computing," Commun. ACM, vol. 53, pp. 50-58, April 2010.
[2] J. Hamilton, "Cooperative expendable micro-slice servers (CEMS): Low cost, low power servers for internet-scale services," Jan. 2009.
[3] United States Energy Information Administration, Dept. of Energy, http://www.eia.doe.gov.
[4] United States Federal Energy Regulatory Commission, http://www.ferc.gov.
[5] L. Rao, X. Liu, L. Xie, and W. Liu, "Minimizing electricity cost: Optimization of distributed internet data centers in a multi-electricity-market environment," in INFOCOM, 2010 Proceedings IEEE, March 2010, pp. 1-9.
[6] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, "Cutting the electric bill for internet-scale systems," in Proceedings of the ACM SIGCOMM 2009 conference on Data communication, ser. SIGCOMM '09, 2009, pp. 123-134.
[7] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. Andrew, "Greening geographical load balancing," SIGMETRICS Perform. Eval. Rev., vol. 39, pp. 193-204, June 2011.
[8] K. Le, R. Bianchini, T. Nguyen, O. Bilgir, and M. Martonosi, "Capping the brown energy consumption of internet services at low cost," in Green Computing Conference, 2010 International, Aug. 2010, pp. 3-14.
[9] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, "Data centers power reduction: A two time scale approach for delay tolerant workloads," in INFOCOM, 2012 Proceedings IEEE, March 2012, pp. 1431-1439.
[10] J. Luo, L. Rao, and X. Liu, "eco-IDC: Trade delay for energy cost with service delay guarantee for internet data centers," in Cluster Computing (CLUSTER), 2012 IEEE International Conference on, Sept. 2012, pp. 45-53.
[11] R. Urgaonkar, B. Urgaonkar, M. J. Neely, and A. Sivasubramaniam, "Optimal power cost management using stored energy in data centers," in Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, ser. SIGMETRICS '11, 2011, pp. 221-232.
[12] Y. Guo and Y. Fang, "Electricity cost saving strategy in data centers by using energy storage," IEEE Transactions on Parallel and Distributed Systems, vol. 99, no. PrePrints, 2012.
[13] J. Diaz, C. Munoz-Caro, and A. Nino, "A survey of parallel programming models and tools in the multi and many-core era," Parallel and Distributed Systems, IEEE Transactions on, vol. PP, no. 99, p. 1, 2012.
[14] D. Gmach, J. Rolia, L. Cherkasova, and A. Kemper, "Workload analysis and demand prediction of enterprise data center applications," in Workload Characterization, 2007. IISWC 2007. IEEE 10th International Symposium on, Sept. 2007, pp. 171-180.
[15] J. Liu, F. Zhao, X. Liu, and W. He, "Challenges towards elastic power management in internet data centers," in Proceedings of the 2009 29th IEEE International Conference on Distributed Computing Systems Workshops, ser. ICDCSW '09, 2009, pp. 65-72.
[16] M. Andrews, S. Antonakopoulos, and L. Zhang, "Energy-aware scheduling algorithms for network stability," in INFOCOM, 2011 Proceedings IEEE, April 2011, pp. 1359-1367.
[17] D. Xu and X. Liu, "Geographic trough filling for internet datacenters," in INFOCOM, 2012 Proceedings IEEE, March 2012, pp. 2881-2885.
[18] S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, "An analysis of traces from a production MapReduce cluster," Cluster Computing and the Grid, IEEE International Symposium on, vol. 0, pp. 94-103, 2010.
[19] D. Ersoz, M. Yousif, and C. Das, "Characterizing network traffic in a cluster-based, multi-tier data center," in Distributed Computing Systems, 2007. ICDCS '07. 27th International Conference on, June 2007, p. 59.