How Much Power Oversubscription is Safe and Allowed in Data Centers?
|
|
|
- Vivien Cummings
- 10 years ago
- Views:
Transcription
1 How Much Power Oversubscription is Safe and Allowed in Data Centers? Xing Fu 1,2, Xiaorui Wang 1,2, Charles Lefurgy 3 1 University of Tennessee, Knoxville 2 The Ohio State University 3 IBM Research, Austin
2 Introduction Power: a first-class constraint in data center design Power oversubscription by power capping Improves power facility utilization Improves server performance Power capping at different levels Servers, racks, and data centers However, they all share a common assumption Power should never exceed the rated power capacity? Otherwise the circuit breaker (CB) would trip? Not really! circuit breakers can sustain short overloads. 2
3 How Much Power Subscription is Safe? A CB trips or not depends on Magnitude of the overload Duration of the overload Ideal upper bound? Lower bound of the tolerance band This paper Investigates CB trip features Proposes adaptive power control to 1. Fully utilizes the allowed overload interval for maximized server perf. 2. Safely hosts more servers without upgrading power facilities 3 Two minutes Trip curve of a typical 1.17 Rated rated circuit breaker capacity
4 Proposed Solution: CB-Adaptive More than just a standalone controller A methodology that adapts the parameters of existing power controllers to engineer their settling times Example: adapts a server power controller [Lefurgy ICAC 07] 1. Obtain the tripping time from the CB tripping curve 2. The desired settling time should be the tripping time 3. Adapt controller parameter K to enforce the settling time Error Power Cap K Error CPU frequency Power measured Server Adapt controller parameter K 4
5 CB-Adaptive Design Details System model p( k + 1) = p( k) + Ad( k) p(k) is the power of the server d(k) is the change to the CPU frequency A is a hardware-specific parameter when the server runs LINPACK How to adapt the controller parameter? The relationship between the parameter and the settling time The parameter is a function of the measured power, the rated current of CB, and the control period. 5
6 Two CB-Adaptive Improvements Temperature-aware CB-Adaptive The CB trip curve is impacted by the ambient temperature. The rated current of CB is a linear function of the temperature. K is also a function of ambient temperature. CB-Proactive Delicately increases DVFS level in a proactive way Further improves the server performance When and to what extent the DVFS level is increased? CB enters the long-delay region The rated current Increase the frequency to the highest level Temperature 6
7 Discussion on Power Oversubscription Possible applications of CB-Adaptive Hosting additional servers Safety issues A typical power delivery system Every component can tolerate overloads like CBs Overload capacity: power beyond which permanent damage occurs to the component 7
8 More Discussion Components other than CBs do not experience overloads frequently. It is less likely that many servers reach their peak power simultaneously. Evidenced by a real Google data center [Fan ISCA 07] When only a branch circuit is overloaded CB-Adaptive can be applied directly When multiple branch circuits are overloaded CB-Adaptive needs to consider the tripping time of components other than CBs. 8
9 Hardware Testbed Dell OptiPlex 380 Rockwell Allen- Bradley 1489-A Industrial CB Workloads SPEC CPU2006 SPEC JBB LINPACK 9
10 Baselines NoControl Estimates the peak power consumption of a server P-Control No power caps Unsafe and conservative Measures the power in every control period A non-adaptive proportional controller calculates frequency changes to enforce a power budget. P-Control-CB The power budget is different from that of P-Control Upper bound of the long-delay region of the CB 10
11 Power Consumption (Watt) Po w er C o n su m p tio n (W att) Power Consumption (Watt) Power Consumption (Watt) Power Control Comparison NoControl Rated Limit Long-Delay Upper Limit Control Period (5sec) P-Control-CB Long-Delay Upper Limit P-Control Control Period (5sec) Converge to 80W after hours NoControl P-Control CB-Adaptive Rated Limit Long-Delay Upper Limit Control Period (5sec) Converge to 80W CB-Proactive Rated Lim it Long-Delay Upper Lim it Control Period (5sec) NoControl causes the CB trips. Unsafe P-Control & P-Control-CB Unsafe and conservative CB-Adaptive fully utilizes overload intervals of CBs. Raise CPU freq for higher performance 11
12 CB-Adaptive CB-Proactive Performance Comparison CB-Adaptive outperforms P-Control by 66%, for LINPACK 29 % to 49%, for SPEC CPU %, for SPEC JBB LINPACK Performance (Gflops) 0.0 P-Control P-Control-CB SPECJBB Performance (bops) LINPACK SPECJBB gcc bzip2 perlbench h264ref libquantum sjeng hmmer gobmk mcf P-Control CB-Adaptive CB-Proactive sphinx3 wrf lbm tonto GemsFDTD calculix povray soplex dealii namd leslie3d cactusadm gromacs zeusmp milc gamess bwaves xalancbmk astar omnetpp SPEC2006 benchmarks Performance (Base Ratio)
13 Impact of Temperature Trip Time (Sec) NoControl (21.7 ) NoControl (26.4 ) NoControl (31.6 ) NoControl (34.8 ) P-Control-CB (45 ) CB-Adaptive (45 ) CB-Proactive (45 ) Temperature (degree Celsius) Temperature impacts the trip time significantly. Temperature-blind solutions P-Control-CB, CB- Adaptive and CB-Proactive are not safe. 13
14 Temperature-Aware CB-Adaptive As the temperature increases, the performance of servers decreases. The performance decrease is modest. 14
15 Power Provisioning Analysis NoControl The number of servers = Rated power of the CB estimated server power The estimation is too conservative 7 servers hosted per branch P-Control Enforce a power budget instead of an estimation of power 13 servers hosted per branch CB-Adaptive Enforce a higher power budget than P-Control 20 servers hosted per branch 15
16 Conclusions A common assumption of existing power capping Peak power should never exceed the rated CB capacity This paper Systematically studies the CB tripping characteristics Identifies ideal upper bound of safe power oversubscription Proposes two adaptive power control strategies Evaluation on safe power oversubscription A single server: 38% performance improvement Circuit branch: host 54% more servers without upgrading power infrastructure 16
17 Questions? Acknowledgements NSF CAREER Award CNS NSF CSR Grant CNS NSF SHF Grant CCF Prof. Leon Tolbert at the University of Tennessee Thank you! 17
18 BackUp 18
19 Control Theoretic Analysis How to adapt the controller parameter? m K = A Details of the derivation Z transform of the system model Z-domain controller Calculate the close loop transfer function Reverse Z transform 19
20 Power Provisioning Analysis UPS cannot tolerate overloads Not a problem because each UPS run at 50% its capacity Factors limiting overload capacities 20
21 21
Achieving QoS in Server Virtualization
Achieving QoS in Server Virtualization Intel Platform Shared Resource Monitoring/Control in Xen Chao Peng ([email protected]) 1 Increasing QoS demand in Server Virtualization Data center & Cloud infrastructure
Computer Architecture
Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 6 Fundamentals in Performance Evaluation Computer Architecture Part 6 page 1 of 22 Prof. Dr. Uwe Brinkschulte,
An OS-oriented performance monitoring tool for multicore systems
An OS-oriented performance monitoring tool for multicore systems J.C. Sáez, J. Casas, A. Serrano, R. Rodríguez-Rodríguez, F. Castro, D. Chaver, M. Prieto-Matias Department of Computer Architecture Complutense
CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate:
CPU Performance Evaluation: Cycles Per Instruction (CPI) Most computers run synchronously utilizing a CPU clock running at a constant clock rate: Clock cycle where: Clock rate = 1 / clock cycle f = 1 /C
A-DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters
A-DRM: Architecture-aware Distributed Resource Management of Virtualized Clusters Hui Wang, Canturk Isci, Lavanya Subramanian, Jongmoo Choi, Depei Qian, Onur Mutlu Beihang University, IBM Thomas J. Watson
Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 64 Architecture
Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 64 Architecture Dong Ye David Kaeli Northeastern University Joydeep Ray Christophe Harle AMD Inc. IISWC 2006 1 Outline Motivation
When Prefetching Works, When It Doesn t, and Why
When Prefetching Works, When It Doesn t, and Why JAEKYU LEE, HYESOON KIM, and RICHARD VUDUC, Georgia Institute of Technology In emerging and future high-end processor systems, tolerating increasing cache
Compiler-Assisted Binary Parsing
Compiler-Assisted Binary Parsing Tugrul Ince [email protected] PD Week 2012 26 27 March 2012 Parsing Binary Files Binary analysis is common for o Performance modeling o Computer security o Maintenance
Cache Capacity and Memory Bandwidth Scaling Limits of Highly Threaded Processors
Cache Capacity and Memory Bandwidth Scaling Limits of Highly Threaded Processors Jeff Stuecheli 12 Lizy Kurian John 1 1 Department of Electrical and Computer Engineering, University of Texas at Austin
secubt : Hacking the Hackers with User-Space Virtualization
secubt : Hacking the Hackers with User-Space Virtualization Mathias Payer Department of Computer Science ETH Zurich Abstract In the age of coordinated malware distribution and zero-day exploits security
Linear-time Modeling of Program Working Set in Shared Cache
Linear-time Modeling of Program Working Set in Shared Cache Xiaoya Xiang, Bin Bao, Chen Ding, Yaoqing Gao Computer Science Department, University of Rochester IBM Toronto Software Lab {xiang,bao,cding}@cs.rochester.edu,
Memory Bandwidth Management for Efficient Performance Isolation in Multi-core Platforms
.9/TC.5.5889, IEEE Transactions on Computers Memory Bandwidth Management for Efficient Performance Isolation in Multi-core Platforms Heechul Yun, Gang Yao, Rodolfo Pellizzoni, Marco Caccamo, Lui Sha University
Analysis of Memory Sensitive SPEC CPU2006 Integer Benchmarks for Big Data Benchmarking
Analysis of Memory Sensitive SPEC CPU2006 Integer Benchmarks for Big Data Benchmarking Kathlene Hurt and Eugene John Department of Electrical and Computer Engineering University of Texas at San Antonio
HQEMU: A Multi-Threaded and Retargetable Dynamic Binary Translator on Multicores
H: A Multi-Threaded and Retargetable Dynamic Binary Translator on Multicores Ding-Yong Hong National Tsing Hua University Institute of Information Science Academia Sinica, Taiwan [email protected]
Practical Memory Checking with Dr. Memory
Practical Memory Checking with Dr. Memory Derek Bruening Google [email protected] Qin Zhao Massachusetts Institute of Technology qin [email protected] Abstract Memory corruption, reading uninitialized
Delivering on the Promise of Universal Memory for Spin-Transfer Torque RAM (STT-RAM)
Delivering on the Promise of Universal Memory for Spin-Transfer Torque RAM (STT-RAM) Anurag Nigam, Clinton W. Smullen, IV, Vidyabhushan Mohan, 3 Eugene Chen, Sudhanva Gurumurthi, and Mircea R. Stan ECE
Understanding the Behavior of In-Memory Computing Workloads
Understanding the Behavior of In-Memory Computing Workloads Tao Jiang, Qianlong Zhang, Rui Hou, Lin Chai, Sally A. Mckee, Zhen Jia, and Ninghui Sun SKL Computer Architecture, ICT, CAS, Beijing, China University
Building a More Efficient Data Center from Servers to Software. Aman Kansal
Building a More Efficient Data Center from Servers to Software Aman Kansal Data centers growing in number, Microsoft has more than 10 and less than 100 DCs worldwide Quincy Chicago Dublin Amsterdam Japan
Characterizing Processor Thermal Behavior
Characterizing Processor Thermal Behavior Francisco J. Mesa-Martínez Ehsan K.Ardestani Jose Renau Department of Computer Engineering University of California Santa Cruz http://masc.cse.ucsc.edu Abstract
DATA centers often comprise thousands of enterprise
LeakageAware Cooling Management for Improving Server Energy Efficiency Marina Zapater, Ozan Tuncer, José L. Ayala, José M. Moya, Kalyan Vaidyanathan, Kenny Gross and Ayse K. Coskun Abstract The computational
Benchmarking the Amazon Elastic Compute Cloud (EC2)
Benchmarking the Amazon Elastic Compute Cloud (EC2) A Major Qualifying Project Report submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE in partial fulfillment of the requirements for the
SHIP: Scalable Hierarchical Power Control for Large-Scale Data Centers
SHIP: Scalable Hierarchical Power Control for Large-Scale Data Centers Xiaorui Wang, Ming Chen Charles Lefurgy, Tom W. Keller University of Tennessee, Knoxville, TN 37996 IBM Research, Austin, TX 78758
Dynamic Power Variations in Data Centers and Network Rooms
Dynamic Power Variations in Data Centers and Network Rooms By Jim Spitaels White Paper #43 Revision 2 Executive Summary The power requirement required by data centers and network rooms varies on a minute
Statistical Profiling-based Techniques for Effective Power Provisioning in Data Centers
Statistical Profiling-based Techniques for Effective Power Provisioning in Data Centers Sriram Govindan, Jeonghwan Choi, Bhuvan Urgaonkar, Anand Sivasubramaniam, Andrea Baldini Penn State, KAIST, Tata
Dynamic Power Variations in Data Centers and Network Rooms
Dynamic Power Variations in Data Centers and Network Rooms White Paper 43 Revision 3 by James Spitaels > Executive summary The power requirement required by data centers and network rooms varies on a minute
Power Budgeting for Virtualized Data Centers
Power Budgeting for Virtualized Data Centers Harold Lim Duke University Aman Kansal Microsoft Research Jie Liu Microsoft Research Abstract Power costs are very significant for data centers. To maximally
FACT: a Framework for Adaptive Contention-aware Thread migrations
FACT: a Framework for Adaptive Contention-aware Thread migrations Kishore Kumar Pusukuri Department of Computer Science and Engineering University of California, Riverside, CA 92507. [email protected]
Leveraging Thermal Storage to Cut the Electricity Bill for Datacenter Cooling
Leveraging Thermal Storage to Cut the Electricity Bill for Datacenter Cooling Yefu Wang1, Xiaorui Wang1,2, and Yanwei Zhang1 ABSTRACT The Ohio State University 14 1 1 8 6 4 9 8 Time (1 minuts) 7 6 4 3
Memory Access Control in Multiprocessor for Real-time Systems with Mixed Criticality
Memory Access Control in Multiprocessor for Real-time Systems with Mixed Criticality Heechul Yun +, Gang Yao +, Rodolfo Pellizzoni *, Marco Caccamo +, Lui Sha + University of Illinois at Urbana and Champaign
Case study: End-to-end data centre infrastructure management
Case study: End-to-end data centre infrastructure management Situation: A leading public sector organisation suspected that their air conditioning units were not cooling the data centre efficiently. Consequently,
IT@Intel. Comparing Multi-Core Processors for Server Virtualization
White Paper Intel Information Technology Computer Manufacturing Server Virtualization Comparing Multi-Core Processors for Server Virtualization Intel IT tested servers based on select Intel multi-core
Run-time Resource Management in SOA Virtualized Environments. Danilo Ardagna, Raffaela Mirandola, Marco Trubian, Li Zhang
Run-time Resource Management in SOA Virtualized Environments Danilo Ardagna, Raffaela Mirandola, Marco Trubian, Li Zhang Amsterdam, August 25 2009 SOI Run-time Management 2 SOI=SOA + virtualization Goal:
Comparing the Carbon Footprints of 11G and 12G Rack Servers from Dell
Comparing the Carbon Footprints of 11G and 12G Rack Servers from Dell Markus Stutz, Regulatory Principal Engineer, Environmental Affairs August 2013. Contents Comparing two rack server generations... 4
Fine-Grained User-Space Security Through Virtualization. Mathias Payer and Thomas R. Gross ETH Zurich
Fine-Grained User-Space Security Through Virtualization Mathias Payer and Thomas R. Gross ETH Zurich Motivation Applications often vulnerable to security exploits Solution: restrict application access
MCCCSim A Highly Configurable Multi Core Cache Contention Simulator
MCCCSim A Highly Configurable Multi Core Cache Contention Simulator Michael Zwick, Marko Durkovic, Florian Obermeier, Walter Bamberger and Klaus Diepold Lehrstuhl für Datenverarbeitung Technische Universität
Solve your IT energy crisis WITH An energy SMArT SoluTIon FroM Dell
Solve your IT energy crisis WITH AN ENERGY SMART SOLUTION FROM DELL overcome DATA center energy challenges IT managers share a common and pressing problem: how to reduce energy consumption and cost without
Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure
White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This
Measuring Power in your Data Center
RESEARCH UNDERWRITER WHITE PAPER LEAN, CLEAN & GREEN Raritan Measuring Power in your Data Center Herman Chan, Director, Power Solutions Business Unit, and Greg More, Senior Product Marketing Manager There
Measuring Energy Efficiency in a Data Center
The goals of the greening of the Data Center are to minimize energy consumption and reduce the emission of green house gases (carbon footprint) while maximizing IT performance. Energy efficiency metrics
DELL VS. SUN SERVERS: R910 PERFORMANCE COMPARISON SPECint_rate_base2006
DELL VS. SUN SERVERS: R910 PERFORMANCE COMPARISON OUR FINDINGS The latest, most powerful Dell PowerEdge servers deliver better performance than Sun SPARC Enterprise servers. In Principled Technologies
Computer Science 146/246 Homework #3
Computer Science 146/246 Homework #3 Due 11:59 P.M. Sunday, April 12th, 2015 We played with a Pin-based cache simulator for Homework 2. This homework will prepare you to setup and run a detailed microarchitecture-level
Virtualizing Performance Asymmetric Multi-core Systems
Virtualizing Performance Asymmetric Multi- Systems Youngjin Kwon, Changdae Kim, Seungryoul Maeng, and Jaehyuk Huh Computer Science Department, KAIST {yjkwon and cdkim}@calab.kaist.ac.kr, {maeng and jhhuh}@kaist.ac.kr
Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis
White Paper Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and IBM FlexSystem Enterprise Chassis White Paper March 2014 2014 Cisco and/or its affiliates. All rights reserved. This document
Building the Energy-Smart Datacenter
Building the Energy-Smart Datacenter How Dell PowerEdge servers, services, tools and partnerships can help customers save money by saving energy. As businesses grow, they need more computational capacity.
Data Center Power Distribution at the Cabinet Level: The top 12 questions you need to ask to improve data center energy effciency
Data Center Power Distribution at the Cabinet Level: The top 12 questions you need to ask to improve data center energy effciency White Paper STI-100-010 November 2013 HEADQUARTERS - NORTH AMERICA Server
Measuring Power in your Data Center: The Roadmap to your PUE and Carbon Footprint
Measuring Power in your Data Center: The Roadmap to your PUE and Carbon Footprint Energy usage in the data center Electricity Transformer/ UPS Air 10% Movement 12% Cooling 25% Lighting, etc. 3% IT Equipment
Energy Savings in the Data Center Starts with Power Monitoring
Energy Savings in the Data Center Starts with Power Monitoring As global competition intensifies, companies are increasingly turning to technology to help turn mountains of data into a competitive edge.
Paying to Save: Reducing Cost of Colocation Data Center via Rewards
Paying to Save: Reducing Cost of Colocation Data Center via Rewards Mohammad A. Islam, Hasan Mahmud, Shaolei Ren, Xiaorui Wang* Florida International University *The Ohio State University Supported in
SPARC64 VII Fujitsu s Next Generation Quad-Core Processor
SPARC64 VII Fujitsu s Next Generation Quad-Core Processor August 26, 2008 Takumi Maruyama LSI Development Division Next Generation Technical Computing Unit Fujitsu Limited High Performance Technology High
Towards Energy Efficient Query Processing in Database Management System
Towards Energy Efficient Query Processing in Database Management System Report by: Ajaz Shaik, Ervina Cergani Abstract Rising concerns about the amount of energy consumed by the data centers, several computer
Sizing Circuit Breakers for Self-Regulating Heat Tracers
Sizing Circuit Breakers for Self-Regulating Heat s Effective Resistance (ohm/m) Introduction Circuit breaker sizing for S/R tracers is quite easily accomplished by using the specification sheet data or
Considerations for a Highly Available Intelligent Rack Power Distribution Unit. A White Paper on Availability
Considerations for a Highly Available Intelligent Rack Power Distribution Unit A White Paper on Availability Introduction Data centers are currently undergoing a period of great change. Data center managers
E-cubed: End-to-End Energy Management of Virtual Desktop Infrastructure with Guaranteed Performance
E-cubed: End-to-End Energy Management of Virtual Desktop Infrastructure with Guaranteed Performance Xing Fu, Tariq Magdon-Ismail, Vikram Makhija, Rishi Bidarkar, Anne Holler VMWare Jing Zhang University
In-network Monitoring and Control Policy for DVFS of CMP Networkson-Chip and Last Level Caches
In-network Monitoring and Control Policy for DVFS of CMP Networkson-Chip and Last Level Caches Xi Chen 1, Zheng Xu 1, Hyungjun Kim 1, Paul V. Gratz 1, Jiang Hu 1, Michael Kishinevsky 2 and Umit Ogras 2
Energy management White paper. Greening the data center with IBM Tivoli software: an integrated approach to managing energy.
Energy management White paper Greening the data center with IBM Tivoli software: an integrated approach to managing energy. May 2008 2 Contents 3 Key considerations for the green data center 3 Managing
Green HPC - Dynamic Power Management in HPC
Gr eenhpc Dynami cpower Management i nhpc AT ECHNOL OGYWHI T EP APER Green HPC Dynamic Power Management in HPC 2 Green HPC - Dynamic Power Management in HPC Introduction... 3 Green Strategies... 4 Implementation...
Comparative performance test Red Hat Enterprise Linux 5.1 and Red Hat Enterprise Linux 3 AS on Intel-based servers
Principled Technologies Comparative performance test Red Hat Enterprise Linux 5.1 and Red Hat Enterprise Linux 3 AS on Intel-based servers Principled Technologies, Inc. Agenda Overview System configurations
Prefetch-Aware Shared-Resource Management for Multi-Core Systems
Prefetch-Aware Shared-Resource Management for Multi-Core Systems Eiman Ebrahimi Chang Joo Lee Onur Mutlu Yale N. Patt HPS Research Group The University of Texas at Austin {ebrahimi, patt}@hps.utexas.edu
A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems
A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems Anton Beloglazov, Rajkumar Buyya, Young Choon Lee, and Albert Zomaya Present by Leping Wang 1/25/2012 Outline Background
