Sustainability and Energy Efficiency in Data Centers Design and Operation



Similar documents
Sustainability Issues in Digital Preservation

A Survey on Carbon Emission Management and Intelligent System using Cloud

Call for Papers for Next-Generation Networking and Internet Symposium

Stochastic Model Predictive Control for data centers

Energy Constrained Resource Scheduling for Cloud Environment

Towards a Flexible and Energy Adaptive Datacenter Infrastructure

Datacenter Efficiency

A Study on Service Oriented Network Virtualization convergence of Cloud Computing

DISC University of Minnesota Digital Technology Center Intelligent Storage Consortium

A Dynamic Resource Management with Energy Saving Mechanism for Supporting Cloud Computing

Energy Conscious Virtual Machine Migration by Job Shop Scheduling Algorithm

Green Computing Tutorial Prashant Shenoy, David Irwin

Expansion of Data Center s Energetic Degrees of Freedom to Employ Green Energy Sources

Infrastructure as a Service (IaaS)

Green Cloud Computing: Balancing and Minimization of Energy Consumption

Mobile Adaptive Opportunistic Junction for Health Care Networking in Different Geographical Region

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction

ENERGY EFFICIENT DATA CENTRES AND STORAGE. Peter James University of Bradford

An Introduction to Data Center Infrastructure Management

Willow: A Control System for Energy and Thermal Adaptive Computing

Research, innovation, and policy on energy efficient data centres: an EU perspective

Special Issue on Advances of Utility and Cloud Computing Technologies and Services

Introduction to Computer Networking: Trends and Issues

Introduction to IBM tools to manage energy consumption

Panel Session: Lessons Learned in Smart Grid Cybersecurity

Enhancing Data Center Sustainability Through Energy Adaptive Computing

5 Simple Steps to Maximizing the Business Value of Your Datacenter

Seize the benefits of green energy and the smart grid

DESIGN CHALLENGES OF TECHNOLOGY SCALING

System Models for Distributed and Cloud Computing

Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing

Cloud Computing and Software Agents: Towards Cloud Intelligent Services

Energy Adaptive Mechanism for P2P File Sharing Protocols

Sustainable Data Centers: Enabled by Supply and Demand Side Management

Efficient and Enhanced Load Balancing Algorithms in Cloud Computing

Efficient Data Replication Scheme based on Hadoop Distributed File System

Data Centers That Deliver Better Results. Bring Your Building Together

SMART GRID AN INTRODUCTION

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS

An Overview on Important Aspects of Cloud Computing

Security of Web Applications and Browsers: Challenges and Solutions

Zen Internet Case Study

present: Assistant Professor, Foster Faculty Fellow Michael G. Foster School of Business, University of Washington

Mobile Multimedia Meet Cloud: Challenges and Future Directions

Graduated Student: José O. Nogueras Colón Adviser: Yahya M. Masalmah, Ph.D.

Power efficiency and power management in HP ProLiant servers

Cisco EnergyWise and CA ecosoftware: Deliver Energy Optimization for the Data Center

CURTAIL THE EXPENDITURE OF BIG DATA PROCESSING USING MIXED INTEGER NON-LINEAR PROGRAMMING

Experimental Evaluation of Energy Savings of Virtual Machines in the Implementation of Cloud Computing

New Jersey Big Data Alliance

Data Center Efficiency in the Scalable Enterprise

HP Data Center Efficiency For The Next Generation. Pete Deacon Canadian Portfolio Manager Power and Cooling Services

Advances in Smart Systems Research : ISSN : Vol. 3. No. 3 : pp.

Grid Computing Vs. Cloud Computing

Sustain and Gain Because Going Green Makes Good Business Sense IBM Corporation IT Optimization 6-Dec-07 1

IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper

Energy efficiency, a new focus for general-purpose

Big Data R&D Initiative

Please give me your feedback

Keynote 1. Scale and Programmability in Google s Software Defined Data Center WAN. Amin M. Vahdat University of California San Diego

Power Reduction Techniques in the SoC Clock Network. Clock Power

Managing Data Center Power and Cooling

Keynote Speaker, 2015 IEEE Symposium on Computational Intelligence in Cyber Security (CICS 2015), Cape Town, South Africa, December 7-10, 2015.

3D On-chip Data Center Networks Using Circuit Switches and Packet Switches

Smarter Data Centers. Integrated Data Center Service Management

How High Temperature Data Centers & Intel Technologies save Energy, Money, Water and Greenhouse Gas Emissions

Multi-dimensional Affinity Aware VM Placement Algorithm in Cloud Computing

Challenges In Intelligent Management Of Power And Cooling Towards Sustainable Data Centre

A Systems of Systems. The Internet of Things. perspective on. Johan Lukkien. Eindhoven University

Data Center Infrastructure Management. optimize. your data center with our. DCIM weather station. Your business technologists.

Heterogeneous Sensor System on Chip

IMPLEMENTATION OF VIRTUAL MACHINES FOR DISTRIBUTION OF DATA RESOURCES

Near Sheltered and Loyal storage Space Navigating in Cloud

Data Center Equipment Power Trends

Managing Smart Grid Information in the Cloud: Opportunities, Model, and Applications

Curriculum Vitae RESEARCH INTERESTS EDUCATION. SELECTED PUBLICATION Journal. Current Employment: (August, 2012 )

Data Center Specific Thermal and Energy Saving Techniques

Big-Data Computing with Smart Clouds and IoT Sensing

An Architecture Model of Sensor Information System Based on Cloud Computing

DataCenter 2020: first results for energy-optimization at existing data centers

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

9700 South Cass Avenue, Lemont, IL URL: fulin

An Empirical Approach - Distributed Mobility Management for Target Tracking in MANETs

How To Understand Cloud Computing

European Code of Conduct for Data Centre. Presentation of the Awards 2013

Avoiding Overload Using Virtual Machine in Cloud Data Centre

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Load Balancing and Maintaining the Qos on Cloud Partitioning For the Public Cloud

David G. Belanger, PhD, Senior Research Fellow, Stevens Institute of Technology, New Jersey, USA Topic: Big Data - The Next Phase Abstract

ENHANCING MOBILE PEER-TO-PEER ENVIRONMENT WITH NEIGHBORHOOD INFORMATION

Randal E. Bryant. Education. Employment

Green Computing. What is Green Computing?

Dynamic Scheduling and Pricing in Wireless Cloud Computing

Communication and Embedded Systems: Towards a Smart Grid. Radu Stoleru, Alex Sprintson, Narasimha Reddy, and P. R. Kumar

Device-centric Code is deployed to individual devices, mostly preprovisioned

Effect of Rack Server Population on Temperatures in Data Centers

Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology

Datacenters are at the center of businesses

How To Build A Cloud Computer

Transcription:

Sustainability and Energy Efficiency in Data Centers Design and Operation Krishna Kant Intel and GMU krishna.kant@intel.com David Du University of Minnesota du@cs.umn.edu Abstract The tutorial will provide a comprehensive overview of how to make data center design and operation more sustainable using several mechanisms: (a) smart management of power/thermal/cooling at all levels, (b) replacing overdesign at all levels with leaner designs coupled with smarter control, and (c) enabling the use of variable sources of renewable energy. The tutorial will provide a comprehensive coverage of these issues starting with the basics of data center architecture and power/thermal management and illustrate ideas with an example of how to adapt operations in the face of varying energy supply and demands. The tutorial shall also expose a number of challenges in making data centers more sustainable. 1 Basic Information Speaker: Dr. Krishna Kant Aims/Learning objectives; 1. Elucidate sustainability issues and understand challenges in sustainable design and operation of data centers. 2. Discuss need for holistic and coordinated treatment of major power and thermal issues in data centers from server components (CPU, memory, NIC,...) to data center wide management. 3. Introduce energy adaptive computing (EAC) as a paradigm to address sustainability issues and exploitation of renewable energy in data centers. Illustrated with a concrete example of dynamic adaptation in an environment with varying energy supply and demands. Duration: Half day (about 3.5 hours). Content and Audience 1. Target audience: Researchers, engineers, and graduate students interested in understanding the emerging sustainability and energy efficiency issues at all levels of data centers. 2. Content Level & Prerequisites: Much of the material is at intermediate level suitable for those who have a working knowledge of computer systems. In particular, it is assumed that the attendees possess basic knowledge in computer architecture, operating systems, networking, and performance evaluation. 1

Relevance to HPCA: The area of power/thermal management from chips to entire data centers has been of direct interest to HPCA attendees for some time. However, as the need for data centers continues to expand and both the IT and electricity infrastructures involve integration of highly distributed and heterogenous facilities, it is becoming critical to account for a variety of sustainability issues as well. Some of these issue include impact of variable energy sources for operating data centers, increasingly disruptive climate related events, and rapidly expanding life-cycle energy footprint of the IT infrastructure. This tutorial attempts to fill this gap. Since sustainability is a rather new and immature field, the tutorial also envisions to spur thought processes that would hopefully lead to more research and development into various aspects of making data centers more enviornmentally friendly while continuing to support increasingly higher computational capabilities. Tutorial History: An earlier version of the tutorial was presented at the US Supercomputing conference held in Portland, OR in Nov 2009. This version was focused only on power/thermal management in data centers. The tutorial was attended by about 60 people and was very well received. A more recent version was presented at the International Supercomputing Conference (ISC) in Hamburg, June 2011. It was attended by about 30 people and was again well received. The ISC version significantly enhanced the presentation based on author s work on energy adaptive computing and brought in the sustainability focus to the tutorial. This version is available at www.kkant.net/tutorials/ The version to be presented at HPCA will further enhance this with the issues related to interdependence between IT and energy infrastructures. The presentation will also be tuned further based on the needs of HPCA and additional work in the literature. 2 Tutorial Description Data centers form the backbone for supporting increasingly sophisticated computing required at all levels from small mobile devices involved in social networking to large scale problem solving such as drug discovery or climate modeling. The increasing computing power results in increasing energy usage, which has a number of direct and indirect impacts starting with the unsustainable power density at the chip level to large carbon footprint and natural resource usage at the data center level. The purpose of this tutorial is to present a holistic view of issues of energy efficiency and sustainability in this context. Since the need for energy efficiency is well recognized by now, this tutorial will focus on bringing in the larger sustainability issues and show how they can be considered in the design and operation of data centers. The tutorial will start with an overview of architecture of data centers and point out various sources of energy wastage in data centers including power conversion, distribution and delivery, cooling and thermal management, semiconductor related power losses, and software inefficiencies. Continuing with the background, the tutorial will then address basics of power management including power states from individual components to the system level and their characteristics. It will then introduce three basic mechanisms to manage power: (a) use of sleep states at various levels of granularity, (b) use of active lowpower states, and (c) width control, i.e., controlling number of active copies of a component type. The tutorial will then consider power/thermal management at data center level and discuss issues in cooling, airflow management, power delivery and distribution, etc. and corresponding modeling issues. Next, the it will address the larger sustainability issues including overdesign & overprovisioning of infrastructure 2

at all levels, water wastage in chiller plants, impact of large power draw from the grid, etc. The tutorial will show how data center design could be made more sustainabile by substituting smarter control for overdesign, minimizing use of natural resources, and using locally produced renewable energy. It will illustrate the ideas with a concrete example of adaptation to varying demands and energy availability and expose the challenges of adaptation in various environments. The final part of the tutorial will consider the challenges brought about by three emerging phenomena: (a) increasingly disruptive and larger scale events, likely triggered by climate change, (b) increasingly the large integration of IT infrastructures (e.g., geographically distributed data centers that cooperate to provide ubiquitous cloud computing capabilities), and (c) wide-scale distributed energy generation integrated into the electric grid. The tutorial will elaborate on the availability and reliability challenges brought about by this combination and some ways to address them. The rough outline of the tutorial is as follows: 1. Architectural overview of data centers including servers, storage, data center network, inband/outof-band management infrastructure, cooling and power distribution infrastructure. This overview will also point out sources of energy wastage and their causes at all levels. 2. Basic relationships between voltage, speed, power, temperature, and cooling. Power management basics including active and passive power states at various levels and basic power/thermal management algorithms. Power measurement and estimation issues, especially for virtualized environments. 3. Power delivery and cooling: Smart power supplies and smart VRs (voltage regulators) and challenges in their control and corresponding performance issues. Rack/Chassis level power delivery, power capping and emergency rideout issues. Basic thermal and cooling modeling, coordinated cooling and power control, cooling aware task scheduling, and management plane issues for servers. 4. Sustainability Issues in data center design: overdesign and its impact, Energy and thermal adaptive computing paradigm to enable frugal designs to provide required QoS and use variable sources of (renewable) energy. Illustration of energy adaptive computing using a concrete example with varying supply and demands. 5. A discussion of challenges in adaptation for various components and in various environment including (a) adaptation of computing, storage and network infrastructure, (b) end-to-end adaption (client, network, data-center), and (c) adaptation in cloud computing environment. The purpose of this part is provide food-for-thought to spur further research in the area. 6. A discussion of impact of distribued energy generation and its use to power data centers and other IT infrastructures. It will be shown that the variability of renewable sources makes a distributed power network more prone to blackouts and load shedding, and tolerating large scale disruptions requires special considerations in designing and operating a collection of data centers. 3 Speaker Biography Krishna Kant received his Ph.D. degree in Computer Science from The University of Texas at Dallas in 1981. Since May 1997- April 2008 he worked at Intel Corp where he worked on a variety of Internet and data center issues, especially power and thermal issues at various levels. Since April 2008 he has been on the staff of National Science Foundation (NSF), where he adminsters the computer systems 3

research (CSR) program in the CISE directorate. He also represents CISE in the emerging NSF wide sustainability programs. He is currently on a research professor position with George Mason University, Fairfax VA where he works on cloud computing security issues. From 1992-97, he was with Network Services Performance and Control group at Bellcore, Red Bank, NJ. Prior to 1992 he was he was an associate professor of computer science at the Pennsylvania State University, University Park, PA. He is the author of a highly regarded book titled Introduction to Computer System Performance Evaluation (New York: McGraw-Hill, 1992). He has published extensively in conferences and journals in a variety of areas of computer science and engineering. His current areas of research interest include energy efficient and sustainable computing, cloud computing security, and robustness in the Internet. David Du received a Ph.D. degree from University of Washington (Seattle) in 1981. He joined University of Minnesota as a faculty since 1981. He is currently the Qwest Chair Professor of Computer Science and Engineering at University of Minnesota, Minneapolis. He has served as a Program Director (IPA) at National Science Foundation CISE/CNS Division from March 2006 to September 2008. Dr. Du has a wide range of research expertise including multimedia computing, mass storage systems, high-speed networking, sensor networks, cyber security, high-performance file systems and I/O, database design, and CAD for VLSI circuits. He has authored and co-authored over 200 technical papers including 100 referred journal publications in these research areas. Dr. Du is an IEEE Fellow (since 1998) and a Fellow of the Minnesota Supercomputer Institute. He is currently serving on the Editorial Boards of several international journals. He has served as conference chair and program committee chair for a number of conferences in the past. Most recently he is the Conference Chair for IEEE Symposium on Security and Privacy 2009, Program Co-Chair for International Conference on Parallel Processing 2009 and Conference General Chair of IEEE Conference on Distributed Computing Systems 2011. 4 Tutorial Resources The tutorial will be based on the extensive experience of the authors in power/thermal issues and the extensive literature on the subject. Some of the work of authors relevant to the tutorial is listed below: 1. K. Kant, Sustainability Considerations in Network Design and Operation, invited talk at ICCCN 2011, Maui, Hawaii, Aug 2011. 2. K. Kant, M. Murugan and D. Du, Willow: A Control System for Energy and Thermal Adaptive Computing, Proc. of IPDPS 2011 3. K. Kant, Multistate Power Management of Communication Links, Proc. of COMSNET 2010. 4. K. Kant, A Control Scheme for Batching DRAM Requests to Improve Power Efciency, Accepted for Sigmetrics 2011. 5. K. Kant, N. Udar, R. Viswanathan, Localization and Location Based Services in Data Centers, IEEE Network magazine, Nov 2008. 6. K. Kant, Data Center Evolution A Tutorial on State of the Art, Issues, and Challenges, Elsevier Networks Journal, Dec 2009. 7. N. Manadagere and D. Du Data Center Power Management, submitted to ACM Computing Survey 4

8. N. Mandagere, J. Diehl and D. Du GreenStor: Application-Aided Energy-Efficient Storage, Proc. of IEEE Mass Storage Systems and Technology 2007 9. D. Du Recent Advancement and Future Challenges of Storage Systems, Proceedings of IEEE, Vol. 96, No. 11, pp. 1875-1886, 2008 (invited paper) 5