Introduction to IBM tools to manage energy consumption




Emerging Topics Day (Journée Thématique Emergente), EDF Clamart, 13 January 2011
Energy Aspects of Computing (Les aspects énergétiques du calcul)
Introduction to IBM tools to manage energy consumption
François Thomas, Luigi Brochard, [ft,luigi.brochard]@fr.ibm.com

Agenda
- Why IBM?
- The Power Cycle and Equation
- Power7 EnergyScale
- Software to Manage Power
- Summary


Why IBM?
- Early in the game
- Consistently at the top of the Green500 list
- Now a strong selling point

Why IBM?
- Early in the game
  - Blue Gene/L design started in 1999-2000
  - First appearance in the Top500 list in 2004
  - Before the first Green500 list was out (2005)
- Consistently at the top of the Green500 list
  - Blue Gene/P: 2007
  - Cell Broadband Engine (RoadRunner) + BG/P: 2008-2009
  - Blue Gene/Q: 2010 onwards
  - SuperMUC: 2011?
- Our expertise in energy efficiency is now a strong selling point
  - Purpose-built (Blue Gene)
  - Accelerated computing (RoadRunner, GPU)
  - COTS hardware (Intel based)

Sources: green500.org and hpcwire.com

LRZ + IBM Germany
Smart System Cooling: innovative hot water usage
- First high-end HPC system with hot water cooling
- Compute nodes are cooled with hot water
- Inlet temperature up to 45 °C
- Enables all-year free cooling in Garching
- Aquasar prototype
- No cooling aggregates (compressors) required
- Enables re-use of the system's waste heat (heating or process energy)
- Developed in Germany, at the IBM Böblingen Lab
Smarter Systems for a Smarter Planet. 2010 IBM Germany GmbH

LRZ + IBM Germany
Smart Job Scheduling: energy-aware application scheduling and system management
- First implementation of an energy-aware HPC software stack on x86
- Application energy consumption will be monitored, stored and reported to the user
- For a second run of an application, the scheduler will decide, based on administrative policies, which processor frequency is optimal for the application
- A lower frequency reduces energy consumption
- System nodes that are not currently in use will be put into sleep mode or shut down, based on the administrator's capacity expectations


The power cycle: power, compute and cool
[Figure: data center power and cooling chain. Power side: utility provider (2 sources), generators (N+1) with fuel oil for 48 hours typical, uninterruptible power supply with batteries for 10-15 min, static switches A and B, PDUs A and B, servers on a raised floor. Cooling side: CRAC units, chillers (N+1), cooling towers, makeup water storage. Labelled temperatures: water at 45F, 55F, 85F and 95F; air at 55F and 75F.]

Green Datacenter Market Drivers and Trends
- Increased green consciousness, and rising cost of power
- IT demand outpaces technology improvements
  - Server energy use doubled 2000-2005; expected to increase 15%/year
  - 15% power growth per year is not sustainable
  - Koomey study: servers use 1.2% of U.S. energy
- ICT industries consume 2% of worldwide energy
  - Carbon dioxide emissions comparable to global aviation
- Future datacenters dominated by energy cost; half of the energy spent on cooling
- Real actions needed
Sources: Brouillard, APC, 2006; IDC 2006, Document #201722, "The impact of Power and Cooling on Datacenter Infrastructure", John Humphreys, Jed Scaramella
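As a rough check on the growth figures above, the following sketch computes how fast a quantity compounds at a constant annual rate. The function names are illustrative, and the constant-rate assumption is a simplification rather than something the slide states.

```python
import math


def years_to_double(annual_growth: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


def growth_factor(annual_growth: float, years: int) -> float:
    """Multiplicative growth after the given number of years."""
    return (1.0 + annual_growth) ** years


if __name__ == "__main__":
    rate = 0.15  # the 15%/year figure quoted on the slide
    print(f"Doubling time at 15%/year: {years_to_double(rate):.1f} years")
    print(f"Growth over 5 years:  x{growth_factor(rate, 5):.2f}")
    print(f"Growth over 10 years: x{growth_factor(rate, 10):.2f}")
```

At 15% per year the doubling time comes out just under five years, which is in line with the observed doubling between 2000 and 2005.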

How much does it cost? Acquisition costs vs energy costs over 4 years
[Figure: two charts, "Ratio of Power" (IT power, cooling power) and "Ratio of Costs" (acquisition costs, energy costs); the values are not recoverable from the transcription.]

Our approach is at multiple levels
- Micro-electronics
  - Energy is pervasive in IBM design (especially in our journey to Exascale)
  - Long history of energy-efficient designs: SOI, SMT, eDRAM, ...
- Server and rack level
  - Energy management features on all recent IBM servers
  - Water cooling: rear door heat exchangers (iDataPlex), "cold plate" (Power6, Power7, BG/Q)
  - Hot water cooling (LRZ)
- Software level
  - Application tuning
  - Unified software for power management
  - Cluster management
  - Power and energy aware job schedulers
- Data center level
  - Centres of expertise in datacenter design
  - Example: the Green Data Center in IBM Montpellier, France
  - Another example: hot water cooling at IBM Boeblingen, Germany
  - Best practices, monitoring

The Power Problem
- Power = Capacitance * Voltage^2 * Frequency => Power ~ Frequency^3 (since voltage scales roughly with frequency) => two cores at 80% frequency consume as much as one core at 100% frequency.
- We have a frequency problem: power per chip is constant due to cooling => multicores at constant frequency
- And we have a passive power problem: smaller lithography => more leakage current => more idle power
[Figure: module heat flux (W/cm^2) versus year, 1950 to 2030, with labels Junction Transistor, Integrated Circuit, Bipolar, CMOS, Low Power Multicore and 3DI.]
[Figure: number of transistors versus year, 1980 to 2010, on a log scale from 1.0E+05 to 1.0E+10 with 1 million and 1 billion marked; ~50% CAGR.]
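The two-cores-at-80% claim can be checked with a few lines of arithmetic. This is a minimal sketch that assumes dynamic power scales with the cube of frequency, as the slide states, and that throughput scales linearly with cores and frequency (an idealisation that ignores static power and parallel overhead). The function names are illustrative.

```python
def relative_dynamic_power(cores: int, rel_freq: float) -> float:
    """Dynamic power relative to one core at nominal frequency, assuming P ~ f^3."""
    return cores * rel_freq ** 3


def relative_throughput(cores: int, rel_freq: float) -> float:
    """Throughput relative to one core at nominal frequency, assuming perfect scaling."""
    return cores * rel_freq


if __name__ == "__main__":
    one_core = relative_dynamic_power(1, 1.0)
    two_cores = relative_dynamic_power(2, 0.8)
    print(f"1 core  @ 100%: power x{one_core:.3f}, throughput x{relative_throughput(1, 1.0):.1f}")
    print(f"2 cores @  80%: power x{two_cores:.3f}, throughput x{relative_throughput(2, 0.8):.1f}")
```

Under these assumptions two cores at 80% frequency draw about 2% more dynamic power than one core at full frequency while offering up to 1.6 times the throughput, which is the argument for multicore designs at constant frequency.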

Passive Power continues to explode
- Oxide thickness is near the limit.
- Traditional CMOS scaling has ended.
- Density improvements will continue, but power efficiency from technology will only improve very slowly.
- The historic trend of power efficiency improvement will slow.


POWER7 Processor
- IBM's 45nm SOI process: 567 mm², 1.2B transistors
- 8 out-of-order cores, 4-way SMT
- 32KB L1 D/I, 256KB L2 per core, 32MB shared L3 in IBM's eDRAM process
- 2 on-chip memory controllers, 2 pairs of buffered memory channels each
- Designed for blades, commercial SMPs, supercomputers
- 4X cores in a similar power envelope
- Designed for energy efficiency and effective power management

Thermal, Power and Activity Sensors
- 44 digital thermal sensors on chip (5 per chiplet, 4 extra-chiplet); max chiplet thermal sensor(s) also directly available to firmware
- On-board ambient temperature sensor, memory buffer/DIMM thermal sensors and VRM thermal-trip logic
- On-board measurement circuits and A/D channels for full system, processor socket, memory sub-system, I/O sub-system and fan power measurements
- Performance/activity sensors
  - Core-level usage with active cycle counts, instruction throughput counts
  - Core-level memory hierarchy usage: event-based programmable weight counters for frequency impact at high loads
  - Memory controller-level activity: requests and power-mode usage stats

Rack to Rack: Power 755 Compared to Power 575 (POWER6)

                          Power 755   Power 575
Cores/chip                8           4
Total cores               32          32
Frequency                 3.3 GHz     4.7 GHz
Memory (max)              256 GB      256 GB
Cooling                   Air         Water
Cores/rack                320         448
Rack type                 19          24
Power (Watts) (Linpack)   1650        5400

Each Power 755 node offers the same core count as Power 575 with:
- 40-50% improvement in performance
- Air cooling vs. water cooling
- 1/3 of the energy consumption
- 37% improvement in floor space for a 64 node configuration
Green500: ~495 MFlops/Watt
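The "1/3 of the energy consumption" statement follows directly from the Linpack power figures in the comparison. The sketch below redoes that arithmetic; the per-core and per-rack numbers are derived values that are not on the slide, and the rack figure counts node power only, ignoring switches, fans and other rack-level overhead.

```python
# Figures taken from the comparison above; everything else is derived arithmetic.
NODES = {
    "Power 755": {"cores_per_node": 32, "cores_per_rack": 320, "watts_per_node": 1650},
    "Power 575": {"cores_per_node": 32, "cores_per_rack": 448, "watts_per_node": 5400},
}


def node_power_ratio(a: str, b: str) -> float:
    """Linpack power of node type a divided by that of node type b."""
    return NODES[a]["watts_per_node"] / NODES[b]["watts_per_node"]


def watts_per_core(name: str) -> float:
    """Node Linpack power divided by the node's core count."""
    node = NODES[name]
    return node["watts_per_node"] / node["cores_per_node"]


def rack_node_power_watts(name: str) -> int:
    """Sum of node Linpack power for a full rack (node power only)."""
    node = NODES[name]
    nodes_per_rack = node["cores_per_rack"] // node["cores_per_node"]
    return nodes_per_rack * node["watts_per_node"]


if __name__ == "__main__":
    print(f"Node power ratio 755/575: {node_power_ratio('Power 755', 'Power 575'):.2f}")
    for name in NODES:
        print(f"{name}: {watts_per_core(name):.1f} W/core, "
              f"{rack_node_power_watts(name)} W of node power per full rack")
```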

IBM EnergyScale functions
- Power / Thermal Trending: collect and report power consumption, inlet and exhaust temperature
- Power Capping
  - Guaranteed (Hard Cap): enforces a power cap via Dynamic Frequency and Voltage Slewing
  - Soft Power Cap: attempts a lower cap, but it is not guaranteed
- Energy Management Modes (enhanced for P7)
  - Static Power Save (SPS): saves power via a fixed voltage and frequency drop, as much as 30% down for P7
  - Dynamic Power Save (DPS): optimizes power vs performance using Dynamic Voltage and Frequency Slewing
    - Will provide a performance boost at very high utilization
    - Will save power at most utilizations
  - Dynamic Power Save - Favor Performance (DPS-FP)
    - Will provide a performance boost at most utilizations
    - Will save power only at very low utilization
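To make the difference between a guaranteed cap and a soft cap concrete, here is a toy sketch of capping by frequency slewing. It is not the EnergyScale firmware algorithm: the `ToyServer` power model (static power plus a term proportional to the cube of frequency), the wattages, the step size and the minimum frequency are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyServer:
    """Hypothetical power model: static power plus a dynamic term ~ f^3."""
    static_watts: float = 600.0
    dynamic_watts_at_nominal: float = 1000.0
    rel_freq: float = 1.0  # fraction of nominal frequency

    def power(self) -> float:
        return self.static_watts + self.dynamic_watts_at_nominal * self.rel_freq ** 3


def slew_down_to_cap(server: ToyServer, cap_watts: float,
                     step: float = 0.05, min_freq: float = 0.5) -> bool:
    """Lower the frequency step by step until the cap is met or the floor is hit.

    Returns True if the cap is met. A cap above the power drawn at min_freq can
    always be met (the 'guaranteed' case); a lower cap can only be attempted
    (the 'soft' case).
    """
    while server.power() > cap_watts and server.rel_freq > min_freq:
        server.rel_freq = max(min_freq, round(server.rel_freq - step, 2))
    return server.power() <= cap_watts


if __name__ == "__main__":
    for cap in (1300.0, 700.0):
        server = ToyServer()
        met = slew_down_to_cap(server, cap)
        print(f"cap={cap:.0f} W -> freq={server.rel_freq:.2f}, "
              f"power={server.power():.0f} W, cap met: {met}")
```

In this toy model the 1300 W cap is reachable by slowing down, while the 700 W cap lies below the power drawn at the minimum frequency, so it can only be attempted.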

High Level System Power Control View
[Figure: block diagram linking PHYP, the TPMD, the P7 chip, memory, I/O and fans. Labels: architected idle instructions (Doze, Nap, ...); policy and feedback communication interface; sensor information (temperature, current, performance); Modes 1 and 2A: idle state; Modes 2B, 3, 4, 5: P-state.]

Cooperative Power Management in EnergyScale
[Figure: layered diagram. Components: system monitoring and management tools (Active Energy Manager), FSP, operating systems, hypervisor, TPMD, off-chip/on-board sensors and controls, POWER7. Annotations: real-time power/thermal control with policy-guided, performance-aware energy saving algorithms; dynamic resource folding and any explicit low-power mode control; mechanisms access, low-level coordination among controllers, in-band/out-of-band communication channel, autonomous/configurable control engines, sensors.]


Some examples
- IBM Active Energy Manager (AEM)
  - Monitors the power consumption at the node/rack level
  - Manages the power consumption (capping, trending, provisioning)
- IBM Research tools
  - Much higher sampling rates than AEM
  - Can separate CPU power, RAM power, other power
  - Down to every VRM on a motherboard
- Cluster management tool
  - Extension to xCAT (Extreme Cluster/Cloud Administration Toolkit)
  - To query and set power states
- Job scheduler
  - Extension to LoadLeveler
  - Power and energy aware job scheduling function

IBM Systems Director Active Energy Manager (AEM)
- Monitoring energy in a data center lets you begin to manage it
- AEM is a cornerstone of the IBM energy management framework
- Measure, monitor, and control energy usage
- Power and thermal measurement
  - Supports System x, POWER, and System z natively
  - Supports other equipment via external sensors
- Integrates with infrastructure management
- Integrates with enterprise management

IBM Systems Director Active Energy Manager V4.2
- AEM application supported on Windows, AIX, and Linux (x86, POWER, and System z)
- Web-based user interface requiring only a browser
- Energy thresholding
  - Enables a user to set an energy or temperature threshold and be notified when it is reached (or allow an action to be taken automatically)
- Soft power capping (an option within power capping)
  - Ability to set a lower energy cap value to enable clients to save energy
  - Easily set power caps on multiple systems
- Group capping (an option within power capping)
  - Enables a user to set an energy cap for a group of servers (such as all the servers in a rack)
- Data to aid in server power on/off scenarios
  - Understand time to IPL and standby power
  - Number of lifetime IPLs and reliability threshold (P7 only)

xCAT
- Manage power consumption on an ad hoc basis
  - For example, while the cluster is being installed, or when there is high power consumption in other parts of the lab for a period of time
- Query: power saving mode, power capping value, power consumed info, CPU usage, fan speed, environment temperature
- Set: power saving mode and power capping value

Power and Energy Aware LoadLeveler
- Goals
  - Identify idle nodes in the cluster and put them in the lowest power mode
  - Give system admins a query capability on the historical usage of power and energy by workload, user, etc.
  - Reduce the energy consumption of workloads with minimal impact on performance
- Choices for the system admin
  - Decide whether or not to use the Energy Optimize policy on the system
  - Decide the maximum performance degradation an application may suffer if the Energy policy is applied
  - If the Energy policy is on, it is applied only to jobs that match the performance degradation criteria
  - The system admin can query the LL DB to evaluate the impact of the potential policy on performance degradation and energy saving
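The policy choice described above can be sketched as a small selection problem: among the frequencies whose slowdown stays within the administrator's limit, pick the one with the lowest energy. This is an illustration only, not LoadLeveler code or its API; the `Profile` type, the `pick_frequency` function and all the sample numbers are hypothetical.

```python
from typing import NamedTuple, Sequence


class Profile(NamedTuple):
    """Hypothetical per-frequency measurements from an earlier run of a job."""
    freq_ghz: float
    runtime_s: float
    avg_power_w: float

    @property
    def energy_j(self) -> float:
        return self.runtime_s * self.avg_power_w


def pick_frequency(profiles: Sequence[Profile], max_degradation: float) -> Profile:
    """Return the lowest-energy profile within the allowed slowdown.

    max_degradation is a fraction, e.g. 0.10 allows a 10% longer runtime
    than the fastest profile.
    """
    fastest = min(p.runtime_s for p in profiles)
    eligible = [p for p in profiles if p.runtime_s <= fastest * (1.0 + max_degradation)]
    return min(eligible, key=lambda p: p.energy_j)


if __name__ == "__main__":
    # Made-up sample data for one application.
    history = [
        Profile(3.3, 1000.0, 1650.0),
        Profile(3.0, 1050.0, 1450.0),
        Profile(2.7, 1130.0, 1300.0),
        Profile(2.4, 1250.0, 1200.0),
    ]
    for limit in (0.0, 0.10, 0.15):
        best = pick_frequency(history, limit)
        print(f"max degradation {limit:.0%}: run at {best.freq_ghz} GHz, "
              f"{best.energy_j / 1e6:.2f} MJ")
```

With this sample data, the lowest frequency is not the most energy-efficient choice, because the longer runtime outweighs the lower power draw. That is why the decision needs measurements from a previous run.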

Summary
- IBM started working on the energy consumption of its servers early, having been hurt by it.
- Energy management is pervasive in IBM server design, from chips to servers to clusters to datacenters, and even more so with the trend to Exascale.
- Good energy management can be a key differentiator in some HPC deals.
- We try to tackle the problem at various levels: chip design, system design, cluster management software, job schedulers.
- We have monitoring tools that work across the whole IBM portfolio of servers, whatever the microprocessor architecture (IBM or Intel) or the form factor (rackable servers, blades, integrated racks).
- Using those tools, our customers can save quite a lot on their energy bill.

Thank you. Questions?