Power Efficiency Metrics for the Top500. Shoaib Kamil and John Shalf CRD/NERSC Lawrence Berkeley National Lab




Power for Single Processors

HPC Concurrency on the Rise. [Chart: total number of processors across the Top15 systems for each Top500 list from Jun-93 through Jun-06; the vertical axis runs from 0 to 350,000 processors.]

HPC Power Draw on the Rise. [Chart: growth in power consumption (kW), excluding cooling, from Jun-00 through Jun-06; curves show the average power of the Top5 and the average power of the Top50; the vertical axis runs from 0 to 4,000 kW.]

Broad Objective. Use the Top500 list to track power efficiency trends. Raise community awareness of HPC system power efficiency. Push vendors toward more power-efficient solutions by providing a venue to compare their power consumption.

Specific Proposal. Require all Top500 sites to report system power consumption under a LINPACK workload, and establish rules of engagement to govern fair collection of the data. We must establish data collection procedures that make the data collection easy and have little impact on center operations. We wish to convince you that this can be done with minimal pain; if it cannot, we want to hear your input so that the rules can be drafted in a way that works for ALL Top500 respondents.

What do we mean by easy? We will try to show that one can measure power at a single node or cabinet and project the power consumption of the overall system. You do not have to take your entire system out of service to measure power under LINPACK: at the cabinet or node level, sample a few representative pieces running proportionally smaller copies of LINPACK and project to the full system scale. This also allows simpler, less expensive power measurement apparatus; a sketch of the projection follows.
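As a rough illustration of the projection described above, here is a minimal sketch in Python; the unit counts and wattages are made-up placeholders, not measurements from this study:

```python
def project_full_system_power(sample_watts: float, units_sampled: int,
                              units_total: int) -> float:
    """Project full-system power from a measurement of a few representative
    units (nodes or cabinets) running proportionally smaller LINPACK runs."""
    per_unit_watts = sample_watts / units_sampled
    return per_unit_watts * units_total

# Hypothetical example: two nodes measured at 560 W combined, 2,400-node system.
print(project_full_system_power(560.0, 2, 2_400))  # 672000.0 W, i.e. ~672 kW projected
```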

Many Ways to Measure Power.
Clamp meters. Pros: easy to use, no need to disconnect the test system, wide variety of voltages. Cons: very inaccurate when clamped around more than one wire.
Inline meters. Pros: accurate, easy to use, can output readings over serial. Cons: must disconnect the test system; limited voltage and current ratings.
Power panels / PDU panels. Unknown accuracy, must be read in person, and usually coarse-grained (unable to differentiate power loads). Sometimes the best or only option: they can give numbers for an entire HPC system.
Integrated monitoring in system power supplies (Cray XT). Pros: accurate, easy to use. Cons: only measures a single cabinet, and the power supply conversion efficiency must be known to project nominal power use at the wall socket.
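For inline meters that can report over serial, a simple logging loop suffices. The sketch below assumes the pyserial package and a meter that emits one watts reading per line as ASCII text; the port name and line protocol are hypothetical, not any particular meter's interface:

```python
import serial  # pyserial

def average_watts(port: str = "/dev/ttyUSB0", samples: int = 60) -> float:
    """Read one watts value per line from an inline power meter on a serial
    port and return the average over the sampling window."""
    readings = []
    with serial.Serial(port, baudrate=9600, timeout=2) as meter:
        for _ in range(samples):
            line = meter.readline().decode("ascii", errors="ignore").strip()
            if line:
                readings.append(float(line))
    return sum(readings) / len(readings) if readings else 0.0
```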

Testing our Methodology. Look at power usage using a variety of synthetic and real benchmarks: memory intensive (STREAM), CPU intensive (HPL/LINPACK), I/O intensive (IOZone, MADbench), and simulated workloads (NAS PB, NERSC SSP). Compare single node vs. cabinet/cluster vs. entire system. Is power consumed when running LINPACK similar to that of a real workload? Does power consumed by LINPACK change with concurrency? Can we predict full system power from cabinet/node power?

Single Node Tests: AMD Opteron. Highest power usage is 2x NAS FT and LU.

Similar Results when Testing Other CPU Architectures (Core Duo, AMD Opteron, IBM PPC970/G5). Power consumption is far less than the manufacturer-estimated nameplate power. Idle power is much lower than active power. Power consumption when running LINPACK is very close to the power consumed when running other compute-intensive applications.

Full System Test. [Chart: entire-system power usage in kilowatts (0 to 1,400 kW) over time for STREAM, HPL, and Throughput workloads; legend entries include no idle(), idle() loop, Catamount, and CNL.] Tests were run across all 19,353 compute cores. Throughput: a realistic NERSC workload composed of full applications. The idle() loop allows power saving on unused processors and is generally more efficient.

Single Rack Tests. [Chart: single-cabinet power usage in amps at 52 VDC (0 to 300 A) over time for STREAM, GTC, FT.D, LU.D, CG.D, PARATEC, PMEMD, and HPL.] An administrative utility gives rack DC amps and voltage. HPL and PARATEC have the highest power usage.

Modeling the Entire System: AC-to-DC Conversion. Power supplies in commodity desktop machines are ~75% efficient. Google uses new, 90%-efficient power supplies. Our test system has ~90%-efficient power supplies.
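To make the conversion concrete: projecting AC wall power from a DC measurement just divides by the supply's conversion efficiency. The wattage in this sketch is an invented example, not a measurement from the study:

```python
def wall_ac_power(dc_watts: float, psu_efficiency: float) -> float:
    """Project AC power drawn at the wall from measured DC power, given the
    power supply's AC-to-DC conversion efficiency."""
    return dc_watts / psu_efficiency

measured_dc = 10_400.0  # hypothetical cabinet DC measurement, watts
print(wall_ac_power(measured_dc, 0.90))  # ~11.6 kW at the wall (90% efficient supply)
print(wall_ac_power(measured_dc, 0.75))  # ~13.9 kW at the wall (75% efficient supply)
```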

Modeling the Entire System. [Chart: full-system AC power usage in watts (0 to 1.4 MW) for idle, STREAM, and HPL, comparing actual measured watts AC against a model of single-rack power times the number of racks at 90% and 80% power supply efficiency, with and without the estimated file system power.] The error factor is 0.05 if we assume 90% efficiency.
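The error factor quoted on this slide is the relative difference between the rack-extrapolated model and the measured full-system number. A minimal check, with invented numbers chosen only to illustrate the arithmetic:

```python
def model_full_system_ac(rack_dc_watts: float, num_racks: int,
                         psu_efficiency: float) -> float:
    """Model full-system AC power as single-rack DC power times the rack
    count, corrected for power supply conversion losses."""
    return rack_dc_watts * num_racks / psu_efficiency

modeled = model_full_system_ac(rack_dc_watts=11_000, num_racks=100, psu_efficiency=0.90)
actual = 1_160_000.0  # hypothetical wall-socket measurement, watts AC
error_factor = abs(modeled - actual) / actual
print(f"modeled = {modeled:.3e} W, error factor = {error_factor:.2f}")  # ~0.05
```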

Conclusions. Power utilization under an HPL/LINPACK load is a good estimator for power usage under mixed workloads for single nodes, cabinets/clusters, and large-scale systems; idle power is not, and nameplate and CPU power are not. LINPACK running on one node or rack consumes approximately the same power as that node would consume as part of a full-system parallel LINPACK job. We can estimate overall power usage by measuring a subset of the entire HPC system and extrapolating to the total number of nodes, using a variety of power measurement techniques, and the estimates mostly agree with one another. The disk subsystem is a small fraction of overall power (50-60 kW vs. 1,200 kW); disk power is dominated by spindles and power supplies, and idle power for disks is not significantly different from active power.

Top500 Power Data Collection. Measure system power while running LINPACK. If measured on a circuit for the full system (e.g., PDU or panel), then run LINPACK on the full system; if the system cannot be differentiated from other devices sharing the same circuit, then isolate components using a line meter or inductive clamp meter (which can often be borrowed from the local power company). If measured with a line meter or clamp meter, then run on just one rack, measure representative components comprising the system, and extrapolate to the entire system. If measured from an integral power supply, account for power supply losses in the projections. The target of the projections is the RMS AC wall-socket power consumed by the HPC system, so measurements of DC power consumption must be converted accordingly.

Top500 Power Data Collection. What to include: all components comprising the delivered HPC system aside from the external disk subsystem (e.g., SAN); one can extrapolate from measurements of those components while under a LINPACK load (even if the load is local). What to exclude: cooling; PDU and other power conversion infrastructure losses that are not part of the delivered HPC system; and the disk subsystem (if not integral), though this should be discussed further.

Complementary Efforts. Our effort creates a metric for compute-intensive parallel scientific workloads. Metrics for I/O-intensive workloads: JouleSort by HP Labs. Metrics for transactional workloads: EPA EnergyStar server metrics. Re-ranking of the Top500 for power efficiency: http://www.green500.org/

Single Node Tests: I/O. Power draw is highly variable and lower than for compute-only workloads; it is very difficult to assess power draw for I/O.

Modeling the Entire System: Disks. We must take the disk subsystem into account, and the drive model matters: Deskstar drives draw 9.6 W idle and 13.6 W under load; Tonka drives draw 7.4 W idle and 12.6 W under load. Using DDN-provided numbers, the estimated power draw for the modeled disk subsystem is 50 kW idle and 60 kW active. Observed using a PDU panel: ~48 kW idle.
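A back-of-the-envelope version of the disk model, using the per-drive figures from the slide. The drive count and the controller/power-supply overhead below are hypothetical values chosen only to reproduce roughly the 50 kW idle / 60 kW active estimates, not the actual DDN configuration:

```python
# Per-drive power from the slide (watts).
DRIVE_WATTS = {
    "Deskstar": {"idle": 9.6, "active": 13.6},
    "Tonka":    {"idle": 7.4, "active": 12.6},
}

def disk_subsystem_watts(model: str, n_drives: int, state: str,
                         overhead_watts: float) -> float:
    """Estimate disk subsystem power as spindle power plus a fixed
    controller/power-supply overhead (the overhead value is a guess)."""
    return n_drives * DRIVE_WATTS[model][state] + overhead_watts

# Hypothetical: 2,500 Deskstar-class drives plus 26 kW of overhead.
print(disk_subsystem_watts("Deskstar", 2_500, "idle",   26_000.0))  # 50000.0 W
print(disk_subsystem_watts("Deskstar", 2_500, "active", 26_000.0))  # 60000.0 W
```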