Building an energy dashboard. Energy measurement and visualization in current HPC systems



Similar documents
Part I Courses Syllabus

Power and Energy aware job scheduling techniques

Performance Counter. Non-Uniform Memory Access Seminar Karsten Tausche

MAQAO Performance Analysis and Optimization Tool

D5.6 Prototype demonstration of performance monitoring tools on a system with multiple ARM boards Version 1.0

Measuring Energy and Power with PAPI

Evoluzione dell Infrastruttura di Calcolo e Data Analytics per la ricerca

Jezelf Groen Rekenen met Supercomputers

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

Performance of Software Switching

Parallel Programming Survey

1 Bull, 2011 Bull Extreme Computing

STUDY OF PERFORMANCE COUNTERS AND PROFILING TOOLS TO MONITOR PERFORMANCE OF APPLICATION

Multi-Threading Performance on Commodity Multi-Core Processors

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing

HPC in Oil and Gas Exploration

The High Performance Internet of Things: using GVirtuS for gluing cloud computing and ubiquitous connected devices

HPC and Big Data. EPCC The University of Edinburgh. Adrian Jackson Technical Architect

Exascale Challenges and General Purpose Processors. Avinash Sodani, Ph.D. Chief Architect, Knights Landing Processor Intel Corporation

Building a Top500-class Supercomputing Cluster at LNS-BUAP

FLOW-3D Performance Benchmark and Profiling. September 2012

Keys to node-level performance analysis and threading in HPC applications

Pedraforca: ARM + GPU prototype

Intel Xeon Processor E5-2600

HPC performance applications on Virtual Clusters

Hardware performance monitoring. Zoltán Majó

Performance with the Oracle Database Cloud

Big Data. Value, use cases and architectures. Petar Torre Lead Architect Service Provider Group. Dubrovnik, Croatia, South East Europe May, 2013

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Performance Analysis and Optimization Tool

Big Data Management in the Clouds and HPC Systems

Optimizing Shared Resource Contention in HPC Clusters

NVIDIA Tools For Profiling And Monitoring. David Goodwin

End-user Tools for Application Performance Analysis Using Hardware Counters

22S:295 Seminar in Applied Statistics High Performance Computing in Statistics

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems

High Performance Computing. Course Notes HPC Fundamentals

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/ CAE Associates

The Top Six Advantages of CUDA-Ready Clusters. Ian Lumb Bright Evangelist

Scientific Computing Data Management Visions

Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o

Manjrasoft Market Oriented Cloud Computing Platform

Release Notes for Open Grid Scheduler/Grid Engine. Version: Grid Engine

Infrastructure Matters: POWER8 vs. Xeon x86

~ Greetings from WSU CAPPLab ~

Software Performance and Scalability

The path to the cloud training

Basics of VTune Performance Analyzer. Intel Software College. Objectives. VTune Performance Analyzer. Agenda


MPI / ClusterTools Update and Plans

Operating System Support for Multiprocessor Systems-on-Chip

Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

HPC & Big Data THE TIME HAS COME FOR A SCALABLE FRAMEWORK

Overview of HPC Resources at Vanderbilt

CHAPTER 1 INTRODUCTION

Maintaining Non-Stop Services with Multi Layer Monitoring

Enabling Technologies for Distributed Computing

The Benefits of POWER7+ and PowerVM over Intel and an x86 Hypervisor

How To Compare Amazon Ec2 To A Supercomputer For Scientific Applications

Unleashing the Performance Potential of GPUs for Atmospheric Dynamic Solvers

A Study on the Scalability of Hybrid LS-DYNA on Multicore Architectures

HPC Cloud Computing Guide PENGUIN ( ) HPC

How To Build A Cloud Computer

Virtualization Technologies and Blackboard: The Future of Blackboard Software on Multi-Core Technologies

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales

GPU Computing - CUDA

Stovepipes to Clouds. Rick Reid Principal Engineer SGI Federal by SGI Federal. Published by The Aerospace Corporation with permission.

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

Hybrid Cluster Management: Reducing Stress, increasing productivity and preparing for the future

Visit to the National University for Defense Technology Changsha, China. Jack Dongarra. University of Tennessee. Oak Ridge National Laboratory

High Performance Computing in CST STUDIO SUITE

BSC vision on Big Data and extreme scale computing

Enterprise HPC & Cloud Computing for Engineering Simulation. Barbara Hutchings Director, Strategic Partnerships ANSYS, Inc.

VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5

Xeon+FPGA Platform for the Data Center

Auto-Tunning of Data Communication on Heterogeneous Systems

Equalizer. Parallel OpenGL Application Framework. Stefan Eilemann, Eyescale Software GmbH

How To Manage Energy At An Energy Efficient Cost

Energy Management in a Cloud Computing Environment

GPUs for Scientific Computing

Toward a practical HPC Cloud : Performance tuning of a virtualized HPC cluster

A GPU COMPUTING PLATFORM (SAGA) AND A CFD CODE ON GPU FOR AEROSPACE APPLICATIONS

Transcription:

Building an energy dashboard Energy measurement and visualization in current HPC systems Thomas Geenen 1/58 thomas.geenen@surfsara.nl

SURFsara The Dutch national HPC center 2H 2014 > 1PFlop GPGPU accelerators Grid HPC Cloud Hadoop Data Services 2/58 thomas.geenen@surfsara.nl

SURFsara 3/58 thomas.geenen@surfsara.nl

SURFsara 4/58 thomas.geenen@surfsara.nl

Energy consumption 5/58 thomas.geenen@surfsara.nl Dongarra et al. Energy Footprint of Advanced Dense Numerical Linear Algebra using tile algorithms on multicore architecture PowerPack: Intel Xeon Sandy Bridge

Energy consumption 6/58 thomas.geenen@surfsara.nl Dongarra et al. Energy Footprint of Advanced Dense Numerical Linear Algebra using tile algorithms on multicore architecture PowerPack: Intel Xeon Sandy Bridge

Energy consumption 7/58 thomas.geenen@surfsara.nl Dongarra et al. Energy Footprint of Advanced Dense Numerical Linear Algebra using tile algorithms on multicore architecture PowerPack: Intel Xeon Sandy Bridge

Energy consumption 8/58 thomas.geenen@surfsara.nl Dongarra et al. Energy Footprint of Advanced Dense Numerical Linear Algebra using tile algorithms on multicore architecture PowerPack: Intel Xeon Sandy Bridge

Energy consumption 9/58 thomas.geenen@surfsara.nl Dongarra et al. Energy Footprint of Advanced Dense Numerical Linear Algebra using tile algorithms on multicore architecture PowerPack: Intel Xeon Sandy Bridge

Energy measurement Different sources Accuracy Sampling rate Overhead Processing measurements Postprocessing Visualization Interpretation 10/58 thomas.geenen@surfsara.nl

Node sensors 11/58 thomas.geenen@surfsara.nl Running average power limit (RAPL) baseboard management controller (BMC), Intelligent Platform Management Interface (IPMI)

Node sensors Different sources Direct from the CPU Running average power limit (RAPL) Performance Application Programming Interface (PAPI) From component baseboard management controller (BMC) Intel node manager Intelligent Platform Management Interface (IPMI) 14/58 thomas.geenen@surfsara.nl Running average power limit (RAPL) baseboard management controller (BMC), Intelligent Platform Management Interface (IPMI)

RAPL RAPL is not an analog power meter! RAPL uses a software power model running on a helper controller Energy is estimated using hardware performance counters temperature, leakage models and I/O models The model is used for CPU throttling, turbo-boost Values are exposed to users model-specific register (MSR) 15/58 thomas.geenen@surfsara.nl Running average power limit (RAPL) baseboard management controller (BMC), Intelligent Platform Management Interface (IPMI)

RAPL Intel Documentation indicates Energy readings are Updated roughly every millisecond (1 KHz) Rotem et al. show results match actual hardware * 16/58 thomas.geenen@surfsara.nl Rothem: Power-Management Architecture of the Intel Microarchitecture Code-Named Sandy Bridge, IEEE micro, 2012

RAPL More detailed study shows small deviations for different loads 17/58 thomas.geenen@surfsara.nl Hackenberg et al.: Power measurement techniques on standard compute nodes: a quantitative approach, IEEE, 2013

PAPI performance application programming interface (PAPI) 18/58 thomas.geenen@surfsara.nl MSRs can be accessed via /dev/cpu/*/msr

PAPI Performance application programming interface (PAPI) Read special registers (MSR) Performance counter hardware Intel, AMD, NVIDIA, ARM RAPL, APM, NVML, custom Measure energy and Flops, cycles Memory access, cache misses Ivy bridge 11 counters 19/58 thomas.geenen@surfsara.nl MSRs can be accessed via /dev/cpu/*/msr

Profiling applications Time Where is the time spend What is the application doing PAPI (hardware calls) MPI (communication between processes) OpenMP (communication between threads) Couple with energy consumption Same profile 20/58 thomas.geenen@surfsara.nl

Profiling applications 21/58 thomas.geenen@surfsara.nl

Profiling applications 22/58 thomas.geenen@surfsara.nl

Profiling applications 23/58 thomas.geenen@surfsara.nl

Profiling applications 24/58 thomas.geenen@surfsara.nl

25/58 thomas.geenen@surfsara.nl

IPMI BMC 26/58 thomas.geenen@surfsara.nl

IPMI BMC 27/58 thomas.geenen@surfsara.nl

IPMI BMC 28/58 thomas.geenen@surfsara.nl

IPMI BMC Measure energy consumption of other components Baseboard Management Controller (BMC) IPMI Low sample rate 1 4 Hz Overhead Improves On chip averaging Higher sample rate Still low 29/58 thomas.geenen@surfsara.nl Baseboard management controller (BMC) Intelligent Platform Management Interface (IPMI)

Reporting What do we want to present to the end user Can use PAPI and tools for detailed analysis Misses part of the energy consumption Information on per-run level Energy consumption per run (total) More general view (total per component) Timeline Correlate with other data PAPI and BMC 30/58 thomas.geenen@surfsara.nl

SLURM Use the job scheduler to collect energy consumption data Typical situation on HPC systems Many users on the same system Share resources Have to schedule jobs Job is put in a queue Runs when resources are available SLURM Simple Linux Utility for Resource Management Open source 31/58 thomas.geenen@surfsara.nl

SLURM Use the job scheduler to collect energy consumption data Modular design Plugins for monitoring Energy consumption RAPL IPMI Grand total Timeline Uses additional threads to collect data (IPMI) 32/58 thomas.geenen@surfsara.nl

SLURM Use the job scheduler to collect energy consumption data 33/58 thomas.geenen@surfsara.nl

SLURM Use the job scheduler to collect energy consumption data Totals in database Timeseries in file HDF5 (XML) Scalable data format Individual sensors (IPMI) RAPL External sensor 34/58 thomas.geenen@surfsara.nl

Conclusions Many sensors available on current cluster hardware Different levels of detail Many profilers available Common api PAPI Combine with performance metrics Present totals to users Combine different measurements in one file (time series) Slurm tools 35/58 thomas.geenen@surfsara.nl

QUESTIONS? 36/58 thomas.geenen@surfsara.nl