D5.6 Prototype demonstration of performance monitoring tools on a system with multiple ARM boards Version 1.0

Similar documents
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM

Building an energy dashboard. Energy measurement and visualization in current HPC systems

STUDY OF PERFORMANCE COUNTERS AND PROFILING TOOLS TO MONITOR PERFORMANCE OF APPLICATION

Performance Monitoring of Parallel Scientific Applications

Part I Courses Syllabus

DB2 Connect for NT and the Microsoft Windows NT Load Balancing Service

Monitoring Infrastructure for Superclusters: Experiences at MareNostrum

End-user Tools for Application Performance Analysis Using Hardware Counters

Pedraforca: ARM + GPU prototype

NVIDIA Tools For Profiling And Monitoring. David Goodwin

A Brief Survery of Linux Performance Engineering. Philip J. Mucci University of Tennessee, Knoxville

White Paper. Real-time Capabilities for Linux SGI REACT Real-Time for Linux

Using PAPI for hardware performance monitoring on Linux systems

Hadoop. Sunday, November 25, 12

Performance Counter. Non-Uniform Memory Access Seminar Karsten Tausche

Medical Device Design: Shorten Prototype and Deployment Time with NI Tools. NI Technical Symposium 2008

How To Build A Cloud Computer

IOVU-571N ARM-based Panel PC

GETTING STARTED WITH ANDROID DEVELOPMENT FOR EMBEDDED SYSTEMS

RCL: Software Prototype

PAPI - PERFORMANCE API. ANDRÉ PEREIRA ampereira@di.uminho.pt

Virtual Machines.

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A Scalable Network Monitoring and Bandwidth Throttling System for Cloud Computing

GEDAE TM - A Graphical Programming and Autocode Generation Tool for Signal Processor Applications

ON THE IMPLEMENTATION OF ADAPTIVE FLOW MEASUREMENT IN THE SDN-ENABLED NETWORK: A PROTOTYPE

ORACLE DATABASE 10G ENTERPRISE EDITION

Optimizing Shared Resource Contention in HPC Clusters

MAQAO Performance Analysis and Optimization Tool

Machine Learning Algorithm for Pro-active Fault Detection in Hadoop Cluster

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Equalizer. Parallel OpenGL Application Framework. Stefan Eilemann, Eyescale Software GmbH

BSC vision on Big Data and extreme scale computing

The virtualization of SAP environments to accommodate standardization and easier management is gaining momentum in data centers.

Linux Performance Optimizations for Big Data Environments

How To Monitor Performance On A Microsoft Powerbook (Powerbook) On A Network (Powerbus) On An Uniden (Powergen) With A Microsatellite) On The Microsonde (Powerstation) On Your Computer (Power

Operating System for the K computer

GPU Profiling with AMD CodeXL

Virtualization for Cloud Computing

Base One's Rich Client Architecture

Weekly Report. Hadoop Introduction. submitted By Anurag Sharma. Department of Computer Science and Engineering. Indian Institute of Technology Bombay

Exploring Oracle E-Business Suite Load Balancing Options. Venkat Perumal IT Convergence

Weighted Total Mark. Weighted Exam Mark

Embedded and mobile fingerprint. technology. FingerCell EDK

Prototyping Connected-Devices for the Internet of Things. Angus Wong

Integration of the OCM-G Monitoring System into the MonALISA Infrastructure

How To Visualize Performance Data In A Computer Program

JReport Server Deployment Scenarios

The ganglia distributed monitoring system: design, implementation, and experience

Clusters: Mainstream Technology for CAE

Network for Sustainable Ultrascale Computing (NESUS)

Cloud Computing with Red Hat Solutions. Sivaram Shunmugam Red Hat Asia Pacific Pte Ltd.

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

Monitoring DKRZ

Detection of Distributed Denial of Service Attack with Hadoop on Live Network

Regional SEE-GRID-SCI Training for Site Administrators Institute of Physics Belgrade March 5-6, 2009

Introduction to Gluster. Versions 3.0.x

Application Performance Monitoring of a scalable Java web-application in a cloud infrastructure

Grid Scheduling Dictionary of Terms and Keywords

D5.5 Intermediate Report on Porting and Tuning of System Software to ARM Architecture Version 1.0

HPC performance applications on Virtual Clusters

Tools and strategies to monitor the ATLAS online computing farm

Perf Tool: Performance Analysis Tool for Linux

A SURVEY ON AUTOMATED SERVER MONITORING

Multi-Threading Performance on Commodity Multi-Core Processors

Visualizing gem5 via ARM DS-5 Streamline. Dam Sunwoo ARM R&D December 2012

Performance Analysis and Optimization Tool

PEPPERDATA OVERVIEW AND DIFFERENTIATORS

Big Data Management in the Clouds and HPC Systems

CDH AND BUSINESS CONTINUITY:

Embedded Linux RADAR device

Interoperability Testing and iwarp Performance. Whitepaper

Automating Big Data Benchmarking for Different Architectures with ALOJA

Intel DPDK Boosts Server Appliance Performance White Paper

WHITE PAPER. ClusterWorX 2.1 from Linux NetworX. Cluster Management Solution C ONTENTS INTRODUCTION

Industry First X86-based Single Board Computer JaguarBoard Released

CHAPTER 1 INTRODUCTION

Open Source in Network Administration: the ntop Project

Using Linux Clusters as VoD Servers

Dynamic Thermal Management for Distributed Systems

Rackspace Cloud Databases and Container-based Virtualization

Oracle BI Publisher Enterprise Cluster Deployment. An Oracle White Paper August 2007

Cluster, Grid, Cloud Concepts

Manjrasoft Market Oriented Cloud Computing Platform

RCL: Design and Open Specification

Cloud Computing. Up until now

INTRODUCTION ADVANTAGES OF RUNNING ORACLE 11G ON WINDOWS. Edward Whalen, Performance Tuning Corporation

Software Distributed Shared Memory Scalability and New Applications

Transcription:

D5.6 Prototype demonstration of performance monitoring tools on a system with multiple ARM boards Document Information Contract Number 288777 Project Website www.montblanc-project.eu Contractual Deadline M24 Dissemintation Level PU Nature Prototype Coordinator Alex Ramirez (BSC) Contributors Chris Adeniyi-Jones (ARM) Reviewers Petar Radojković (BSC) Keywords Performance monitoring tools, PAPI, perf, Ganglia Notices: The research leading to these results has received funding from the European Community s Seventh Framework Programme (FP7/2007-2013) under grant agreement n o 288777 c 2013 Mont-Blanc Consortium Partners. All rights reserved.

Change Log Version Description of Change v0.1 Version released to Internal Reviewer v1.0 Final version sent to EU 2

Contents Executive Summary 4 1 Introduction 5 2 Performance Application Programming Interface 6 3 Performance Monitoring Tools 6 3.1 perf............................................. 6 3.2 Ganglia........................................... 6 4 Demonstration 7 3

Executive Summary In this deliverable, we present the current status of the low-level software components required for gathering information about the performance of HPC applications running on ARM-based systems. This work will enable performance monitoring tools to be ported to the Mont-Blanc prototype. 4

1 Introduction With its aim to build a large-scale HPC cluster using embedded mobile technology, a substantial part of the Mont-Blanc project consists of porting and tuning applications to the target architecture. Tuning these applications requires detailed performance information and tools to visualize that information. One of the goals of Work package 5 of the Mont-Blanc project is to enable the use of ARM performance monitoring hardware for detailed analysis of performance of HPC applications. This report summarizes our work in this regard. The rest of this report is organized as follows: Section 2 summarizes the status of the Performance Application Programming Interface (PAPI) [pap] for the ARM architecture. PAPI library provides a consistent interface and methodology for using the performance counter hardware found in modern microprocessors. Section 3 describes various performance monitoring tools for the ARM architecture. Section 4 outlines the plan for a demonstration of performance monitoring tools running on several ARM-based boards. 5

2 Performance Application Programming Interface The current release of PAPI is 5.2.0 and supports the ARM architecture using the perf events driver on Linux kernels 2.6.32 and above. The perf events interface is built into these kernels and can be used directly requiring no patches. PAPI provides two interfaces to the underlying counter hardware; a simple, high level interface for the acquisition of simple measurements and a fully programmable, low level interface directed towards users with more sophisticated needs. The low level PAPI interface deals with hardware events in groups called EventSets. EventSets reflect how the counters are most frequently used, such as taking simultaneous measurements of different hardware events and relating them to one another. 3 Performance Monitoring Tools 3.1 perf The perf [per] userspace tools use the perf events kernel ABI to provide access to hardware performance counters. They can be used as a basis for profiling applications to trace dynamic control flow and identify hotspots. 3.2 Ganglia Ganglia [Gan] is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes. We have installed the Ganglia monitoring system on an Arndaleboard and tested that the default metrics can be measured and reported via the Ganglia web-frontend. We have also tested the addition of custom metrics such as CPU temperature and CPU clock frequency. 6

Figure 1: Example output from the Ganglia web-front end monitoring a single Arndaleboard 4 Demonstration We plan to demonstrate the Ganglia tool being used to monitor several Arndaleboards. We will also show how the userspace perf tools can be used for low-level profiling of applications. 7

References [Gan] Ganglia monitoring system. http://ganglia.sourceforge.net/. [pap] Performance Application Programming Interface. http://icl.cs.utk.edu/papi. [per] perf: Linux profiling with performance counters. http://perf.wiki.kernel.org/ index.php/main_page. 8