Report on Project: Advanced System Monitoring for the Parallel Tools Platform (PTP)

Similar documents
Mitglied der Helmholtz-Gemeinschaft. System monitoring with LLview and the Parallel Tools Platform

A highly configurable and efficient simulator for job schedulers on supercomputers

Welcome to the. Jülich Supercomputing Centre. D. Rohe and N. Attig Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich

JUROPA Linux Cluster An Overview. 19 May 2014 Ulrich Detert

A QUICK OVERVIEW OF THE OMNeT++ IDE

Online Transaction Processing in SQL Server 2008

New Features in Primavera P6 EPPM 16.1

Slide.Show Quick Start Guide

Designing portal site structure and page layout using IBM Rational Application Developer V7 Part of a series on portal and portlet development

LOCKSS on LINUX. CentOS6 Installation Manual 08/22/2013

DeployStudio Server Quick Install

Product Comparison List

Exporting from WebSphere Business Modeler Unit 23

Communiqué 4. Standardized Global Content Management. Designed for World s Leading Enterprises. Industry Leading Products & Platform

Microsoft Project 2010

Technical Report. Implementation and Performance Testing of Business Rules Evaluation Systems in a Computing Grid. Brian Fletcher x

COURSE CONTENT Big Data and Hadoop Training

Microsoft Enterprise Search for IT Professionals Course 10802A; 3 Days, Instructor-led

Virto Pivot View for Microsoft SharePoint Release User and Installation Guide

Distributed Computing and Big Data: Hadoop and MapReduce

Parallel file I/O bottlenecks and solutions

Diagram 1: Islands of storage across a digital broadcast workflow

Mathematical Libraries on JUQUEEN. JSC Training Course

JustClust User Manual

Configuring Celerra for Security Information Management with Network Intelligence s envision

VectaStar NMS A GUIDE TO VECTASTAR NETWORK MANAGEMENT

Reporting Services. White Paper. Published: August 2007 Updated: July 2008

Developing Parallel Applications with the Eclipse Parallel Tools Platform

DEVELOPMENT OF AN ANALYSIS AND REPORTING TOOL FOR ORACLE FORMS SOURCE CODES

Access Teams with Microsoft Dynamics CRM 2013

SQL Server 2008 Performance and Scale

UFTP High-performance data transfer for UNICORE

Using DeployR to Solve the R Integration Problem

Building Lab as a Service (LaaS) Clouds with TestShell

Department of Veterans Affairs. Open Source Electronic Health Record Services

ILMT Central Team. Performance tuning. IBM License Metric Tool 9.0 Questions & Answers IBM Corporation

10231B: Designing a Microsoft SharePoint 2010 Infrastructure

SCADA Questions and Answers

Chapter 7. Using Hadoop Cluster and MapReduce

Mathematical Libraries and Application Software on JUROPA and JUQUEEN

Integrating Microsoft Word with Other Office Applications

ProxySG TechBrief Implementing a Reverse Proxy

WebSphere Business Monitor

POOSL IDE User Manual

CA Spectrum. Virtual Host Manager Solution Guide. Release 9.3

Parallel Visualization of Petascale Simulation Results from GROMACS, NAMD and CP2K on IBM Blue Gene/P using VisIt Visualization Toolkit

NNMi120 Network Node Manager i Software 9.x Essentials

<no narration for this slide>

NoSQL and Hadoop Technologies On Oracle Cloud

Introduction to NaviGenie SDK Client API for Android

Rocket AS v6.3. Benefits of upgrading

User's Guide - Beta 1 Draft

Big Data Visualization with JReport

automates system administration for homogeneous and heterogeneous networks

Introduction to Synoptic

SourceAnywhere Service Configurator can be launched from Start -> All Programs -> Dynamsoft SourceAnywhere Server.

Microsoft Services Exceed your business with Microsoft SharePoint Server 2010

Advanced planning / change impact features

Attix5 Pro Server Edition

ANDROID APPS DEVELOPMENT FOR MOBILE AND TABLET DEVICE (LEVEL I)

IBM Platform Computing : infrastructure management for HPC solutions on OpenPOWER Jing Li, Software Development Manager IBM

What s New in IBM Web Experience Factory IBM Corporation

Datasheet iscsi Protocol

How To Create An Integrated Visualization For A Network Security System (For A Free Download)

CVE-401/CVA-500 FastTrack

WebSphere Business Monitor

Table Of Contents: I. MapifyPro: Installation. II. General Overview & License Activation. III. Map Settings. IV. Map Location Settings. V.

Google

Elgg 1.8 Social Networking

Ansible in Depth WHITEPAPER. ansible.com

MicroStrategy Desktop MicroStrategy 10.2: New features overview. microstrategy.com 1

IBM Operational Decision Manager Version 8 Release 5. Getting Started with Business Rules

<Insert Picture Here> Oracle Application Express 4.0

LOCKSS on LINUX. Installation Manual and the OpenBSD Transition 02/17/2011

Integration of the OCM-G Monitoring System into the MonALISA Infrastructure

Running a Workflow on a PowerCenter Grid

Important Notice. (c) Cloudera, Inc. All rights reserved.

vrealize Operations Manager User Guide

Software-defined Storage Architecture for Analytics Computing

Continuous Integration

Bringing Big Data into the Enterprise

DAVE Usage with SVN. Presentation and Tutorial v 2.0. May, 2014

Managing a local Galaxy Instance. Anushka Brownley / Adam Kraut BioTeam Inc.

Tableau Your Data! Wiley. with Tableau Software. the InterWorks Bl Team. Fast and Easy Visual Analysis. Daniel G. Murray and

SharePoint Comparison of Features

IBM Tivoli Netcool Configuration Manager 6.3 Administration and Implementation

Tivoli Enterprise Portal

Monitoring PostgreSQL database with Verax NMS

NetIQ Operations Center 5: The Best IT Management Tool in the World Lab

Red Hat Enterprise Portal Server: Architecture and Features

ZEN LOAD BALANCER EE v3.04 DATASHEET The Load Balancing made easy

Transcription:

Mitglied der Helmholtz-Gemeinschaft Report on Project: Advanced System Monitoring for the Parallel Tools Platform (PTP) September, 2014 Wolfgang Frings and Carsten Karbach

Project progress Server caching Customized display layouts Optimized job localization Usage history and scheduling prediction September, 2014 Wolfgang Frings and Carsten Karbach 2 30

Content 1 LLview/PTP overview 2 Server caching 3 Display layouts 4 Optimized job localization 5 Embed LLview diagrams September, 2014 Wolfgang Frings and Carsten Karbach 3 30

Mitglied der Helmholtz-Gemeinschaft Part I: LLview/PTP overview September, 2014 Wolfgang Frings and Carsten Karbach

1. LLview/PTP overview LLview Visualizes supercomputer status on a single screen Source: Screenshot LLview for JUQUEEN September, 2014 Wolfgang Frings and Carsten Karbach 5 30

1. LLview/PTP overview PTP Parallel Tools Platform What is PTP? IDE for parallel application development Based on Eclipse Open-source project Developers: IBM, U.Oregon, UTK, Heidelberg University, NCSA, UIUC, JSC,... September, 2014 Wolfgang Frings and Carsten Karbach 6 30

1. LLview/PTP overview Monitoring Architecture Supercomputer Client llview-client update request LML data gatherer LML_da LML response PTP September, 2014 Wolfgang Frings and Carsten Karbach 7 30

Mitglied der Helmholtz-Gemeinschaft Part II: Server caching September, 2014 Wolfgang Frings and Carsten Karbach

2. Server caching Architecture multiple users on the same target system cache LML file in public directory (e.g. /tmp), use LML cache as data source default: each client triggers independent status data update caching: daemon retrieves status data, clients use cached data Cache workflow September, 2014 Wolfgang Frings and Carsten Karbach 9 30

2. Server caching Implementation LML caching is included in latest Eclipse release Luna (July 14) from http://www.eclipse.org/downloads/ Implementation is documented on bug 427386 Install daemon, once for each system 1 Create monitoring connection and start it 2 SSH to target system, switch to.eclipsesettings/util/install 3 Call install script: perl installcrontab.pl 4 Run util/install/lml da crontab.sh every minute September, 2014 Wolfgang Frings and Carsten Karbach 10 30

2. Server caching Install step 1 Start monitoring connection September, 2014 Wolfgang Frings and Carsten Karbach 11 30

2. Server caching Install steps 2, 3, 4 Install daemon and run it September, 2014 Wolfgang Frings and Carsten Karbach 12 30

2. Server caching Client usage Default update mode: active caching If no daemon installed no caching Users can deactivate caching via Force Data Update September, 2014 Wolfgang Frings and Carsten Karbach 13 30

2. Server caching Advantages of caching Faster response time System Cache [s] No Cache [s] JUDGE 1.1 2.8 JUROPA 5.8 11.1 JUQUEEN 1.7 34.4 Recording of history data is possible Enhancement with data not accessible to normal user Decreased load for the system September, 2014 Wolfgang Frings and Carsten Karbach 14 30

Mitglied der Helmholtz-Gemeinschaft Part III: Display layouts September, 2014 Wolfgang Frings and Carsten Karbach

3. Display layouts Customized LML layouts understand system architecture and hierarchy map topology into LML layout advantages: level of detail, automatic job filtering, display node names, improved performance workplan: tutorial on LML layouts, contact system administrators of partner XSEDE/PRACE sites, support writing of LML layouts, ask for feedback September, 2014 Wolfgang Frings and Carsten Karbach 16 30

3. Display layouts Example JUDGE Default layout September, 2014 Wolfgang Frings and Carsten Karbach 17 30

3. Display layouts Example JUDGE Customized layout alternating background colors improved grid for square display highlighted GPUs September, 2014 Wolfgang Frings and Carsten Karbach 18 30

3. Display layouts Example JUROPA Default layout September, 2014 Wolfgang Frings and Carsten Karbach 19 30

3. Display layouts Example JUROPA Customized layout four level hierarchy: partition, rack, node, core level of detail (only with layout) better overview September, 2014 Wolfgang Frings and Carsten Karbach 20 30

3. Display layouts Example JUQUEEN Default layout September, 2014 Wolfgang Frings and Carsten Karbach 21 30

3. Display layouts Example JUQUEEN Customized layout colored rows do not display rack names September, 2014 Wolfgang Frings and Carsten Karbach 22 30

3. Display layouts Scalability 1 Scalable server scripts LML da Select only required data 2 Scalable data format LML Data compression Redundancy avoidance Uses system hierarchy 3 Scalable visualization PTP Filter status data Levels of detail Zoom function September, 2014 Wolfgang Frings and Carsten Karbach 23 30

Mitglied der Helmholtz-Gemeinschaft Part IV: Optimize job localization September, 2014 Wolfgang Frings and Carsten Karbach

4. Optimize job localization Job selection raised in bug 403060 allow selection of multiple jobs keep selected job selected until it is de-selected mark entire connected area of each job Source: https://bugs.eclipse.org/bugs/attachment.cgi?id=228316 September, 2014 Wolfgang Frings and Carsten Karbach 25 30

Mitglied der Helmholtz-Gemeinschaft Part V: Embed LLview diagrams September, 2014 Wolfgang Frings and Carsten Karbach

5. Embed LLview diagrams LLview diagrams histograms prediction load history Challenges Re-implementation in Java? Double update for LLview and PTP September, 2014 Wolfgang Frings and Carsten Karbach 27 30

5. Embed LLview diagrams First implementation step September, 2014 Wolfgang Frings and Carsten Karbach 28 30

5. Embed LLview diagrams Evaluation Advantages expecting little implementation effort single source of diagram implementation use of SVG graphics within PTP and web client SVG graphics can contain tags for interactive data analysis Problems and risks adding LLview-client to PTP source, license analysis of SVG performance and support September, 2014 Wolfgang Frings and Carsten Karbach 29 30

Contact E-mail: c.karbach@fz-juelich.de, w.frings@fz-juelich.de PTP http://www.eclipse.org/ptp/ LLview http://www.fz-juelich.de/jsc/llview September, 2014 Wolfgang Frings and Carsten Karbach 30 30