Lustre Monitoring with OpenTSDB




Lustre Monitoring with OpenTSDB 2015/9/22 DataDirect Networks Japan, Inc. Shuichi Ihara

2 Lustre Monitoring Background
- Lustre is a black box. Users and administrators want to know what is going on inside it, and to spot misbehaving ("crazy") jobs in advance to prevent slowdowns.
- Lustre statistics are valuable big data: useful not only for monitoring and visualization, but also for analysis. Predictive operations become possible, and the data helps optimize applications and data relocation.
- Open-source-based monitoring tools: open source is common in HPC systems and straightforward to deploy. Various combinations are possible and will enable new use cases.

3 Components of Lustre Monitoring
- Flexible data collector (monitoring agent): collects statistics from Lustre /proc and sends them to the monitoring server over the network. Runs on servers as well as clients and routers.
- Data store: receives stats from the agents and stores them in a database, where they become historic, query-able data.
- Interface for data analytics: the collected data serves not only visualization but also analytics, e.g. application I/O analytics and filesystem analytics.

4 Flexible Data Collector (Monitoring Agent)
- Many agents exist for collecting Lustre performance statistics; collectd is a reasonable option. It is actively developed, supported, and documented; runs on many enterprise/HPC systems; is written in C with over 90 plugins available; and supports many backend databases for the data store.
- Unfortunately, a Lustre plugin was not available, so we developed one (a configuration sketch follows below).
- Will we publish the lustre-plugin code? Yes, we want to, but several points were still under discussion at LUG15 (e.g. will stats live in /proc or /sys in the future?).
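As a rough illustration, the collectd wiring could look like the snippet below. The write_tsdb output plugin ships with stock collectd (5.4+); the "lustre" plugin name and its DefinitionFile option are assumptions for this sketch, since the plugin's actual configuration keywords were not published.

# LoadPlugin lustre              <- hypothetical DDN Lustre plugin
# <Plugin lustre>
#   DefinitionFile "/etc/collectd/lustre-definition.xml"
# </Plugin>

LoadPlugin write_tsdb            # standard collectd OpenTSDB writer
<Plugin write_tsdb>
  <Node>
    Host "tsd.example.com"       # the OpenTSDB TSD to write to
    Port "4242"
  </Node>
</Plugin>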

5 Collectd Lustre Plugin Design
- Our framework consists of two core components: a core logic layer (the Lustre plugin itself) and a statistics definition layer (an XML file plus an XML parser).
- XML for Lustre /proc information: a single XML file holds all definitions for Lustre data collection, so there is no need to maintain massive, error-prone scripts. It is extendable without changes to the core logic layer, and it makes it easy to support multiple Lustre versions and distributions in the same cluster.
- The XML file is automatically generated from an m4-style definition file. A parsing sketch follows below.
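To illustrate the two-layer split, here is a minimal Python sketch. The XML fragment uses an invented schema (the plugin's real schema may differ): an entry declares where a statistic lives and how to name it, and a tiny parser turns that into collection rules with no per-stat scripting.

import xml.etree.ElementTree as ET

# Hypothetical definition fragment; element and attribute names are
# assumptions for this sketch, not the plugin's actual schema.
DEFINITION = """
<definition lustre_version="2.5">
  <entry path="/proc/fs/lustre/mdt/*/md_stats" metric="md_stats">
    <field name="open"/>
    <field name="close"/>
    <field name="getattr"/>
  </entry>
</definition>
"""

root = ET.fromstring(DEFINITION)
for entry in root.iter("entry"):
    # Each <entry> becomes one collection rule: a /proc glob, a metric
    # name, and the fields to extract from that file.
    fields = [f.get("name") for f in entry.iter("field")]
    print(entry.get("path"), entry.get("metric"), fields)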

6 Architecture of the Lustre Plugin and Enhanced collectd Configuration
[Architecture diagram] XML-driven collectd plugins (Lustre 2.1 XML, DDN Lustre 2.5 XML, GPFS XML, DDN hardware XML) sit on top of the DDN-enhanced collectd core and its standard plugin framework (CPU, memory, network, and many other statistic types). Output plugins (OpenTSDB, Graphite, Ganglia, Zabbix, Nagios) feed the corresponding servers (OpenTSDB server, Graphite server, gmetad, Zabbix server, Nagios server), so many datastores are supported.

7 Scalable Backend Data Store
- RRD- and SQL-based data stores do not scale. RRD works well on small systems, but writing 10M statistics into files is very challenging (a few million IOPS!). SQL is faster than RRD but still hits the next scalability limit, and the database design is complex.
- NoSQL-based key-value stores are the best fit: OpenTSDB/HBase. Keys, values, and tags map naturally onto Lustre statistics, with no need for a complex database schema.
- Retention awareness is needed: a policy for managing the stats archive.

8 OpenTSDB: Open Source Time Series Database
- OpenTSDB is a distributed database (multiple TSDs) running on top of HBase. It scales very well, depending on the underlying Hadoop cluster; storing trillions of data points is easily possible.
- Many ingest and query options are available: a lot of open-source agents (collectd has an OpenTSDB plugin), the GUI that is part of OpenTSDB or other open-source visualization tools (Grafana), and a CLI and HTTP API (sketched below).
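For example, the HTTP API accepts JSON data points on /api/put and serves aggregated reads on /api/query. A minimal Python sketch, assuming a TSD listening on localhost:4242 and the requests library (the metric and tags mirror the md_stats example on the next slide):

import time
import requests

TSD = "http://localhost:4242"

# Ingest: POST one or more data points as JSON to /api/put.
points = [{
    "metric": "md_stats",
    "timestamp": int(time.time()),   # Unix epoch seconds
    "value": 2,
    "tags": {"optype": "getattr",
             "fs_name": "scratch3",
             "mdt_index": "MDT0000"},
}]
requests.post(f"{TSD}/api/put", json=points).raise_for_status()

# Query: the summed getattr rate over the last hour.
resp = requests.get(f"{TSD}/api/query", params={
    "start": "1h-ago",
    "m": "sum:rate:md_stats{optype=getattr,fs_name=scratch3}",
})
for series in resp.json():
    # Each series carries its metric, tags, and {timestamp: value} points.
    print(series["metric"], series["tags"], list(series["dps"].items())[:3])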

9 Adapting Lustre Stats to Time Series
Lustre and time series...
- Metrics: each metric is defined from a Lustre /proc entry (e.g. /proc/fs/lustre/mdt/*/md_stats).
- Tags: the items within each proc entry (e.g. open, close, getattr in md_stats), plus additional items such as the filesystem name and the OST/MDT index number.
- Data points: collectd sends the packed metric, tags, and value as one data point, e.g. tsdb_name md_stats, tsdb_tags optype=getattr fs_name=scratch3 mdt_index=MDT0000, value 2. A sketch of this mapping follows.
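A minimal sketch of that mapping in Python, assuming md_stats lines of the usual "<op> <count> samples [reqs]" form and target directories named like scratch3-MDT0000; the real plugin drives this from the XML definitions rather than hard-coding it.

import glob
import os
import time

def md_stats_datapoints():
    for path in glob.glob("/proc/fs/lustre/mdt/*/md_stats"):
        # Target directory names look like "scratch3-MDT0000"
        # (assumes the filesystem name contains no dash).
        fs_name, mdt_index = os.path.basename(os.path.dirname(path)).split("-")
        with open(path) as f:
            for line in f:
                fields = line.split()
                if len(fields) < 2 or fields[0] == "snapshot_time":
                    continue
                optype, samples = fields[0], int(fields[1])
                # One data point per operation type, per target.
                yield {
                    "metric": "md_stats",
                    "timestamp": int(time.time()),
                    "value": samples,
                    "tags": {"optype": optype,
                             "fs_name": fs_name,
                             "mdt_index": mdt_index},
                }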

10 1.7 Trillion Data Points in 24 Hours
- Developed a stress plugin for collectd that generates dummy stats, as a regression-test and benchmarking tool for Lustre monitoring. It works in conjunction with the other collectd plugins (a toy version is sketched below).
- Stress test of the Lustre plugin and OpenTSDB: we set up a two-node Hadoop/HBase cluster with OpenTSDB running on top of it. 16 metrics generators (servers) produced a total of 20M stats every second and sent them to the OpenTSDB servers.
- The setup passed the 24-hour stress test and stored 1.7 trillion stats without any problems.
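This is not the actual stress plugin (which lives inside collectd and is written in C); a toy Python equivalent of the idea is shown below, generating batches of dummy Lustre-like data points once per second. All names and sizes here are illustrative.

import itertools
import random
import time

def dummy_batch(n, ts):
    """Build n fake md_stats samples for one collection interval."""
    return [{
        "metric": "md_stats",
        "timestamp": ts,
        "value": random.randint(0, 10000),
        "tags": {"optype": random.choice(["open", "close", "getattr"]),
                 "fs_name": "scratch%d" % (i % 4),
                 "mdt_index": "MDT%04x" % (i % 16)},
    } for i in range(n)]

if __name__ == "__main__":
    for tick in itertools.count():
        batch = dummy_batch(20000, int(time.time()))
        # ... POST `batch` to the TSDs via /api/put (see earlier sketch) ...
        time.sleep(1)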

11 Application-Aware I/O Monitoring
- We now have a scalable backend data store, OpenTSDB, able to store any type of metric we want to collect.
- Lustre jobstats is awesome, but needs integration. The feature is useful, yet administrators are not interested in I/O stats keyed on JOBID alone (think of array jobs, or jobs associated with other jobs, e.g. a genomics pipeline).
- Lustre performance stats should be associated with all of JOBID/GID/UID/NID, or any custom IDs.

12 A Realistic Example on Lustre
What happened here? Who, or what jobs, caused the burst I/O? Often it is not a single job or a single user. Thus: what are the top-10 users, groups, and jobs driving I/O on an active Lustre filesystem?

13 TopN Query with OpenTSDB
Implemented a TopN query based on OpenTSDB:

# topn -m mdt_jobstats_samples -r 1434973270 -s -t slurm_job_uid
Time(ms): 2015-06-22 20:41:25.000000 Interval: 5(s)
rate      fs_name   mdt_index  slurm_job_uid  fqdn   slurm_job_id  slurm_job_gid
15264.00  scratch1  MDT0000    1044           mds06  13290         1044
15076.80  scratch1  MDT0000    1045           mds06  13286         1045
13812.40  scratch1  MDT0000    1049           mds06  -             1049
13456.80  scratch1  MDT0000    1048           mds06  -             1048
 9180.80  scratch1  MDT0000    1050           mds06  13285         1050
 8909.40  scratch1  MDT0000    1047           mds06  13289         1047
 8779.60  scratch1  MDT0000    502            mds06  -             503
 5049.00  scratch1  MDT0000    501            mds06  13291         502

Not only monitoring, but also a diagnostic tool! Queries can be issued live or for specific time periods (rewind feature). Lustre job stats associated with all of UID/GID/JOBID/NID (or a custom ID), rather than just one, are stored in OpenTSDB. A query sketch follows below.
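The topn tool itself was not published with the slides; as a rough illustration of the same idea, the Python sketch below groups an OpenTSDB rate query by a tag ({tag=*} yields one series per tag value) and ranks the groups. It assumes the requests library, a TSD on port 4242, and that dps arrive in timestamp order, as OpenTSDB emits them.

import requests

def topn(metric, tag, n=10, tsd="http://localhost:4242"):
    resp = requests.get(f"{tsd}/api/query", params={
        "start": "5m-ago",
        "m": f"sum:rate:{metric}{{{tag}=*}}",   # one series per tag value
    })
    resp.raise_for_status()
    series = resp.json()
    # Rank each tag value's series by its most recent data point.
    latest = lambda s: list(s["dps"].values())[-1] if s["dps"] else 0.0
    for s in sorted(series, key=latest, reverse=True)[:n]:
        print(f"{latest(s):12.2f}  {s['tags'].get(tag)}")

topn("mdt_jobstats_samples", "slurm_job_uid")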

14 User Reference (1): Tokyo Institute of Technology
- More than 1.2M stats flow into a single monitoring server every 30 seconds: 14 Lustre servers and 154 OSTs across 3 Lustre filesystems, with 1700 clients mounting all 3 filesystems.
- Lustre 2.1 is running, so no jobstats; instead, client-based stats are collected from the export directories in /proc on the Lustre servers (total stats = #OSTs x #clients x #metrics).
- Demonstrated more than half a trillion stats stored into OpenTSDB over 6 months.

15 User Reference (2): Okinawa Institute of Science and Technology Graduate University
- 3PB Lustre filesystem: a single filesystem with 12 OSSs, 108 OSTs, and over 400 clients, based on Lustre 2.5.
- Lustre jobstats integrated with SLURM, running on a production system: a unique jobstats configuration in which the collectd Lustre plugin runs on the existing jobstats framework, and job stats associated with all of UID/GID/JOBID are stored into OpenTSDB.
- The TopN feature helps find the root cause of unexpected burst I/O (i.e. who and what jobs caused the problem).

16 Conclusions
- Developed a new Lustre plugin for collectd. It is flexible, extendable, and easy to maintain.
- Designed a Lustre monitoring framework based on the Lustre collectd plugin and OpenTSDB. The framework is running on several production systems to resolve today's Lustre monitoring limitations.
- Demonstrated storing 1.7 trillion data points into OpenTSDB in 24 hours. We will continue scalability testing toward storing multiple trillions of data points within a few hours.
- Started investigating the re-use of log data for analysis.

17 Thank you!