EUMEDGrid-Support Supporting EUMEDGRID-Support e-infrastructure sustainability Overview of monitoring tools




Fulvio Galeazzi, GARR
Amman, November 24th 2011
EUMEDGRID-Support ROC-training school

Contents
This will be a hands-on session, describing:
- A tool to describe your site: GOC-DB
- A tool to monitor your site: Nagios
- A tool to perform simple network measurements: Smokeping
We will not go into much detail, but will provide more information for the curious ones.

1. [ GOC-DB ]

GOC-DB: what it is
- Grid Operations Center DataBase
- Stores (static) site information (responsible persons, email addresses, resources, ...)
- It is very important to take a minute to fill it in correctly, since other tools depend on it

GOC-DB: main page https://gocdb.africa-grid.org/portal/index.php

GOC-DB: requesting a role
Write access to GOCDB is restricted.
- Make sure your X509 certificate is loaded in the browser, then click Manage Roles in the left column
- Choose the site
- Choose the privilege
- Send an email to Riccardo, Mario, Fulvio

GOC-DB: browse sites

GOC-DB: endpoints
Each site provides one or more endpoints, or services: sitebdii, UI, CE, SE, LFC, WMS, LB, ...
Select a site and scroll down to see its endpoints, namely the services installed at that site.

GOC-DB: add and describe endpoint
Site admins can insert a new endpoint. Please pay attention to the Service Type: your CE should be identified as CREAM-CE!
Setting Monitored to Y triggers Nagios monitoring.

2. [ Nagios ]

What is Nagios?
Nagios is the tool to monitor:
- Availability: is the site/service up and properly configured?
- Reliability: is the site/service really working?
Nagios runs automatically:
- Gets the site description (and downtimes) from GOCDB
- Schedules a number of tests (ping, LDAP queries, real Grid jobs)
Moreover, Nagios offers a web interface to get to test results, history, ...

Nagios pre-requisites
If you would like your site to be green, make sure that:
- The site is properly described in GOCDB, the site status is Certified and the service status is Monitored
- The site supports the ops VO: not only on services, but also on WNs
- Services are up, running and functional
Note: please set your site to Certified in GOCDB (and/or a service to Monitored) only when they are really working for the ops VO.
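The two GOCDB conditions above (site Certified, service Monitored) can be sketched as a simple filter. This is only an illustration: the Site and Endpoint classes, the field names and the example hostnames are hypothetical and do not reflect the real GOCDB schema or the way Nagios actually reads it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Endpoint:
    hostname: str          # hypothetical example hostname
    service_type: str      # e.g. "CREAM-CE", "SE", "WMS"
    monitored: bool        # stands in for the Monitored flag (Y/N)


@dataclass
class Site:
    name: str
    certified: bool        # True when the site status is Certified
    endpoints: List[Endpoint] = field(default_factory=list)


def endpoints_to_monitor(sites):
    """Return (site name, endpoint) pairs meeting both pre-requisites."""
    return [
        (site.name, ep)
        for site in sites
        if site.certified          # site status must be Certified
        for ep in site.endpoints
        if ep.monitored            # service status must be Monitored
    ]


# Made-up sites, for illustration only.
sites = [
    Site("SITE-A", True, [
        Endpoint("ce.site-a.example.org", "CREAM-CE", True),
        Endpoint("se.site-a.example.org", "SE", False),
    ]),
    Site("SITE-B", False, [
        Endpoint("ce.site-b.example.org", "CREAM-CE", True),
    ]),
]

selected = endpoints_to_monitor(sites)
for site_name, ep in selected:
    print(site_name, ep.hostname, ep.service_type)
```

In this sketch only the CE of SITE-A is selected: the SE is not flagged as Monitored, and SITE-B is not Certified.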

Nagios: how the thing works

Nagios and the NGIs
Nagios is the official tool used within the EGI infrastructure for distributed monitoring:
- Submission, transport, storage and visualization of probes
- Relies on existing technologies (Nagios, ActiveMQ, Django)
- Deployed at each NGI
Results are used to measure the availability and reliability of EGI sites.
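As a rough illustration of how test results could be turned into availability and reliability figures, here is a minimal sketch. The formulas are an assumption based on one commonly used convention (time with unknown status is left out of both, scheduled downtime is also left out of reliability). They are not taken from these slides, and the function names are hypothetical.

```python
def availability(up_hours, total_hours, unknown_hours=0.0):
    """Fraction of the known time during which the service was up.

    Assumed convention: time with unknown status is not counted.
    """
    known = total_hours - unknown_hours
    return up_hours / known if known > 0 else None


def reliability(up_hours, total_hours, scheduled_downtime_hours,
                unknown_hours=0.0):
    """Like availability, but scheduled downtime is not held against the site."""
    expected = total_hours - scheduled_downtime_hours - unknown_hours
    return up_hours / expected if expected > 0 else None


# Made-up numbers: a 720-hour month with 72 hours of declared downtime.
month_availability = availability(648, 720)
month_reliability = reliability(648, 720, scheduled_downtime_hours=72)
print(month_availability, month_reliability)
```

Under this assumption, declaring downtimes in GOCDB matters: an announced intervention lowers availability but not reliability.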

Nagios tests
- Tests are scheduled automatically
- If you have rights, you can trigger execution of checks (HANDLE WITH CARE: do not Force checks)
- A hierarchy exists
  - May be confusing/worrying sometimes
  - Make sure you understand/guess whether there may be an upstream cause for the problem you noticed
https://tomtools.cern.ch/confluence/display/sam/probes+org.sam
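The advice about upstream causes can be sketched as a walk up a dependency chain. The check names and the order of the hierarchy below are hypothetical, loosely inspired by the kinds of tests mentioned earlier (ping, LDAP queries, Grid jobs). The real probe hierarchy is documented at the link above and is not reproduced here.

```python
# Hypothetical hierarchy: each check maps to the check it depends on
# (None marks the top of the chain).
PARENT = {
    "host-ping": None,
    "ldap-query": "host-ping",
    "job-submit": "ldap-query",
}


def upstream_cause(check, failing, parent=PARENT):
    """Return the highest failing ancestor of a check, or the check itself."""
    cause = check
    current = parent.get(check)
    while current is not None:
        if current in failing:
            cause = current        # a failure higher up is the likelier cause
        current = parent.get(current)
    return cause


# A failing job submission while the host does not even answer ping.
failing_now = {"job-submit", "host-ping"}
print(upstream_cause("job-submit", failing_now))
```

With these made-up inputs the sketch points at the ping check rather than at job submission, which is the kind of reasoning a shifter would apply before raising an alarm about the downstream test.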

Nagios for EUMEDGrid https://nagios.africa-grid.org/nagios/

Nagios: hosts view

Nagios: host status

Nagios: host status detail

Nagios: host groups

Nagios: MyEGI https://nagios.africa-grid.org/myegi/

MyEGI: Gridmap view

3. [ Smokeping ]

Smokeping: what it is
A tool for monitoring network latency.
- Ping timing: every 5 minutes, sends 20 packets and measures packet loss and round-trip time
- Extended to measure the time to reply to a command: every 5 minutes, executes ldapsearch and measures the time to reply and the number of times the command goes unanswered
- Keeps a history of tests
- Implements a star configuration
  - From the Smokeping server to the world
  - Cannot be aware of problems between sites
- Configuration is manual: contact grid-tech@garr.it
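To make the measurement concrete, the following minimal sketch summarises one round of 20 probes into packet loss and a typical round-trip time. It does not use Smokeping itself. The function name, the input format (a list of round-trip times for the replies that came back) and the choice of the median as the summary value are assumptions made for illustration.

```python
from statistics import median


def summarise_round(rtts_ms, sent=20):
    """Summarise one measurement round.

    rtts_ms holds the round-trip times (in ms) of the replies received;
    lost packets are simply absent from the list.
    """
    received = len(rtts_ms)
    if received > sent:
        raise ValueError("more replies than packets sent")
    loss_pct = 100.0 * (sent - received) / sent
    median_rtt = median(rtts_ms) if rtts_ms else None
    return {
        "sent": sent,
        "received": received,
        "loss_pct": loss_pct,
        "median_rtt_ms": median_rtt,
    }


# Made-up sample: 18 replies out of 20 packets, with a few slow ones.
sample = [12.1, 11.8, 12.4, 30.2] * 4 + [12.0, 12.2]
round_result = summarise_round(sample, sent=20)
print(round_result)
```

The same summary applies to the ldapsearch variant: replace round-trip times with the time the command took to reply, and lost packets with unanswered commands.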

Smokeping: network latency https://dpm2-4.dir.garr.it/smokeping/smokeping.cgi

Smokeping: network site view

Smokeping: host view

Smokeping: central services

Smokeping: site services

Conclusions
- Quite a lot of information is available
  - Maybe too much... information is useful only if you check it :-)
- We need a to-do list for ROC shifters
  - It can be borrowed from existing ones
- Want to host any of these tools at your site or in your country? Just ask and we'll be glad to help!