http://www.grnet.gr GRNET NOC network monitoring & TF-NOC Zurich Alex Kosiaris (alex@noc.grnet.gr) Leonidas Poulopoulos (leopoul@noc.grnet.gr)
Network monitoring
- Constant monitoring of the network for components that are:
  - Failing
  - Malfunctioning
- Notification of users via mail, SMS, web interface
- Monitoring aids in:
  - Preventing or limiting downtime
  - Effectively tracing errors and coping with them
  - Translating machine errors into user-friendly ones
  - Keeping an archive of performance, errors and faults
Visualization
- Network topology
  - Clearer view of network topology
  - Easier reading of links and relations
- Services
  - Better anticipation of service deployment
  - Effective and appealing marketing
- Any network-related set of data
  - Traffic, errors, service requests
  - Charts, maps, graphs, tables
Tools
- Tool requirements
  - Developed in-house (if possible), using the same development framework (if possible), with a widely deployed and accepted network management middleware
  - Maintained in-house
  - Bound together using a common data source
  - With the least possible overhead on the network devices and services
- What we do
  - 70% of the tools are developed in-house
  - Python/Django: 70%, PHP: 30%
  - Use SNMP (99%) to harvest network data
  - Release updates every 2-3 months
  - Use a MySQL database to bind data
  - Avoid live SNMP queries to devices
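The "avoid live SNMP queries" idea can be sketched as follows: a harvesting run writes samples into the shared database, and every tool answers its questions from those stored rows instead of polling the device again. This is a minimal sketch, not the GRNET implementation. The table name `if_sample`, its columns, and the function names are hypothetical, and `sqlite3` is used only as a self-contained stand-in for the MySQL database mentioned on the slide.

```python
import sqlite3


def open_store():
    """Create an in-memory stand-in for the shared database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE if_sample ("
        " device TEXT, ifname TEXT, polled_at INTEGER, in_octets INTEGER,"
        " PRIMARY KEY (device, ifname, polled_at))"
    )
    return conn


def store_samples(conn, samples):
    """Write one harvesting run; each sample is (device, ifname, polled_at, in_octets)."""
    conn.executemany("INSERT INTO if_sample VALUES (?, ?, ?, ?)", samples)
    conn.commit()


def latest_sample(conn, device, ifname):
    """Answer from stored data only; the device itself is never contacted here."""
    return conn.execute(
        "SELECT polled_at, in_octets FROM if_sample"
        " WHERE device = ? AND ifname = ?"
        " ORDER BY polled_at DESC LIMIT 1",
        (device, ifname),
    ).fetchone()  # None when the interface has never been harvested


conn = open_store()
# Illustrative values only, not measurements from any real device.
store_samples(conn, [
    ("router1", "ge-0/0/0", 1000, 500),
    ("router1", "ge-0/0/0", 1300, 900),
])
print(latest_sample(conn, "router1", "ge-0/0/0"))
```

The trade-off in this design is freshness: a tool sees data that is at most one harvesting interval old, in exchange for the devices being polled once per interval regardless of how many tools consume the data.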
Tools (2)
- NMS & monitoring
  - Alcatel & Adva NMS
  - HP OV
  - Nagios, Munin, mrtg/rrd
- In-house
  - grnetdb (~150 tables), MySQL
  - Topology and device feature discovery: SNMP, PHP, custom RDBMS layer (4 times/day)
  - Visualization: Google Maps API, Django framework, Dojo JS framework, JSON data serialization
  - Graphs: Django framework, jQuery, rrd
  - Monitoring: Nagios with data feed from grnetdb
Architecture
[Diagram. Labels, in extraction order: Django framework; GRNET PHP SNMP getters; Network discovery core functionality; GRNET RDBMS; Device Graphs; Network Topology; Rancid; Nagios; Ticketing (Jira); CLI check scripts; grnetdb (MySQL); Hostmaster; H/W Inventory; L1 topology builder; Widgets]
Device Graphs
- http://mon.grnet.gr/rg
- Django (Python): templates and backend
- jQuery & jQuery UI
- Mobile flavour (jQuery Mobile)
- Network device configuration retrieved from grnetdb
- Poll devices using a smart algorithm (minimize overhead)
- RRD graphs (rrdtool Python)
- Minor administrative interference
  - Devices and interfaces discovered automatically by a PHP SNMP script
  - Device graph types determined automatically
- Personalization
- Custom search engine
- Version releases every 2-3 months
- Not open sourced yet
  - An abstraction layer has to be implemented
  - Parts of the code have to be rewritten to get rid of GRNET-specific parts
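Traffic graphs of this kind are typically derived from SNMP interface counters, which only ever increase and wrap around at a fixed width. The slide does not describe the polling algorithm itself, so the following is only a sketch of the rate calculation between two polls, assuming a 32-bit octet counter by default and at most one wrap per polling interval. The function name and parameters are hypothetical. rrdtool's `COUNTER` data source type performs a comparable calculation internally, so a real setup may never need this code.

```python
def counter_rate_bps(prev_value, curr_value, interval_seconds, counter_bits=32):
    """Return the average bit rate between two polls of an octet counter.

    Assumes the counter wrapped at most once during the interval. A device
    reboot (counter reset to zero) is indistinguishable from a wrap here.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    modulus = 2 ** counter_bits
    # Modular subtraction yields the true increase even across one wrap.
    delta_octets = (curr_value - prev_value) % modulus
    return delta_octets * 8 / interval_seconds


# Normal case: 3000 octets in 300 seconds.
print(counter_rate_bps(1000, 4000, 300))
# Wrapped case: the counter passed 2**32 between the two polls.
print(counter_rate_bps(2**32 - 100, 200, 300))
```

On fast links a 32-bit octet counter can wrap more than once between polls, which is the usual reason to pass `counter_bits=64` and read the 64-bit counters where the device offers them.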
Device Graphs (2)
- Mobile flavour will be released soon
Network Topology
- http://mon.grnet.gr/network/maps/
- Google Maps API v2 (soon to be ported to v3)
- Django framework
- Dojo JavaScript framework
- Data serialized to JSON and fed to the API
  - Topology (L1, L2, L3)
  - Network weathermap (live data from rrd files)
  - Points of presence
  - GRNET clients
- Lightweight edition eases integration with other apps
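The "data serialized to JSON and fed to the API" step might look roughly like the sketch below. The JSON layout, the function names, and the colour thresholds are assumptions made for illustration; the slide does not specify the actual payload format or how the weathermap maps load to colour. The PoP names, coordinates, and traffic figures in the example are made up.

```python
import json


def link_colour(utilisation):
    """Map a 0..1 link utilisation to a weathermap colour (hypothetical thresholds)."""
    if utilisation < 0.5:
        return "green"
    if utilisation < 0.8:
        return "orange"
    return "red"


def topology_to_json(pops, links):
    """Serialize points of presence and links for a map front end.

    pops:  iterable of (name, latitude, longitude)
    links: iterable of (from_pop, to_pop, current_bps, capacity_bps)
    """
    payload = {
        "pops": [{"name": name, "lat": lat, "lng": lng} for name, lat, lng in pops],
        "links": [
            {"from": a, "to": b, "colour": link_colour(bps / capacity)}
            for a, b, bps, capacity in links
        ],
    }
    return json.dumps(payload, sort_keys=True)


# Placeholder data, not real GRNET topology.
example = topology_to_json(
    pops=[("pop-a", 10.0, 20.0), ("pop-b", 11.0, 21.0)],
    links=[("pop-a", "pop-b", 900, 1000)],
)
print(example)
```

Keeping the payload a plain list of nodes and links means the same JSON could feed a different map front end later, which matters given the planned move from Google Maps API v2 to v3.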
Network Topology (2)
Alarming
- Nagios based
- Common infrastructure
  - Network: Django/Python tool queries grnetdb and generates the configuration
  - Servers/services: populated through the automation tool Puppet
- An effort to maximize SNR (signal-to-noise ratio)
  - Notifications go to interested parties only
  - Web interface supports authorization, so only relevant information is available
- Plans for a load-balancing/HA setup
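Generating Nagios configuration from database rows can be sketched as below. This is not the GRNET tool: the row layout and function names are hypothetical, the rows are hard-coded rather than read from grnetdb, and `generic-host` is assumed to be a host template defined elsewhere in the Nagios configuration. Only standard host object directives (`use`, `host_name`, `address`, `contact_groups`) are emitted.

```python
def render_host(host_name, address, contact_groups, template="generic-host"):
    """Render one Nagios host object definition as text."""
    lines = [
        "define host {",
        f"    use             {template}",
        f"    host_name       {host_name}",
        f"    address         {address}",
        # Notifications reach only the groups listed for this host.
        f"    contact_groups  {','.join(contact_groups)}",
        "}",
    ]
    return "\n".join(lines)


def render_config(rows):
    """Render all hosts; `rows` are dicts as they might come from a database query."""
    return "\n\n".join(render_host(**row) for row in rows) + "\n"


# Placeholder rows; 192.0.2.0/24 is an address range reserved for documentation.
rows = [
    {"host_name": "router1", "address": "192.0.2.1", "contact_groups": ["noc"]},
    {"host_name": "switch1", "address": "192.0.2.2", "contact_groups": ["noc", "campus"]},
]
print(render_config(rows))
```

Assigning `contact_groups` per host from the database is one way to get the "notifications go to interested parties only" behaviour described on the slide.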
Alarming (2)
- Visualization
  - Standard interface is not topology aware
  - Hates circular (aka redundant) links
  - Nagios map CGI is ugly
  - Information decimation is difficult
- NagVis to the rescue
  - Supports multiple maps
  - Maps are web editable
  - Visual and audible alarming
  - Still a work in progress
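The trouble with circular links comes from Nagios host `parents` relationships, which must not form a loop, while a redundant network is full of loops. One possible workaround, shown here only as a sketch and not as what GRNET does, is to derive a loop-free parent assignment with a breadth-first search starting from the monitoring host. The function name and the ring topology in the example are hypothetical.

```python
from collections import deque


def parents_from_links(links, root):
    """Pick one parent per device via breadth-first search from `root`.

    links: iterable of (device_a, device_b) undirected adjacencies
    Returns {device: parent}, with the root mapped to None. Redundant links
    that would close a loop are simply not used as parent relations.
    """
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    parents = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorted iteration keeps the result deterministic between runs.
        for peer in sorted(neighbours.get(node, ())):
            if peer not in parents:
                parents[peer] = node
                queue.append(peer)
    return parents


# A three-node ring: the link between b and c is redundant from a's viewpoint.
ring = [("a", "b"), ("b", "c"), ("c", "a")]
print(parents_from_links(ring, root="a"))
```

The cost of this approach is that the resulting tree reflects only the shortest path from the monitoring host, so reachability logic based on it ignores the backup paths that redundancy provides.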
Alarming (3)
Thank you
Questions?