Monitoring an HP platform based solution and SNMPv2/v3 alarm forwarding and synchronization with an existing NMS




Roberto Pulvirenti (roberto.pulvirenti@gfmnet.it)
Powered by Antonio Russo
March 15th, 2013

Background

A vendor/system integrator provided a Customer Experience Management (CEM) solution to one of the biggest mobile operators. The solution loads, aggregates and transforms the data from the OpCos into a central repository where the OpCos can schedule ad-hoc reports.

The solution included an Element Management System offering fault and performance management functions based on OpenNMS, which was configured and customized in order to:
- monitor in adequate depth Oracle processes, HP Brocade SAN switches, Cisco switches, HP Blade servers, HP EVA 4400 storage, OS resources (disks, memory, ...) and other relevant processes and services;
- be installed and deployed in high availability, exploiting the clustering functionality offered by the operating system (RHEL Cluster);
- act as the single point of integration for the northbound Umbrella Management System (Netcool);
- forward to Netcool only the alarms that are evaluated as relevant for VF according to configurable filter criteria;
- forward traps via SNMPv3 or SNMPv2, according to a MIB wrapping the alarm format managed by OpenNMS;
- feature a trap-based heartbeat that periodically notifies Netcool that OpenNMS is alive;
- implement a trap-based synchronization that allows Netcool to synchronize its alarms on demand.

www.gfmintegration.com

Hardware Architecture

[Diagram labels: SAI CORE Reporting Server, BOE Server, ETL Server, BODS Server (shown twice); OpenNMS/FTM (shown twice); LAN switches; SNMP; Oracle Real Application Cluster; SAN switches; SAN Disk]

- 6 x Blade: 507864-B21 HP BL460c G6
- 2 x Cisco Catalyst 3020 Blade Switch for HP c-class BladeSystem
- 1 x StorageWorks: HP EVA4400
- 2 x HP Brocade 8/24c SAN Switch for HP c-class BladeSystem
- 1 x HP OnBoard Admin Console

High level logical architecture

[Diagram labels, in source order: IBM Netcool/ITNM; Global NOC in Germany; Traps; Local NOC in Spain; OpenNMS; Traps/polling]

Monitored sources and example events:
- SAI Application incl. Oracle: e.g. ETL jobs, reporting, B&R
- File Transfer Application: e.g. retry limit exceeded, user unknown
- VGE Application: e.g. filter failed
- Infrastructure Components: e.g. HP Server, storage, Cisco, tapes

Deployment

- 2 Blade servers in active-standby availability exploiting RHEL 5.3 cluster
- OCFS2 as general-purpose shared-disk cluster file system
- Clustered resources:
  o opennms-rg (including PostgreSQL and OpenNMS) with its own VIP
  o ftm-rg (a file transfer/parsing application required by the project) with its own VIP
- OpenNMS Version 1.8.10

    [root@nms opennms]# ll $OPENNMS_HOME
    total 36
    drwxr-xr-x 2 root root  4096 May 17 10:23 bin
    drwxr-xr-x 8 root root  4096 May 23 13:44 contrib
    lrwxrwxrwx 1 root root    16 Mar 24 16:28 etc -> /app/opennms/etc
    drwxrwxr-x 7 root root  4096 May 18 15:05 etc.orig
    drwxr-xr-x 5 root root  4096 Mar 22 19:45 jetty-webapps
    drwxr-xr-x 8 root root 20480 May 17 00:50 lib
    lrwxrwxrwx 1 root root    22 Mar 24 16:30 logs -> /app/opennms/data/logs
    lrwxrwxrwx 1 root root    23 Mar 24 16:30 share -> /app/opennms/data/share

    Filesystem          Size    Mounted on           Notes
    /dev/mapper/fc5p1   12 GB   /app/opennms/etc     OpenNMS configuration files
    /dev/mapper/fc5p2   51 GB   /app/opennms/data    OpenNMS logs and rrd files
    /dev/mapper/fc5p3   60 GB   /app/opennms/pgsql   PostgreSQL database

Delivered packages and main installation steps

NSN provides the following packages for installing OpenNMS for the CEM solution:
- Opennms_1.8.10.tar.gz: contains the packages for installing OpenNMS (release 1.8.10) without any customized upgrades.
- FWSYNC_1.8.10_2.0.5.tar.gz: contains the jar files that upgrade OpenNMS to cover the customer's requirements.
- config_template_1.0.0.tar.gz: provides some configuration files proposed as templates for the customized solution.
- VFGUI.tar.gz: provides a few files that update the OpenNMS web GUI with a logo and colours that better recall the customer style.

Main installation steps:
1. Set up a yum repository from the RHEL ISO CD-ROM image on the OpenNMS nodes, to ease the installation of OS packages.
2. Install the net-snmp package on all CEM servers.
3. Set up a local yum repository for OpenNMS on the OpenNMS nodes.
4. Install the HP SNMP agents on all CEM servers.
5. Install the SUN JDK package on the OpenNMS nodes.
6. Install Postgresql-9.0 on the OpenNMS nodes.
7. Install OpenNMS on the OpenNMS nodes.
8. Install the NSN customization for forwarding and synchronizing alarms with Netcool on the OpenNMS nodes.
9. Deploy the configuration files proposed as templates, in order to speed up the required configuration and the subsequent provisioning process. They go on the shared storage, so this step needs to be executed from one node only.
10. Add the Vodafone logo to the GUI on the OpenNMS nodes.
11. Cluster installation and configuration.

OpenNMS features exploited for monitoring CEM solution

Eventd: all HW and application MIBs were analyzed, and a detailed Excel file was written describing the relevant traps to be alarmed, deduplicated or cleared. Event XML files were added or heavily changed to reflect the Excel file:
- CPQHPIM.events.xml (CPQHLTH-MIB, CPQRACK-MIB, CPQRPM-MIB, CPQHOST.MIB, CPQSTSYS.MIB, CPQSINFO.MIB, CPQSTDEQ.MIB, CPQCMC.MIB, CPQSM2.MIB, CPQNIC.MIB, CPQIDA.MIB, CPQFCA.MIB, CPQIODRV.MIB, EVA4400_ABM.MIB)
- Brocade.fcmgmt.events.xml
- Cisco.events.xml / Cisco2.events.xml
- FTM.events.xml (File Transfer Manager application)
- SAI.events.xml (Serve atOnce Intelligence application)

Other daemons and their configuration files:
- Capsd (capsd-configuration.xml)
- Pollerd (poller-configuration.xml)
- Collectd (collectd-configuration.xml, snmp-config.xml, datacollection-config.xml)
- Threshd (threshd-configuration.xml, thresholds.xml, programmatic.events.xml)
- Event Translator (translator-configuration.xml, Service.translator.events.xml)
- Provisiond

What is monitored exactly?

Columns of the original table: Group name in OpenNMS, NodeLabel, Description.

OpenNMSFTM
  specemusnm01p
    - OS services (NTPd, SSH, rgmanager, cman, HP SNMP agents...)
    - HW/OS system traps of this server
    - Threshold event for RAM, CPU, disks, eth port utilization
  specemusnm02p
    - OS services (NTPd, SSH, rgmanager, cman, HP SNMP agents)
    - HW/OS system traps of this server
    - Threshold event for RAM, CPU, disks, eth port utilization

VIPs
  specemusftp00p
    - Monitoring FTM services availability (services associated to ftm-rg like LDAP, GUI service). No traps
  specemusmgmtnm00p
    - Monitoring the availability of the OpenNMS VIP. No traps

SAI
  specemusbo01p
    - OS services (NTPd, SSH, HP SNMP agents...)
    - HW/OS system traps of this server
    - Threshold event for RAM, CPU, disks, eth port utilization
  specemusbo02p
    - OS services (NTPd, SSH, HP SNMP agents...)
    - HW/OS system traps of this server
    - Threshold event for RAM, CPU, disks, eth port utilization

VIPs
  specemusrep00p
    - Monitoring some CEM services (SAI admin) in high availability
  specemusadm00p
    - Monitoring other CEM services (SAI reporting) in high availability

Oracle
  specemusdb01p
    - OS services (NTP, SFTP, ...), Oracle services
    - HW/OS system traps of this server
    - Threshold event for RAM, CPU, disks, eth port utilization
  specemusdb02p
    - OS services (NTP, SFTP, ...), Oracle services
    - HW/OS system traps of this server
    - Threshold event for RAM, CPU, disks, eth port utilization

FcSwitches
  specemusfcsw01p, specemusfcsw02p
    - Monitoring Brocade SAN switches (SNMP, ICMP)
    - System traps of this device
    - Threshold event for RAM, CPU, disks, eth port utilization

CiscoSwitches
  specemussw302001p, specemussw302002p
    - Monitoring Cisco switches (SNMP, ICMP)
    - Threshold event for RAM, CPU, disks, eth port utilization

EvaStorage
  specemussanadm01p
    - Monitoring Eva Storage (SNMP, ICMP)
    - Threshold event for RAM, CPU, disks, eth port utilization

Console
  specemusobadm01p
    - Monitoring Admin console (SNMP, ICMP)

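SNMP v2/v3 alarm forwarding & synchronization: Netcool integration

Traps exchanged between OpenNMS and Netcool:
- OpenNMS -> Netcool: alarmtrap (normal forwarding)
- OpenNMS -> Netcool: heartbeat trap
- Netcool -> OpenNMS: syncrequesttrap
- OpenNMS -> Netcool: startsynctrap
- OpenNMS -> Netcool: alarmtrap (synchronization)
- OpenNMS -> Netcool: endsynctrap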
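The synchronization part of this exchange can be sketched in plain Java. This is only an illustration of the message order, assuming that a sync request is answered with one start trap, one alarm trap per active alarm, and one end trap. The class `SyncSessionSketch`, its method and the string labels are hypothetical and are not part of OpenNMS or Netcool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the trap order in one synchronization session.
// It builds labels only; it does not send SNMP traps.
final class SyncSessionSketch {

    private SyncSessionSketch() { }

    /**
     * Returns, in order, the traps OpenNMS would emit after receiving a
     * syncrequesttrap: startsynctrap, one alarmtrap per active alarm,
     * then endsynctrap.
     */
    static List<String> trapsForSyncRequest(List<String> activeAlarmIds) {
        List<String> out = new ArrayList<String>();
        out.add("startsynctrap");
        for (String alarmId : activeAlarmIds) {
            // The "(sync)" marker stands for the synchronization varbind.
            out.add("alarmtrap(sync) " + alarmId);
        }
        out.add("endsynctrap");
        return out;
    }
}
```

With no active alarms the session reduces to the start and end traps, which still tells the receiver that its alarm list should be empty.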

Issues analyzed and addressed in OpenNMS 1.8.10 (as per customer reqs)

- Issue: alarms must be forwarded and synchronized according to filterable event/alarm criteria.
  Solution: additional opennms.scriptd helper classes were developed (opennms-services-1.8.10.jar).
- Issue: forwarded traps should support SNMP v2 and v3, but until now traps were forwarded as SNMP v1.
  Solution: extension jars were developed to support SNMP v2c informs and SNMP v3 traps (org.opennms.lib.snmp.api-2.0.5.jar, org.opennms.lib.snmp.joesnmp-2.0.5.jar, org.opennms.lib.snmp.snmp4j-2.0.5.jar).
- Issue: Netcool needs evidence of the deduplication and of the alarms raised/cleared automatically in OpenNMS, but OpenNMS just forwards the events without reduction-key.
  Solution: new traps to be defined.
- Issue: implement the logic for the integration with Netcool according to the requirements, including a heartbeat trap from the active OpenNMS instance to Netcool.
  Solution: new OpenNMS MIB and events definition (opennms.mib, opennmsmib.events.xml), a configured Bean Shell script (scriptd-configuration.xml), and an OpenNMS status check via crontab (opennms_status.sh).

OpenNMS scriptd helper classes used in scriptd-configuration.xml

- EventMatch. Interface that specifies criteria to match events.
- EventPolicyRule. Its implementation classes decide whether an event should be forwarded or not, through the following three methods:
  o addDropRule(EventMatch eventMatch)
  o addForwardRule(EventMatch eventMatch)
  o filter(org.opennms.netmgt.xml.event.Event event)
- EventSynchronization. Its implementation class performs synchronization by sending all the active alarms defined in OpenNMS.
- SnmpTrapHelper. This "helper" class provides a convenience interface for generating and forwarding SNMP traps.
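The drop/forward rule idea can be illustrated with a small stand-alone Java sketch. It is not the OpenNMS implementation: the class `FilterPolicySketch` and its methods are hypothetical, and it assumes that rules are evaluated in insertion order, that the first matching rule decides, that an event matching no rule is dropped, and that an "alarm" rule only matches events carrying alarm data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the drop/forward policy; not the OpenNMS classes.
final class FilterPolicySketch {

    private static final class Rule {
        final Pattern ueiPattern;
        final boolean requiresAlarmData;
        final boolean forward;

        Rule(String ueiRegex, boolean requiresAlarmData, boolean forward) {
            this.ueiPattern = Pattern.compile(ueiRegex);
            this.requiresAlarmData = requiresAlarmData;
            this.forward = forward;
        }

        boolean matches(String uei, boolean hasAlarmData) {
            return ueiPattern.matcher(uei).matches()
                    && (!requiresAlarmData || hasAlarmData);
        }
    }

    private final List<Rule> rules = new ArrayList<Rule>();

    // Drop every event whose UEI matches the regex.
    FilterPolicySketch addDropRule(String ueiRegex) {
        rules.add(new Rule(ueiRegex, false, false));
        return this;
    }

    // Forward events whose UEI matches and that carry alarm data.
    FilterPolicySketch addForwardAlarmRule(String ueiRegex) {
        rules.add(new Rule(ueiRegex, true, true));
        return this;
    }

    // Assumption: first matching rule wins; no match means drop.
    boolean shouldForward(String uei, boolean hasAlarmData) {
        for (Rule rule : rules) {
            if (rule.matches(uei, hasAlarmData)) {
                return rule.forward;
            }
        }
        return false;
    }
}
```

Under these assumptions, a policy built as "drop internal events, forward alarms, drop everything else" forwards only non-internal events that carry alarm data.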

OpenNMS MIB

The OpenNMS MIB version 1.3 was only able to send OpenNMS events as SNMP v1 traps. It has now been upgraded (and productized) to fully support SNMP v2c and the following traps, which support SNMP-based alarm synchronization:
- alarmtrap (OID .1.3.6.1.4.1.5813.1, generic 6, specific 3): the definition of the generic OpenNMS trap with the addition of alarm information. Two new varbinds have been added: alarmid, an alarm identifier used for alarm reduction and correlation, and synchronization, which specifies whether the trap comes from a sync request.
- heartbeattrap (OID .1.3.6.1.4.1.5813.1, generic 6, specific 4): trap sent periodically by OpenNMS as a keep-alive for the external SNMP manager.
- startsynctrap (OID .1.3.6.1.4.1.5813.1, generic 6, specific 5): trap sent by the OpenNMS station when the synchronization process starts.
- endsynctrap (OID .1.3.6.1.4.1.5813.1, generic 6, specific 6): trap sent by the OpenNMS station when the synchronization process has ended successfully.
- syncrequesttrap (OID .1.3.6.1.4.1.5813.1, generic 6, specific 7): trap sent to OpenNMS to start a synchronization. It was also added to opennmsmib.events.xml as <uei>uei.opennms.org/traps/syncrequesttrap</uei>.
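The trap definitions above can be kept in a small lookup table. The following Java enum is a hypothetical sketch, not generated from opennms.mib. It assumes the v2 trap OID is the enterprise OID followed by the specific number, which is the form the deck's scriptd script uses for specifics 3, 5 and 6; applying the same pattern to specifics 4 and 7 is an extrapolation.

```java
// Hypothetical lookup table for the traps listed on the slide.
enum OpenNmsTrapSketch {
    ALARM("alarmtrap", 3),
    HEARTBEAT("heartbeattrap", 4),
    START_SYNC("startsynctrap", 5),
    END_SYNC("endsynctrap", 6),
    SYNC_REQUEST("syncrequesttrap", 7);

    static final String ENTERPRISE_OID = ".1.3.6.1.4.1.5813.1";

    final String trapName;
    final int specific;

    OpenNmsTrapSketch(String trapName, int specific) {
        this.trapName = trapName;
        this.specific = specific;
    }

    // Assumed OID form: enterprise OID plus the specific number.
    String trapOid() {
        return ENTERPRISE_OID + "." + specific;
    }

    // Resolves a specific number back to its trap definition.
    static OpenNmsTrapSketch fromSpecific(int specific) {
        for (OpenNmsTrapSketch trap : values()) {
            if (trap.specific == specific) {
                return trap;
            }
        }
        throw new IllegalArgumentException("Unknown specific: " + specific);
    }
}
```

A receiver could use `fromSpecific` to tell a heartbeat from a synchronization marker before looking at any varbind.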

From Eventd to Scriptd

scriptd-configuration.xml (1/3)

    <?xml version="1.0"?>
    <scriptd-configuration>
      <engine language="beanshell" className="bsh.util.BeanShellBSFEngine" extensions="bsh"/>
      <start-script language="beanshell">
        import org.opennms.netmgt.scriptd.helper.UeiEventMatch;
        import org.opennms.netmgt.scriptd.helper.UeiAlarmMatch;
        import org.opennms.netmgt.scriptd.helper.EventPolicyRuleDefaultImpl;
        import org.opennms.netmgt.scriptd.helper.AlarmEventSynchronization;
        import org.opennms.netmgt.scriptd.helper.dbhelper;
        import org.opennms.netmgt.scriptd.helper.SnmpTrapHelper;
        import org.opennms.netmgt.snmp.SnmpTrapBuilder;
        import org.opennms.netmgt.xml.event.Event;

        log = bsf.lookupBean("log");
        snmptraphelper = new SnmpTrapHelper();

        internaleventmatch = new UeiEventMatch("~^uei.opennms.org/internal/.*$");
        alleventmatch = new UeiEventMatch("~^uei.opennms.org/.*$");
        allalarmmatch = new UeiAlarmMatch("~^uei.opennms.org/.*$");

        policy = new EventPolicyRuleDefaultImpl();
        policy.addDropRule(internaleventmatch);
        policy.addForwardRule(allalarmmatch);
        policy.addDropRule(alleventmatch);

        sync = new AlarmEventSynchronization();

scriptd-configuration.xml (2/3)

Note: we are still forwarding an event, but only events with alarmData. The reductionKey or clearKey is simply read from the alarmData attribute of the Event class.

        void forward(Event event, boolean sync) {
          try {
            long traptimestamp = 0;
            SnmpTrapBuilder trap = snmptraphelper.createV2Trap(".1.3.6.1.4.1.5813.1.3", Long.toString(traptimestamp));
            if (event.alarmData != null) {
              if (event.alarmData.alarmType == 2) {
                severity = "Cleared";
                alarmid = event.alarmData.clearKey;
              } else {
                severity = null;
                alarmid = event.alarmData.reductionKey;
              }
            }
            t_dbid = new Integer(event.dbid).toString();
            if (t_dbid != null)
              snmptraphelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.20.1.1.0", "OctetString", "text", t_dbid);
            else
              snmptraphelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.20.1.1.0", "OctetString", "text", "null");
            if (event.distPoller != null)
              snmptraphelper.addVarBinding(trap, ".1.3.6.1.4.1.5813.20.1.2.0", "OctetString", "text", event.distPoller);
            else
            <!--add other varbind of the trap-->
            <!--.-->
            trap.send("xx.xxx.xxx.xxx", 162, "public");
          } catch (e) { }
        }
      </start-script>

scriptd-configuration.xml (3/3)

      <stop-script language="beanshell">
        snmptraphelper.stop(); <!--executing a stop script-->
      </stop-script>
      <event-script language="beanshell">
        event = bsf.lookupBean("event");
        event = policy.filter(event);
        if (event == null) {
          log.debug("event is filtered: not forwarding");
        } else {
          forward(event, false); <!--forwarding event-->
        }
      </event-script>
      <event-script language="beanshell">
        <uei name="uei.opennms.org/traps/syncrequesttrap" />
        long traptimestamp = 0;
        <!--sending start sync trap-->
        SnmpTrapBuilder trap1 = snmptraphelper.createV2Trap(".1.3.6.1.4.1.5813.1.5", Long.toString(traptimestamp));
        trap1.send("xx.xxx.xxx.xxx", 162, "public");
        for (e : sync.events) { <!--for each synchronized event (current active alarms)-->
          e = policy.filter(e);
          if (e == null) {
            log.debug("sync event is filtered: not forwarding");
          } else {
            forward(e, true); <!--forwarding active alarm during synchronization session-->
          }
        }
        <!--sending end sync trap-->
        SnmpTrapBuilder trap2 = snmptraphelper.createV2Trap(".1.3.6.1.4.1.5813.1.6", Long.toString(traptimestamp));
        trap2.send("xx.xxx.xxx.xxx", 162, "public");
      </event-script>
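The key selection done inside forward() can be restated as a small pure function. This is a hypothetical plain-Java sketch (the class `AlarmKeySketch` is not part of OpenNMS). It follows the script's test on alarm type 2, which selects the clear key. In OpenNMS the clear key of a resolving event names the reduction key of the alarm it clears, so the receiver presumably sees the same alarm id for the raise and for the clear.

```java
// Hypothetical restatement of the alarm id selection in forward().
final class AlarmKeySketch {

    // Alarm type value tested by the scriptd script for clearing events.
    static final int ALARM_TYPE_CLEAR = 2;

    final String severity; // "Cleared" for clearing events, otherwise null
    final String alarmId;  // key the receiver can use for correlation

    private AlarmKeySketch(String severity, String alarmId) {
        this.severity = severity;
        this.alarmId = alarmId;
    }

    /**
     * Clearing events are identified by their clear key, all other
     * alarm events by their reduction key.
     */
    static AlarmKeySketch select(int alarmType, String reductionKey, String clearKey) {
        if (alarmType == ALARM_TYPE_CLEAR) {
            return new AlarmKeySketch("Cleared", clearKey);
        }
        return new AlarmKeySketch(null, reductionKey);
    }
}
```

For example, a nodeUp event of type 2 whose clear key is the nodeDown reduction key yields the same alarm id as the original nodeDown event.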

Forwarding SNMP v3 Alarm Traps

    import org.opennms.netmgt.snmp.SnmpV3TrapBuilder;

    void forward(Event event, boolean sync) {
      try {
        SnmpV3TrapBuilder trap = snmptraphelper.createV3Trap(".1.3.6.1.4.1.5813.1.3", Long.toString(traptimestamp));
        snmptraphelper.addVarBinding(trap, ..)
        trap.send("xx.xxx.xxx.x", 162, 2, "traptest", "mypassword", "SHA", "mypassword2", "AES");
        <!-- the arguments are: IP, port, authPriv (SNMPv3 security level), username, authentication passphrase, authentication protocol, privacy passphrase, privacy encryption protocol -->
      }

Final considerations

- Alarms cleared manually on the OpenNMS alarm view page cannot be forwarded automatically, because events, not alarms, are forwarded to scriptd.
- However, synchronization is requested every day, after which the issue is healed on Netcool, where the customer implemented logic to automatically reconcile alarms after synchronization.
- Alarms could be forwarded via the REST API, but the customer didn't want to implement this simple client.
- The customized OpenNMS solution is currently working in production without any known issues (even if by using SNMP v2).
- During the acceptance testing phase we didn't get any relevant fault.