Collax Monitoring with Nagios Howto

This howto describes the configuration of active monitoring on a Collax server. Internally, the system uses Nagios for this purpose. Nagios is used primarily for self-monitoring of the system, but other systems in the network can also be monitored.

Requirements

Collax Business Server
Collax Security Gateway
Collax Platform Server

Objective

In a network, the functionality of certain services on servers and clients is to be monitored. The administrator is to be notified by e-mail when a service fails. Furthermore, the administrator should be able to see at a glance whether all services are up and all servers can be reached.

Task

The administrator wants to monitor the availability of a client in his network.

Solution

By default, monitoring is already activated on the Collax server. If this is not the case, the message "Active network monitoring is not activated" appears on the dashboard, and monitoring can be activated with a click.

The dashboard presents the overall status. Its "Monitoring" section shows at a glance whether the services and hardware work correctly.

The "Monitoring" dialog shows the status of the active monitoring. Internally, the system uses Nagios for this purpose, and the Nagios web GUI is displayed here. Various details and statistics can be accessed in the menu on the left-hand side.

2014 Collax GmbH Status: Final Version: 5.8 Date: 24.11.2014
The "Tactical Overview" is an important feature that shows the status of the monitored hosts and services at a glance.

Another useful feature is the "Status Map", which displays all hosts in a single view and visualizes the interdependencies among the systems.

Monitoring of Other Hosts

Collax servers offer active monitoring tests for recording and analyzing services, processes, and states of other systems. In the scenario described here, a Collax server acts as a monitoring server that actively monitors Windows and Linux systems as well as further Collax systems.

Additional hosts, and the services to be monitored on each host, are configured in the settings of the respective host. This dialog is located under "Services → Infrastructure → DNS → Hosts".

Monitoring tests for the host can be activated in the "Network Tests" tab. Click "Add Network Test" to specify the services whose operability is to be monitored. These services are contacted regularly, and an alert is triggered as soon as a service is found to be no longer operable.

Note: This monitoring only works for hosts with static IP addresses.
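As a rough illustration of what such a network test does, the following Python sketch tries to open a TCP connection to a service and maps the result to a state. It is not how Collax or Nagios implement their tests. The function names, the default timeout, and the state strings are assumptions made for this illustration.

```python
import socket


def check_tcp_service(host: str, port: int, timeout: float = 5.0) -> str:
    """Return "OK" if a TCP connection to host:port can be opened, else "CRITICAL"."""
    try:
        # create_connection resolves the address and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "OK"
    except OSError:
        # Covers refused connections, timeouts, and unreachable or unresolvable hosts.
        return "CRITICAL"


def run_network_tests(host: str, services: dict, timeout: float = 5.0) -> dict:
    """Check every named service (name -> TCP port) on one host.

    Returns a mapping of service name to "OK" or "CRITICAL".
    """
    return {
        name: check_tcp_service(host, port, timeout)
        for name, port in services.items()
    }
```

For example, `run_network_tests("192.0.2.10", {"http": 80, "smtp": 25})` would report one state per service for the hypothetical host 192.0.2.10. A real monitoring system repeats such tests at regular intervals and sends a notification when a state changes.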
Alert times

In this list, select the time range during which the tests specified below are performed and may trigger alerts. This is useful if the system is only switched on at certain times, e.g. during office hours.

Reachable over

If the system is reachable only via another system, e.g. a router, that other host can be selected here. If the other host fails, no tests are performed for this system and no alerts are triggered. The system switches to the "unknown" state. When the other host returns, the tests for this system are resumed. Nagios also uses this information to draw the network map.

If you leave this field blank, the system tries to find the correct router for the host by means of the routing information. However, this only works if the host is reached over exactly one other router. If there are several routers between this system and the host, enter the last known "hop" to the respective host in this field. If the host "X" is reached over the route "A" → "B" → "C", enter "C" in this field. Monitoring can also be activated for "C". In this case, "C" is reachable over "B".

NRPE

NRPE (Nagios Remote Plugin Executor) makes it possible to query monitoring-relevant system information (e.g. CPU, RAM, disk information) that cannot be obtained with the simpler protocol-based tests (e.g. HTTP, DNS).

Monitoring of Windows Platforms

The NSClient++ tool is required for monitoring Windows platforms via NRPE.

Configuration of the Collax Server as Monitoring Server

The administration GUI offers monitoring tests for monitoring Windows platforms via NRPE. In the "Hosts" dialog, select the host to be monitored and add a monitoring test with the monitored service "NRPE/NSClient++ - Custom Test". Then set the desired parameters, save the dialog, and activate the settings.

A list of possible standard tests and their parameters is provided at the end of the howto.
The listed tests serve as templates and can be adapted to your specific needs.

Installing NSClient++

Suitable installation packages for 32-bit and 64-bit Windows systems can be downloaded directly in the dialog. The download appears when the monitored service "NRPE/NSClient++ - Custom Test" is selected for a monitoring test.
To install NSClient++ on the target system, open the downloaded MSI package and follow the installation wizard. As the configuration is performed after the installation, no parameters need to be specified for the configuration file in the wizard.

Configuration of NSClient++ on Windows

Before starting NSClient++, the .ini file of the standard installation must be adapted. To operate NSClient++ securely, it must also know which monitoring servers (parameter: allowed_hosts) are permitted to query which information (modules). For this purpose, Collax provides an NSC.ini for download in the "Monitoring Tests" dialog that is prepared for use with the configured Collax server. This NSC.ini automatically contains the settings for allowed_hosts in conjunction with the NRPE mechanism. To use the template from Collax, copy the downloaded .ini file to the installation directory of NSClient++. Apart from allowed_hosts, the configuration offers preconfigured modules and tests that may be queried by a Collax server.

Starting NSClient++ on Windows

The installation registers NSClient++ as a Windows service. After NSC.ini has been copied, NSClient++ can therefore be started in the service administration. For NSClient++ queries to work over the network, port 5666 must be opened in the firewall of the respective Windows system.

NSClient++ Parameters

For an initial NRPE/NSClient++ test, you can copy one of the following parameter lines unchanged into the parameter line of the monitoring test:

Drives
    -c CheckDriveSize -a ShowAll MinWarnFree=10% MinCritFree=5% Drive=c:\

File sizes in C:\Windows
    -c CheckFileSize -a ShowAll MaxWarn=1024M MaxCrit=4096M File:_WIN=c:\WINDOWS\*.*
File size of pagefile.sys
    -c CheckFileSize -a ShowAll MinWarn=512M MinCrit=1G File=c:\pagefile.sys

Check multiple files for version
    -c CheckFiles -a path=d:\tmp pattern=*.exe "filter=version!= 1.0" "syntax=%filename%: %version%" warn=gt:1 crit==1

Check multiple files for size
    -c CheckFiles -a path=d:\tmp pattern=*.txt "filter=size gt 20" "syntax=%filename%: %size%" MaxWarn=1

Check event log ID
    -c CheckEventLog -a filter=new file=application filter+eventid==18456 filter-generated=1h MaxWarn=5 MaxCrit=10 descriptions unique

CPU load < 80%
    -c CheckCPU -a warn=80 crit=90 time=20m time=10s time=4

CPU load like in Linux
    -c CheckCPU -a warn=100 crit=100 time=1 warn=95 crit=99 time=5 warn=90 crit=95 time=15

Check uptime
    -c CheckUpTime -a MinWarn=1d MinCrit=12h

Test all services to be started automatically except wampmysqld and MpfService
    -c CheckServiceState -a CheckAll exclude=wampmysqld exclude=mpfservice

Check NSClient++ and cmd.exe processes
    -c CheckProcState -a ShowAll cmd.exe=stopped NSClient++.exe=started

Memory utilization < 80%
    -c CheckMEM -a MaxWarn=80% MaxCrit=90% ShowAll type=physical

The following detail pages describe further use cases for the checks and contain precise specifications and descriptions of the individual checks. In addition, using CheckWMI and CheckEventLog requires knowledge of Windows Management Instrumentation (WMI) and Windows event logs.
CheckWMI http://www.nsclient.org/nscp/wiki/checkwmi
CheckFileSize http://www.nsclient.org/nscp/wiki/checkfilesize
CheckDriveSize http://www.nsclient.org/nscp/wiki/checkdrivesize
CheckFiles http://www.nsclient.org/nscp/wiki/checkfiles
CheckEventLog http://www.nsclient.org/nscp/wiki/checkeventlog
CheckCPU http://www.nsclient.org/nscp/wiki/checkcpu
CheckUpTime http://www.nsclient.org/nscp/wiki/checkuptime
CheckServiceState http://www.nsclient.org/nscp/wiki/checkservicestate
CheckProcState http://www.nsclient.org/nscp/wiki/checkprocstate
CheckMem http://www.nsclient.org/nscp/wiki/
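Many of the parameter lines listed earlier pair a warning threshold with a critical threshold, either as upper bounds (e.g. MaxWarn=80% MaxCrit=90%) or as lower bounds (e.g. MinWarnFree=10% MinCritFree=5%). The Python sketch below shows one way such a pair can be mapped to the usual Nagios plugin states OK (0), WARNING (1), and CRITICAL (2). The function names are hypothetical, and whether a value exactly on a threshold already triggers the state is an assumption of this sketch, not documented NSClient++ behavior.

```python
# Nagios plugin return codes for the three states used here.
OK, WARNING, CRITICAL = 0, 1, 2
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def evaluate_max(value: float, warn: float, crit: float) -> int:
    """Upper-bound check: higher values are worse (e.g. memory utilization).

    Assumption: a value equal to a threshold already triggers that state.
    """
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def evaluate_min(value: float, warn: float, crit: float) -> int:
    """Lower-bound check: lower values are worse (e.g. free disk space).

    Assumption: a value equal to a threshold already triggers that state.
    """
    if value <= crit:
        return CRITICAL
    if value <= warn:
        return WARNING
    return OK
```

With these definitions, a memory utilization of 85% against MaxWarn=80% MaxCrit=90% evaluates to WARNING, and 3% free disk space against MinWarnFree=10% MinCritFree=5% evaluates to CRITICAL.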