MALAYSIAN PUBLIC SECTOR OPEN SOURCE SOFTWARE (OSS) PROGRAMME COMPARISON REPORT ON NETWORK MONITORING SYSTEMS (Nagios and Zabbix) JANUARY 2010
Phase II -Network Monitoring System- Copyright The government of Malaysia retains the copyright of this document.
Phase II -Network Monitoring System- Table of Contents SECTION 1 : INTRODUCTION...1 EXECUTIVE SUMMARY...2 INTRODUCTION...2 ZABBIX...2 NAGIOS...3 SECTION 2 : PURPOSE OF EVALUATION...4 PURPOSE OF THE EVALUATION...5 SECTION 3 : DISCUSSION OF RESULTS...6 COMPARISON...7 ANALYSIS...9 SECTION 4 : CONCLUSIONS...10 CONCLUSIONS...11 SECTION 5 : REFERENCES...12 REFERENCES...13
Phase II -Network Monitoring System- SECTION 1 : INTRODUCTION
Phase II - Network Monitoring System- EXECUTIVE SUMMARY Any Network Monitoring System must be capable of monitoring thousands of servers and tens of thousands of services. It must also be easy to deploy with as much automation as possible, such that new servers can be provisioned and added to the monitoring pool with little administrative overhead and the capability of scripting changes and commands to the server. INTRODUCTION ZABBIX Zabbix was created by Alexei Vladishev, and currently is actively developed and supported by Zabbix SIA. Zabbix is an enterprise-class open source distributed monitoring solution. Zabbix monitors numerous parameters of a network and the health and integrity of servers. Zabbix uses a flexible notification mechanism that allows users to configure e-mail based alerts for virtually any event. This allows a fast reaction to server problems. Zabbix offers excellent reporting and data visualisation features based on the stored data. This makes Zabbix ideal for capacity planning. Zabbix supports both polling and trapping. All Zabbix reports and statistics, as well as configuration parameters, are accessed through a web-based front end, which ensures that the status of your network and the health of your servers can be assessed from any location. Properly configured, Zabbix can play an important role in monitoring IT infrastructure. This is equally true for small organisations with a few servers and for large companies with multitudes of servers. Zabbix is free of cost. Zabbix is written and distributed under the GPL General Public License version 2. It means that its source code is freely distributed and available for the general public. Page 2 of 13
Phase II - Network Monitoring System- NAGIOS Nagios is a free, open-source web-based network monitor developed by Ethan Galstad. Nagios is designed to run on Linux, but can be also be used on Unix variants. Nagios monitors the status of host systems and network services and notifies the user of problems. In common with many open source utilities, installation requires a degree of system administrator experience. Nagios is definitely not for the novice, unless you are prepared to put the effort in learning the basics. But with a wide range of features, including a number of web interfaces, Nagios is a very useful, feature rich monitoring tool. A large number of plug-ins available from the Nagios Library, means you can design its capabilities to your own requirements. Amongst others, Nagios monitors services such as SMTP, POP3, HTTP, PING and resources such as disk and memory usage, log files, processor load and so on and integrates with the Sensatronics IT Temperature Monitor to allow monitoring and alerting of server room and device temperature to your own parameters. Nagios will also allow scheduling so that for instance if you planned network outage you can suppress host and service notifications. Nagios also allows users the flexibility to develop custom host and service checks. All the plug-ins are available for download from the Nagios library. It is also possible to set up a hierarchy of alerts for instance if alerts are not responded to. The monitoring daemon runs intermittent checks on hosts and services you specify using external plugins which return status information to Nagios. When problems occur Nagios alerts you via email, instant message, SMS. Current status information, historical logs, and reports can all be accessed via a web browser. Nagios runs on Linux and Unix variants. Nagios does not support Microsoft Windows. Page 3 of 13
Phase II -Network Monitoring System- SECTION 2 : PURPOSE OF EVALUATION
Phase II - Network Monitoring System- PURPOSE OF THE EVALUATION We evaluated the two systems based on the following criteria: Ability to scale to thousands of host checks and tens of thousands of service checks Ease of importing host and service checks from a script Ease of modifying a configuration or scheduled checks, particularly the ability to do mass changes to many nodes at once API which can be accessed via a command line Ability to do distributed checks from multiple monitoring servers Reporting capabilities Ability to create Remedy tickets for critical alerts Industry acceptance, availability of expertise, and community support Ability to schedule planned maintenance periods and store work comments for hosts Page 5 of 13
Phase II - Network Monitoring System- SECTION 3 : DISCUSSION OF RESULTS Page 6 of 13
Phase II -Network Monitoring System- COMPARISON Features Nagios Zabbix Developers Ethan Galstad. Nagios Zabbix SIA Operating System Unix / Linux Cross-Platform License GNU General Public License GNU General Public License Configuring Data Store - Stores configuration in flat files with a very simple format - Use database to store configuration definitions, but does not encourage directly modifying it Scalability - Allows to script the addition of new service checks and make mass changes using various scripting languages or basic shell script - Has numerous data collection options - The default method does not scale well at all because it requires fork/exec's on the monitoring server for each check - Direct manipulation of the database does not appear to be a common practice by Zabbix users - Having the configuration in the relational database could be used for ad-hoc reporting as well as updates - Works primarily via an agent that runs on each monitored host - Client collects a large set of information, which is then sent to the server on a regular basis - Ability, via an add-on to setup proxy servers which run host and service checks on behalf of the central server - Allows the system to be scaled horizontally by adding more proxy servers as needed - Ability to run multiple monitoring servers and forward - Zabbix has the ability to run multiple servers and aggregate the results centrally, synchronizes distributed servers, so manual maintenance of multiple configurations is not necessary if this feature is used Page 7 of 13
Phase II -Network Monitoring System- Features Nagios Zabbix Reporting Capability results to a central server that aggregates check results - Keeps a history of alerts and downtimes for all hosts and service checks by default - Extensive reporting and graphing capabilities as part of the standard package Application Interfaces - Various reports of availability can be created for individual servers, host groups, specific checks, etc - Allows administrators to store comments with time stamps, so that administrators can store work history or other information and leave a log of notes for particular hosts. Each of these comments has a time stamp - Default web-based GUI - Does not have time-stamped comments for hosts. Therefore, it is not possible to retain work histories and logs within the system - Primary interfaces is via a web-based GUI - Interface is used to monitor hosts, acknowledge alerts, schedule maintenance, and run reports - All host and service check changes are done via config files on the filesystem using an editor or external tool - Ability to accept commands via a named pipe, which is how all commands are sent to it from the GUI and the numerous third-party add-ons available - Can be fully controlled from a command line via this method, so the server can be controlled entirely from the command line - Limited support for command line or scripted control and changes - Some configuration objects can be modified via xml imports as well, but there are limits Page 8 of 13
Phase II - Network Monitoring System- ANALYSIS Interface Zabbix has much more interactive and mature user interfaces Nagios Add-on with NagiosQL Reporting Zabbix has better reporting built-in, particularly regarding performance data and trending Monitoring Capabilities Nagios can produce many of the same types of graphs, but only via third-party add-ons and external tools Zabbix has monitoring capabilities Nagios does not ship with log monitoring Configuration Zabbix hard to configure Nagios easier to configure Maintenance Zabbix does not support scheduled maintenance at the server level Nagios scheduled maintenance can be assigned to host groups, individual hosts or specific services either through GUI or via pipe interface Comment Zabbix does not allow work logs or comments to be stored with a user-name and time stamp Nagios assigns a user-name and time stamp to each entirely eliminate Foultlog Page 9 of 13
Phase II -Network Monitoring System- SECTION 4 : CONCLUSIONS
Phase II - Network Monitoring System- CONCLUSIONS Either of these systems would provide many of the monitoring features needed by OSCC. There are many additional features of each system that may be useful at some point, but none of these is critical to our decision. The two systems are fundamentally different in that Zabbix is much more of a monolithic system, with more structure and less flexibility, but a very nice interface with additional performance graphing and reporting. Nagios is much more modular and flexible at the command line level, and has a much simpler method of creating new checks and hosts via simple shell scripts and text-based config files. With that in mind, the difference lies in how we want to use the system. Our emphasis is on maintainability and scriptability, and in this regard we believe Nagios has an advantage. Further, both because we have experience with Nagios and because of the simpler nature of the Nagios configuration itself, we anticipate that implementing this system will require less effort than Zabbix. Our expectation is that we can implement basic monitoring of new servers directly into our deployment/provisioning procedure with little or no additional overhead for administrators. Additional custom checks for key servers should be easier to implement in Nagios as well. There is a danger that Nagios' method of keeping thousands of servers and tens of thousands of service checks in flat config files may become large and unwieldy, but we believe that by wrapping the creation of new hosts in simple scripts and driving this from our deployment process, we can mitigate that risk. Prepared by: Khairul Aizat Kamarudzzaman & Norasnida Rusalan Page 11 of 13
Phase II - Network Monitoring System- SECTION 5 : REFERENCES Page 12 of 13
Phase II - Network Monitoring System- REFERENCES 1. http://nagios.org/ 2. http://zabbix.com 3. http://en.wikipedia.org/wiki/nagios 4. http://en.wikipedia.org/wiki/zabbix Page 13 of 13