MONITORING RED HAT GLUSTER SERVER DEPLOYMENTS With the Nagios IT infrastructure monitoring tool

TECHNOLOGY DETAIL

INTRODUCTION

Storage system monitoring is a fundamental task for a storage administrator. To do their jobs efficiently, administrators need a dashboard-style view of status and system health, timely notifications of errors, and tools for diagnosing, fixing, and recovering from those errors. Furthermore, the monitoring function has to fit into the broader ecosystem of datacenter systems, tools, and workflows in use.

Red Hat Gluster Storage 3 introduces storage monitoring that addresses these administrator needs. The solution integrates the popular open source Nagios monitoring platform with a robust graphical user interface (GUI), storage console, and management capabilities. When combined with Red Hat Satellite 5, Red Hat Gluster Storage 3 provides an end-to-end management solution capable of installing, configuring, managing, and monitoring storage deployments.

This technology detail describes the Red Hat Gluster Storage 3 monitoring functionality, including the technical system architecture, the deployment scenarios, the major features and functions, and resources for learning more.

ARCHITECTURE INFORMATION

The open source Nagios monitoring platform is at the core of Red Hat Gluster Storage monitoring. The Nagios server collects monitoring data from the storage servers and presents real-time information for clusters, volumes, hosts, CPU, and disks. In addition, it collects usage data for CPU, disks, and volumes at regular intervals and reports it graphically as trends. Nagios performs these operations as a series of both:

Active checks, where the Nagios server polls the managed entities.
Passive checks, where the managed entity notifies the Nagios server of state changes asynchronously.
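The difference between active and passive checks can be sketched in a few lines of Python. This is a minimal illustration rather than the product's implementation: the host name, service name, and `brick_probe` function are hypothetical, and it assumes a passive result is handed to Nagios as a `PROCESS_SERVICE_CHECK_RESULT` external command, which is the general Nagios format for passive service results. The transport that Red Hat Gluster Storage uses to deliver such results is not described in this document.

```python
import time

# Standard Nagios plug-in return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def brick_probe():
    # Hypothetical probe; a real plug-in would inspect the brick process.
    return OK, "brick online"


def active_check(probe):
    """Active check: the monitoring server runs the probe itself on its own schedule."""
    code, output = probe()
    return code, output


def passive_result_command(host, service, code, output, now=None):
    """Passive check: the managed node formats a result and pushes it,
    instead of waiting to be polled."""
    timestamp = int(time.time() if now is None else now)
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}"


if __name__ == "__main__":
    print(active_check(brick_probe))
    print(passive_result_command("rhs-node1", "Volume Status - vol0",
                                 CRITICAL, "volume stopped"))
```

In the active case the server decides when to ask; in the passive case the node decides when to report, which suits state changes that should not wait for the next polling interval.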

Figure 1. Functional architecture of Red Hat Gluster Storage monitoring

The information collected by the Nagios server is presented via a web server that can be reached and navigated from any compatible browser. In addition, the Nagios server reports usage trend information to the Red Hat storage console, which plots and presents it in a Trends tab on the GUI.

The Nagios server sends notifications for both changes in state and threshold crossings of system entities via email alerts and SNMP traps. As a result, Red Hat Gluster Storage's monitoring can be integrated with third-party management solutions.

DEPLOYMENT SCENARIOS

The storage console and Nagios are included in Red Hat Gluster Storage, which provides a choice of deployment options:

CO-HOSTED STORAGE CONSOLE AND NAGIOS SERVER FOR RED HAT GLUSTER STORAGE

A dedicated Red Hat Enterprise Linux server hosts both Nagios and the Red Hat storage console. The install script for the storage console prompts the installer for permission to install the Nagios server. Administrators therefore have a single dedicated server, accessible from any compatible web browser, for both storage management and monitoring. This configuration is recommended as the best practice for most Red Hat Gluster Storage deployments.

NAGIOS SERVER FOR RED HAT GLUSTER STORAGE

With this deployment option, the administrator installs and runs the Nagios server for Red Hat Gluster Storage on a designated node that is part of the Red Hat Gluster Storage cluster being monitored. The storage console, if used, can be installed and run on a dedicated Red Hat Enterprise Linux server. Note that installing and running the entire storage console on a Red Hat Gluster Storage node is not currently supported. This configuration is best suited for proof-of-concept (POC) environments and small-footprint deployments where the overall monitoring load is limited, so that monitoring doesn't compete for resources with the Red Hat Gluster Storage data traffic.

STANDALONE NAGIOS SERVER FOR RED HAT GLUSTER STORAGE

A dedicated Red Hat Enterprise Linux server hosts the Nagios server for Red Hat Gluster Storage. This configuration is desirable for customers who want to use the Nagios server for Red Hat Gluster Storage but don't want to use the storage console.

INSTALLATION AND CONFIGURATION

Whichever deployment option you choose, the installation and configuration of the Nagios server for Red Hat Gluster Storage has been simplified and automated. There are several options available to install and boot Red Hat Gluster Storage, which you'll find detailed in the installation guide (https://access./products/red-hat-storage/documentation/). For more details, refer to the Red Hat Gluster Storage administration guide on our documentation page.

Starting in version 3, the Nagios plug-ins required for active and passive checks on the server nodes are included in the product and are installed as part of the Red Hat Gluster Storage installation. In addition, the Red Hat Gluster Storage subscription ensures access to the Nagios server on Red Hat's content delivery platform.

As described in an earlier section, you can install the Nagios server in conjunction with the storage console, on a Red Hat Gluster Storage node, or as a standalone monitoring server on Red Hat Enterprise Linux. Please refer to the Red Hat Storage Installation Guide on our documentation page for more details on how to download and install the Nagios server.

Once the Nagios server and Red Hat Gluster Storage are installed and the cluster is up and running, storage administrators can configure and enable the Nagios service in a few simple steps that include running a configuration discovery script. Red Hat Gluster Storage monitoring via Nagios is then running and can collect status, health, and threshold-crossing information for important system entities.

In addition, a new Nagios service called the auto-discovery service runs every 24 hours. This service discovers changes in the Red Hat Gluster Storage cluster configuration, applies the changes to the Nagios server configuration, and notifies the administrator of the changes.

The Nagios GUI can be accessed via a web interface at http://nagioshosturl/nagios with the default user name and password of nagiosadmin/nagiosadmin. Secure HTTP access (HTTPS) can also be configured. In addition, the Nagios web server can be configured to use LDAP for user authentication and authorization.
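The discover, apply, and notify cycle of an auto-discovery service can be pictured as a comparison between what the monitoring server already knows and what a fresh scan of the cluster reports. The following Python sketch is an assumption-laden illustration only: the function names, the dictionary layout, and the host and volume names are hypothetical and are not taken from the product's discovery script.

```python
def diff_cluster_config(known, discovered):
    """Compare the configuration Nagios already has with a fresh discovery.

    Both arguments map an entity kind (for example "hosts" or "volumes")
    to a list of entity names.
    """
    changes = {}
    for kind in sorted(set(known) | set(discovered)):
        before = set(known.get(kind, ()))
        after = set(discovered.get(kind, ()))
        added = sorted(after - before)
        removed = sorted(before - after)
        if added or removed:
            changes[kind] = {"added": added, "removed": removed}
    return changes


def summarize(changes):
    """Render one human-readable line per change for the administrator."""
    lines = []
    for kind, delta in changes.items():
        lines += [f"{kind}: added {name}" for name in delta["added"]]
        lines += [f"{kind}: removed {name}" for name in delta["removed"]]
    return lines


if __name__ == "__main__":
    known = {"hosts": ["node1", "node2"], "volumes": ["vol0"]}
    discovered = {"hosts": ["node1", "node2", "node3"], "volumes": ["vol1"]}
    for line in summarize(diff_cluster_config(known, discovered)):
        print(line)
```

Computing an explicit diff keeps the "apply" and "notify" steps cheap: an unchanged cluster produces an empty result, so nothing is rewritten and no notification is sent.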

MONITORED ENTITIES AND WORKFLOWS

This section provides an overview of the features and functions available with Red Hat Gluster Storage monitoring.

OVERVIEW AND DASHBOARD SUMMARY

The Nagios Tactical Monitoring Overview page provides a dashboard-style overview of one or more actively managed and monitored Red Hat Gluster Storage trusted storage pools. Administrators are presented with a one-page overview of the status and health of the network, hosts, and all monitored services. The information displayed is color coded, and users can navigate to issues via hyperlinks.

In the example shown below, there is one unhandled critical problem. Administrators can click it to get more details (in this case, a volume quota threshold crossing), acknowledge the problem, and then either take corrective action (e.g., clean up files to free space) or add more capacity to the volume via the add-brick functionality of the storage console or the Red Hat Gluster Storage command line interface (CLI).

Figure 2. Dashboard overview of Nagios monitoring functionality on Red Hat Gluster Storage

Users who need a map-level overview, trusted pool-level, host-level, or service-level dashboard status can access the respective pages from the navigation bar on the left.
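A threshold crossing such as the volume quota example above comes down to a plug-in comparing a measured value against warning and critical limits and returning one of the standard Nagios plug-in states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The Python sketch below illustrates that logic; the function name and the 80% and 90% thresholds are assumptions chosen for the example, not the defaults shipped with Red Hat Gluster Storage.

```python
# Standard Nagios plug-in return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_quota(used_bytes, limit_bytes, warn_pct=80.0, crit_pct=90.0):
    """Return (state, message) for a quota, Nagios plug-in style.

    The warn_pct and crit_pct defaults are illustrative assumptions.
    """
    if limit_bytes <= 0:
        return UNKNOWN, "UNKNOWN - no quota limit set"
    pct = 100.0 * used_bytes / limit_bytes
    if pct >= crit_pct:
        state, label = CRITICAL, "CRITICAL"
    elif pct >= warn_pct:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    return state, f"{label} - quota {pct:.1f}% used"


if __name__ == "__main__":
    print(check_quota(95, 100))
    print(check_quota(85, 100))
    print(check_quota(10, 100))
```

The state code drives the color coding and problem counts on the dashboard, while the message text is what an administrator sees after clicking through to the service detail.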

Figure 3. Nagios monitoring map overview functionality on Red Hat Gluster Storage

Figure 4. Nagios monitoring host status overview on Red Hat Gluster Storage

Figure 5. Nagios monitoring service status overview functionality on Red Hat Gluster Storage

ALERTS AND NOTIFICATIONS

While dashboard-style summaries provide administrators with snapshots of system status and health, administrators must also be notified about status changes and threshold crossings. These changes either indicate active faults in the system, which can result in service disruption, or are symptoms that might lead to future faults and errors. Acting on them proactively can prevent future outages.

Alerts and notifications are reported via email and SNMP traps. Email alerts can be configured so they're sent to specified individuals or email lists. For example, email alerts on physical entities can be sent to datacenter and storage administrators, while email alerts for volume quota can be sent to storage administrators only. In addition, email alerts can be configured for notification periods (e.g., 24x7x365) and severity of notifications (e.g., critical, warning, flapping, etc.).

The screenshots below illustrate a list of all alerts and the contacts notified, details about the alerts, and how to add comments and acknowledge alerts. Alert lists can be filtered by host, trusted storage pool (cluster), and service type for better navigation.

In addition, administrators can configure scheduled downtime for a service, during which Nagios won't send any notifications until the downtime expires. This is particularly useful for suppressing unnecessary notifications while servicing a managed entity (e.g., replacing a disk).
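The interaction between severity filters and scheduled downtime described above can be sketched as a simple decision function. This Python example is a simplified model rather than Nagios's actual notification logic; the `Downtime` class, the `should_notify` function, and the host and service names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Downtime:
    host: str
    service: str
    start: int  # epoch seconds, inclusive
    end: int    # epoch seconds, exclusive


def in_downtime(downtimes, host, service, when):
    """True if the host/service pair has a downtime window covering 'when'."""
    return any(
        d.host == host and d.service == service and d.start <= when < d.end
        for d in downtimes
    )


def should_notify(downtimes, host, service, when, severity,
                  wanted=("critical", "warning")):
    """Notify only for wanted severities and only outside scheduled downtime."""
    if severity not in wanted:
        return False
    return not in_downtime(downtimes, host, service, when)


if __name__ == "__main__":
    # Hypothetical one-hour disk replacement window on node1.
    windows = [Downtime("node1", "Disk Utilization", 1000, 4600)]
    print(should_notify(windows, "node1", "Disk Utilization", 2000, "critical"))
    print(should_notify(windows, "node1", "Disk Utilization", 5000, "critical"))
```

Treating the end of the window as exclusive means notifications resume at exactly the moment the downtime expires, which matches the "until the downtime expires" behavior described in the text.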

Figure 6. Listing and filtering notifications

Figure 7. Notification details

Figure 8. Acknowledging notifications

Figure 9. Requesting scheduled downtimes

TRENDS, GRAPHS, AND REPORTS

Nagios for Red Hat Gluster Storage collects utilization information on cluster-level entities (e.g., volumes and quotas) and host-level entities (e.g., CPU, memory, swap, disks, network, and bricks) that can be displayed as graphs. Historical information is maintained for up to one year and can be displayed in intervals of hours, days, weeks, months, or a custom interval. Such trending information can help system administrators identify changes that signal future problems. Trending also helps with future planning, such as capacity management and procurement. Screenshots for volume utilization and CPU utilization are shown below as examples.

The Nagios GUI provides several reports that can be used for improved system diagnosis. The host state trend report shows the host state history graphically. Similarly, each Nagios service and its service alerts can be queried and reported graphically. Reports can be generated for hosts and services that chart their availability historically. Historical ranges of a day, week, month, year, or a custom time range can be selected.

Figure 10. Service alert histogram
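Displaying the same utilization history at hourly, daily, or custom intervals amounts to grouping timestamped samples into fixed-width buckets and averaging each bucket. The Python sketch below shows one way to do that; it is an illustration of the idea, not the graphing component Nagios uses, and the `bucket_average` name and sample values are hypothetical.

```python
from collections import defaultdict


def bucket_average(samples, interval_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, average) pairs in time order.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bucket_start -> [total, count]
    for ts, value in samples:
        start = ts - ts % interval_seconds
        sums[start][0] += value
        sums[start][1] += 1
    return [(start, total / count)
            for start, (total, count) in sorted(sums.items())]


if __name__ == "__main__":
    # Hypothetical CPU utilization samples taken every 30 minutes.
    cpu = [(0, 10.0), (1800, 30.0), (3600, 50.0)]
    print(bucket_average(cpu, 3600))   # hourly view
    print(bucket_average(cpu, 86400))  # daily view
```

Changing only `interval_seconds` yields a coarser or finer view of the same data, which is why a single collected history can back every interval offered in the GUI.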

Figure 11. Host state trends

NAGIOS SYSTEM MONITORING

One of the most powerful features of Nagios is its self-monitoring. In addition to all the hosts and services that Nagios monitors, it also monitors the Nagios service itself. This provides an added level of redundancy to system monitoring via the Nagios process information page. This page (figure 12) provides central command and control of the Nagios processes: you can query the Nagios program information and the services enabled, shut down or restart Nagios, and selectively control individual services.

The check scheduling queue page shows details about the scheduling queue of service checks and provides levers to manage the services individually, as shown in figure 12 below.

Finally, the host and service scheduled downtime page allows storage administrators to configure and view scheduled downtimes for hosts and services. When downtime is set for a host or service, Nagios does not collect statistics or report alerts for it. This is very helpful for administrators doing planned or unplanned maintenance.

Figure 12. Nagios self-monitoring

Figure 13. Scheduled downtimes

END-TO-END SYSTEM MANAGEMENT OF RED HAT GLUSTER STORAGE

Storage administrators need a comprehensive management and monitoring solution. They expect to have up-to-date information about systems, tools, processes, and workflows for deployment, installation, provisioning, configuration, monitoring, and trending of their deployed systems. In addition, they expect that the solution can be easily scaled and integrated into existing datacenter systems and processes.

Red Hat Gluster Storage provides the comprehensive solution administrators need, with the monitoring system introduced in the 3.0 release. Furthermore, because the monitoring is built on Nagios, administrators can extend it as their deployments scale. For example, if a third party ships a hardware-specific Nagios plug-in, Red Hat Gluster Storage administrators can add that plug-in as a service in a few simple steps, and Nagios can then report monitoring statistics from that plug-in.

Nagios monitoring comes integrated with the Red Hat storage console. Specifically, the console installation comes pre-bundled with Red Hat Gluster Storage and the Nagios server, and can be deployed and installed in one step. In addition, users can see trend graphs on utilization for various hosts and services on Red Hat's storage console. Thus, they have one portal for simplified storage configuration and monitoring.

The previously mentioned integration of Red Hat Gluster Storage with Red Hat Satellite (5.6 and later) allows administrators to automate the deployment of the storage software onto physical systems at scale. In addition, Red Hat Gluster Storage monitoring with Nagios can be integrated into homegrown or third-party system management tools that storage administrators may already be using inside their datacenters. If these tools are themselves built on Nagios, storage administrators can choose to have the Nagios plug-ins report directly into the existing central Nagios server (and not run the Nagios server shipped with Red Hat Gluster Storage). Otherwise, they can have the Nagios server send SNMP traps to the in-house system, provided that system is able to consume SNMP traps.

FURTHER READING AND NEXT STEPS

Red Hat Gluster Storage monitoring with Nagios: installation guide (https://access./products/red-hat-storage/documentation), including information about the deployment of Red Hat Gluster Storage with Red Hat Satellite
Red Hat Gluster Storage monitoring with Nagios: administration guide (https://access./products/red-hat-storage/documentation), including information about the integration with Red Hat's storage console
Information about Red Hat Satellite (https://access./documentation/en-us/red_Hat_Satellite/)
Information about Nagios
Evaluate Red Hat Gluster Storage and the Nagios monitoring. Click on the Evaluations tab and then follow the link for evaluation.
Contact the author at veshanka@

ABOUT RED HAT

Red Hat is the world's leading provider of open source solutions, using a community-powered approach to provide reliable and high-performing cloud, virtualization, storage, Linux, and middleware technologies. Red Hat also offers award-winning support, training, and consulting services. Red Hat is an S&P company with more than 80 offices spanning the globe, empowering its customers' businesses.

facebook.com/redhatinc
@redhatnews
linkedin.com/company/red-hat

NORTH AMERICA 1 888 REDHAT1
EUROPE, MIDDLE EAST, AND AFRICA 00800 7334 2835 europe@
ASIA PACIFIC +65 6490 4200 apac@
LATIN AMERICA +54 11 4329 7300 info-latam@

#12484567_INC0210625_v2_0215

Copyright 2015 Red Hat, Inc. Red Hat, Red Hat Enterprise Linux, the Shadowman logo, and JBoss are trademarks of Red Hat, Inc., registered in the U.S. and other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.