Nimsoft Monitor zones Guide v1.3 series
Legal Notices

Copyright 2012, CA. All rights reserved.

Warranty

The material contained in this document is provided "as is," and is subject to being changed, without notice, in future editions. Further, to the maximum extent permitted by applicable law, Nimsoft LLC disclaims all warranties, either express or implied, with regard to this manual and any information contained herein, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Nimsoft LLC shall not be liable for errors or for incidental or consequential damages in connection with the furnishing, use, or performance of this document or of any information contained herein. Should Nimsoft LLC and the user have a separate written agreement with warranty terms covering the material in this document that conflict with these terms, the warranty terms in the separate agreement shall control.

Technology Licenses

The hardware and/or software described in this document are furnished under a license and may be used or copied only in accordance with the terms of such license. No part of this manual may be reproduced in any form or by any means (including electronic storage and retrieval or translation into a foreign language) without prior agreement and written consent from Nimsoft LLC as governed by United States and international copyright laws.

Restricted Rights Legend

If software is for use in the performance of a U.S. Government prime contract or subcontract, Software is delivered and licensed as "Commercial computer software" as defined in DFAR 252.227-7014 (June 1995), or as a "commercial item" as defined in FAR 2.101(a) or as "Restricted computer software" as defined in FAR 52.227-19 (June 1987) or any equivalent agency regulation or contract clause. Use, duplication or disclosure of Software is subject to Nimsoft LLC's standard commercial license terms, and non-DOD Departments and Agencies of the U.S. Government will receive no greater than Restricted Rights as defined in FAR 52.227-19(c)(1-2) (June 1987). U.S. Government users will receive no greater than Limited Rights as defined in FAR 52.227-14 (June 1987) or DFAR 252.227-7015 (b)(2) (November 1995), as applicable in any technical data.

Trademarks

Nimsoft is a trademark of CA. Adobe, Acrobat, Acrobat Reader, and Acrobat Exchange are registered trademarks of Adobe Systems Incorporated. Intel and Pentium are U.S. registered trademarks of Intel Corporation. Java(TM) is a U.S. trademark of Sun Microsystems, Inc. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Netscape(TM) is a U.S. trademark of Netscape Communications Corporation. Oracle is a U.S. registered trademark of Oracle Corporation, Redwood City, California. UNIX is a registered trademark of the Open Group. ITIL is a Registered Trade Mark of the Office of Government Commerce in the United Kingdom and other countries. All other trademarks, trade names, service marks and logos referenced herein belong to their respective companies.
Contact Nimsoft

For your convenience, Nimsoft provides a single site where you can access information about Nimsoft products. At http://support.nimsoft.com/, you can access the following:
- Online and telephone contact information for technical assistance and customer services
- Information about user communities and forums
- Product and documentation downloads
- Nimsoft Support policies and guidelines
- Other helpful resources appropriate for your product

Provide Feedback

If you have comments or questions about Nimsoft product documentation, you can send a message to support@nimsoft.com.
Contents

Chapter 1: zones 1.3
  zones Overview
  Probe Requirements
Chapter 2: Monitoring capabilities
Chapter 3: zones Configuration
  zones GUI
    The Toolbar Buttons
    The Left Pane
    The Right Pane
  Connecting to Solaris box
    Creating a resource
  Adding Monitors (checkpoints)
    Adding Checkpoints
    Enabling the monitors for QoS and Alarm
  Using Templates
    Creating a template
    Adding a Checkpoint to the Template
    Applying a Template to a Zone
Chapter 4: zones Metrics
  Alert Metrics Default Settings
Chapter 1: zones 1.3

This description applies to zones probe version 1.3.

This section contains the following topics:
- Document History [1.3]
- zones Overview
- Probe Requirements
- Monitoring capabilities
Document History [1.3]

This table describes the version history for this document.

Version  Date           What's New?
1.3      December 2012  - Ability to not use SSH to connect to zones host
                        - Default templates
1.2      January 2012   - Public/private key pair authentication for SSH login to zones host
1.2      January 2010   Added Java implementation, including:
                        - Added scalability and performance improvements
                        - Reduced impact on host and zones (minimized the number of commands that the probe will run)
                        - Added support for configuration items and metrics (NIS2)

Related Documentation

- Documentation for other versions of the zones probe (../../zones.html)
- The Release Notes for the zones probe
- Getting Started with CA Nimsoft Probes
- Monitor Metrics Reference Information for CA Nimsoft Probes

zones Overview

The zones probe monitors the health and performance of Solaris Zones virtualization enabled systems. The probe collects and stores data and information from the monitored host system at customizable intervals. You can easily define alarms to be raised and propagated to the Nimsoft Monitor console when specified thresholds are breached.
Probe Requirements

The zones probe is supported on the following platforms:
- Windows: x86: 2003, 2008 and Windows 7; x64: 2003, 2008, 2008R2, Windows 7
- Solaris: SPARC 32-bit: 9, 10; SPARC 64-bit: 9, 10
- Linux: x86 - Glibc > 2.3; x64 - Glibc > 2.3

The zones probe requires the following:
- Nimsoft Robot v3.02 or newer.
- Target system with Sun Solaris 10 (release level - Intel x86: 08/07 and SPARC: 11/06) with zones pre-configured.
- One of the following:
  - SSH access to the Solaris Global zone
  - SSH access to the individual zones
  - A Nimsoft robot installed on the Solaris Global zone
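The Linux requirement above (Glibc > 2.3) can be checked on a candidate robot host with a few lines of Python. This is only a sketch: the helper names parse_version and glibc_meets_requirement are hypothetical and are not part of the probe or of any Nimsoft tool. The standard-library call platform.libc_ver() is used to read the local C library version.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.17' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_meets_requirement(version_text, minimum=(2, 3)):
    """Return True when the version is strictly newer than the documented 2.3."""
    return parse_version(version_text) > minimum


if __name__ == "__main__":
    # libc_ver() returns ('', '') when the C library cannot be identified.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        verdict = "ok" if glibc_meets_requirement(version) else "too old"
        print("glibc", version, verdict)
    else:
        print("glibc version could not be determined on this platform")
```

The comparison is strict because the requirement is written as "> 2.3"; a patch release such as 2.3.4 compares as newer than 2.3 in this sketch.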
Chapter 2: Monitoring capabilities

The following component entities can be monitored on hosts and virtual machines:
- CPU
- Memory
- Disk
- Network
- Resource Pool
- Resource Control

The following component entities can be monitored at the zone (non-global) level:
- Disk
- Memory
- Network
- Resource Pool
- Resource Control
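The difference between the two lists above is easy to miss when planning templates. The following sketch holds both lists as Python sets and computes which components appear only at host level. The component names are copied from the lists; the variable and function names are hypothetical.

```python
# Component names copied from the two lists in this chapter.
HOST_COMPONENTS = {
    "CPU", "Memory", "Disk", "Network", "Resource Pool", "Resource Control",
}
ZONE_COMPONENTS = {
    "Disk", "Memory", "Network", "Resource Pool", "Resource Control",
}


def host_only_components():
    """Return the components listed for hosts but not for non-global zones."""
    return HOST_COMPONENTS - ZONE_COMPONENTS


print(sorted(host_only_components()))
```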
Chapter 3: zones Configuration

Double-click the line representing the probe in Infrastructure Manager to open the zones GUI. The GUI initially shows an empty group node called Default Group, a QoS node with some QoS definitions, and a Templates node.

Note that you must click the Apply button to activate any configuration modifications.

The zones probe does not monitor anything automatically. Initially, you need to define what to monitor. This process includes:
1. Connecting to the Solaris box.
2. Adding monitors (checkpoints).
3. Configuring the checkpoints to send QoS data, and to send alarms if the specified thresholds are breached.

This section contains the following topics:
- zones GUI
- Connecting to Solaris box
- Adding Monitors (checkpoints)
- Using Templates
zones GUI

The zones window consists of a row of toolbar buttons and two window panes. In addition, a status bar at the bottom of the window displays the probe version information and when the probe was started.

The Toolbar Buttons

The configuration tool contains the following toolbar buttons:
- General Setup button
- Create New Group button
- Create New Resource button
- Message Pool Manager button
- Create New Template button
General Setup

To open the Setup dialog, click the General Setup button.

Log Level
  To minimize disk consumption, set the log level as low as possible during normal operation. Values for the log level are:
  0 = Fatal errors
  1 = Errors
  2 = Warnings
  3 = Information
  4 = Debug information
  5 = Debug information, extremely detailed

Enable GUI Auto Refresh
  Enables the auto refresh property.

Callback Timeout
  Sets the callback timeout. Default is 30 seconds.

Max events to fetch
  Sets the maximum number of events to fetch at a time for a callback.
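The log level values listed above can be kept in a small enumeration when scripting around the probe configuration. The numbers and meanings are taken from the list; the class and function names are hypothetical, and the rule in is_logged (a higher level also includes the messages of all lower levels) is an assumption that the guide does not state explicitly.

```python
from enum import IntEnum


class LogLevel(IntEnum):
    """Log level values as listed for the General Setup dialog."""
    FATAL = 0
    ERROR = 1
    WARNING = 2
    INFO = 3
    DEBUG = 4
    DEBUG_DETAILED = 5


def is_logged(message_level, configured_level):
    """Assumption: a message is written when its level is at or below the configured level."""
    return message_level <= configured_level


# With the log level set to 2 (Warnings), errors are written but debug output is not.
print(is_logged(LogLevel.ERROR, LogLevel.WARNING))
print(is_logged(LogLevel.DEBUG, LogLevel.WARNING))
```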
Create New Group button

You can create new groups by selecting the Create New Group button in the toolbar. A new group will appear in the left pane with the name New Group. Right-click the new group and select Rename to change the name of this group.

Create New Resource button

You can create a resource by selecting the Create New Resource button in the toolbar. The Resource dialog will display. The resource is configured as a link to the Solaris 10 server. For details see Creating a Resource.

The icon shown for a resource indicates one of two states:
- The connection to the host is OK.
- The host is not available.

It is possible to move a resource from one group to another, using drag and drop.

Message Pool Manager button

Click the Message Pool Manager button in the toolbar to open the Message Pool.
The alarm messages for each alarm situation are stored in the Message Pool. Using the Message Pool Manager, you can customize the alarm text, and you may also create your own messages.

Note that variable expansion in the message text is supported. If you type a $ in the Alarm text field, a dialog opens with a set of variables:

Host
  The host computer where the alarm condition occurs.
Zone
  The zone you want to monitor.
Resource
  The resource referred to in the alarm message.
Monitor
  The monitor (checkpoint) referred to in the alarm message.
Descr
  The description of the monitor.
Value
  The value used in the alarm message.
Oper
  The operand to be combined with the value and the threshold in the alarm message.
Unit
  The unit to be combined with the value in the alarm message (for example boolean).

Create New Template button

Templates are useful tools for defining monitors to be measured on the various elements of a system. You can create templates and define a set of monitors belonging to that template. To apply a template to a folder or element, drag the template and drop it on the node in the left pane where you want to measure the monitors defined for the template. You can also drop a template on a resource in the left pane tree structure, and the template will be applied to all elements for the resource.

The Left Pane

The left pane shows the monitoring groups, the resources defined, the QoS definitions, and the templates for defining monitors. Initially the Default Group, standard QoS definitions and default templates are created and will appear in the pane.

- Groups - A group contains one or more resources. On this probe you will typically have one resource.
  Auto Configuration
  Auto Monitor
- Resources - A resource is a link to a Solaris 10 server. The icon shown for a resource indicates one of two states: the connection to the host is OK, or the host is not available.
- QoS - The standard QoS definitions included with the probe package. These can be selected when editing the monitoring properties for a monitor.
- Templates - A template defines a set of monitors. This node contains the following default templates:
  - Host Template
  - Zone (VM) Template
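The variable expansion that the Message Pool Manager supports in alarm text can be illustrated outside the probe with Python's string.Template. This is a sketch only: the probe's own expansion engine is not shown in this guide, the lowercase $name spelling of the variables is an assumption, and the host, zone and value data are made up for the example.

```python
from string import Template


def expand_message(text, values):
    """Replace $name placeholders; unknown placeholders are left untouched."""
    return Template(text).safe_substitute(values)


# Hypothetical sample data keyed by the variables listed in the dialog.
sample = {
    "host": "solaris01",
    "zone": "webzone",
    "descr": "CPU Utilization",
    "value": 95,
    "unit": "%",
}

message = expand_message("$descr on zone $zone (host $host) is $value $unit", sample)
print(message)  # CPU Utilization on zone webzone (host solaris01) is 95 %
```

safe_substitute is used rather than substitute so that a message containing a variable with no value is still produced, with the placeholder visible, instead of raising an error.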
Right-clicking in the left pane opens a pop-up menu with the following commands:

New Resource
  Available only when a group or a resource is selected. Opens the Resource dialog, enabling you to define a new resource to be monitored.
New Group
  Available only when a group or a resource is selected. Creates a new group where you may place resources. The new group appears in the pane with the name New Group. Right-click the new group and select Rename to give the group a name of your own choice.
Edit
  Available only when a resource is selected. Lets you edit the properties for the selected resource.
Delete
  Lets you delete the selected element (group, resource or QoS definition). Note that the Default group cannot be deleted, but if you remove all elements from the group, it does not appear the next time you restart the probe.
QoS definitions
  When the QoS sub-node is selected in the left pane.
The Right Pane

The contents of the right pane depend on what you select in the left pane:

- Resources, when a group is selected in the left pane.
- Monitors, when a resource is selected in the left pane. Note the following icons:
  - Monitor (type event or counter), where no value has been measured yet (or no event has occurred).
  - Black: Indicates that the monitor is NOT activated for monitoring. That means that the Enable Monitoring option is not set in the properties dialog for the monitor.
  - Green: Indicates that the monitor is activated for monitoring, and the threshold value defined in the properties dialog for the monitor is not exceeded.
  - Other colors: Indicates that the monitor is activated for monitoring, and the threshold value defined in the properties dialog for the monitor is exceeded. The color reflects the message token selected in the properties dialog for the monitor.
- QoS definitions, when the QoS sub-node is selected in the left pane.

Right-clicking in the right pane gives you the following possibilities:

When the QoS definitions are listed in the pane:
Right-clicking in the list opens a pop-up menu, giving you the possibility to add (New) or delete a QoS definition.

When the resources are listed in the pane:
Right-clicking in the list opens a shortcut menu with the following commands.
- New - Opens the Resource dialog, allowing you to define a new resource.
- Edit - Opens the Resource dialog for the selected resource, allowing you to modify the properties.
- Delete - Deletes the selected resource.
- Activate - Activates the selected resource.
- Deactivate - Deactivates the selected resource.
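The monitor icon rules described for the right pane can be summarized as a small decision function. This is a sketch for illustration, not the GUI's implementation: the function name, the returned labels, and the default token color are hypothetical, and the order in which the "no value yet" case is checked against the other cases is an assumption.

```python
def monitor_icon(enabled, has_value, breached, token_color="red"):
    """Return a label for the icon state of a monitor in the right pane.

    token_color stands in for the color tied to the selected message token;
    "red" is a placeholder, not a documented default.
    """
    if not has_value:
        # No value measured yet, or no event has occurred (assumed to take precedence).
        return "no value"
    if not enabled:
        # Enable Monitoring is not set in the monitor's properties dialog.
        return "black"
    if breached:
        # Threshold exceeded: the color reflects the message token.
        return token_color
    return "green"


print(monitor_icon(enabled=True, has_value=True, breached=False))
```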
When the monitors are listed in the pane:
Right-clicking in the list opens a small menu, giving you the following options:
- Edit - Opens the Monitor properties dialog for the selected monitor, allowing you to modify the properties.
- Delete - Deletes the selected monitor.
- Refresh - Refreshes the window to display the most current measured values for the monitors.
- Monitor - Launches the monitor window and starts filling a graph with real time values measured on the selected monitor.

The status line of the monitor window displays:
- Interval - Fixed at ten seconds. Every ten seconds the monitor window reads the sample value from the probe. The probe itself only updates the value at the check interval set for the resource, so the displayed value does not change until the next poll from the probe.
- Samples - The number of samples received since the monitor window was launched.
- Average - The average value of the samples received since the monitor window was launched.
- Value - The most recent value received since the monitor window was launched. The color of the indicator next to the number indicates the alarm state of the monitor. For example, green means that the alarm threshold defined for the monitor is not breached.

Connecting to Solaris box

You need to create a resource to be used as a connection to the Zones Virtualization setup on the Solaris system. See Creating a Resource for details.
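The Samples, Average and Value figures on the monitor window status line amount to simple running statistics. The sketch below shows one way to compute them; the class name and attributes are hypothetical, and only the ten-second interval is taken from the description of the status line.

```python
class MonitorWindowStats:
    """Running statistics like those shown on the monitor window status line."""

    INTERVAL_SECONDS = 10  # The monitor window reads a sample every ten seconds.

    def __init__(self):
        self.samples = 0    # Samples received since the window was launched.
        self.total = 0.0    # Sum of all samples, used for the average.
        self.value = None   # Most recent value received.

    def add_sample(self, value):
        """Record one sample read from the probe."""
        self.samples += 1
        self.total += value
        self.value = value

    @property
    def average(self):
        """Average of the samples received so far, or None before the first one."""
        return self.total / self.samples if self.samples else None


stats = MonitorWindowStats()
for reading in (10, 20, 30):
    stats.add_sample(reading)
print(stats.samples, stats.average, stats.value)  # 3 20.0 30
```

If the resource's check interval is longer than ten seconds, consecutive samples repeat the same value, which pulls the average toward values that were held for longer.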
Creating a resource

To create a new resource:
1. Right-click the Default group in the left pane.
2. Select New Resource. The Resource dialog is displayed.
3. Enter the field information:

Hostname or IP address
  The hostname or IP address of the zones-enabled Solaris server.
Port
  The port for communication.
Use SSH
  Whether to use SSH to connect to the zones-enabled Solaris server. By default this is selected. However, if the probe is running on the zones host (and a Nimsoft robot is also installed), you may want to not use SSH and close the SSH port. In this case, uncheck this box. If the box is unchecked, the Port, Username, Password, and Private Key fields are not needed, and any information entered in these fields is ignored.
Active
  Use this option to activate/deactivate monitoring of the resource.
SSH connection timeout (sec)
  Time to wait for the connection to be established.
Check interval
  The check interval defines how often the probe checks the values of the monitors.
Username
  A valid username to be used by the probe to log on to the Solaris Zones server.
Password
  A valid password to be used by the probe to log on to the Solaris Zones server.
Private Key
  If the Solaris Zones server requires an SSH private key to log in, then enter the path to the key file on the client robot where the zones probe is running. You can browse the file system on the robot and pick the key file.
  Note: If the private key requires a passphrase, enter the key passphrase in the Password field.
Connect to non-global zones using
  Select the connection method from the drop-down list to connect to the non-global zones.
Group
  Here you can select which group you want the resource to belong to. Normally you just have the Default group.
Alarm message
  Select the alarm message to be sent if the resource does not respond. Note that you can edit the message or define your own, using the Message Pool Manager.
Test button
  Press the Test button to verify the connection to the host.

4. Click the Test button to verify the connection to the host. A host response message is displayed.
   Note: The first time you click the Test button after you have created your first resource, an error message may appear. In that case, wait for at least 20 seconds and click the Test button again.
5. Click the Apply button in the probe GUI to activate the new resource configuration.

The new resource linking the probe to the Zones server appears under the Default group.

Adding Monitors (checkpoints)

After the link to the Solaris zones server is established, you can select the monitors (checkpoints). See Adding Checkpoints. You can also add monitors to templates, see Adding Checkpoints to a Template.

To activate the configuration, click the Apply button.
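The rule for the Use SSH option in the Resource dialog (when it is unchecked, the Port, Username, Password and Private Key fields are ignored) can be modeled with a small data class. This is a sketch under stated assumptions: the class and field names are hypothetical and do not reflect the probe's configuration file format, Active defaulting to True is an assumption, and the host name and port 22 in the usage lines are example values only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceSettings:
    """A subset of the Resource dialog fields (hypothetical names)."""
    hostname: str
    port: Optional[int] = None
    use_ssh: bool = True          # Documented as selected by default.
    active: bool = True           # Assumption: the dialog default is not documented.
    username: Optional[str] = None
    password: Optional[str] = None
    private_key: Optional[str] = None

    def effective_connection(self):
        """Return only the fields that matter for the chosen connection method."""
        if not self.use_ssh:
            # SSH-only fields are ignored when the box is unchecked.
            return {"hostname": self.hostname, "use_ssh": False}
        return {
            "hostname": self.hostname,
            "use_ssh": True,
            "port": self.port,
            "username": self.username,
            "password": self.password,
            "private_key": self.private_key,
        }


local = ResourceSettings("globalzone01", port=22, username="monitor", use_ssh=False)
print(local.effective_connection())  # {'hostname': 'globalzone01', 'use_ssh': False}
```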
Adding Checkpoints

To select a checkpoint to be measured:
1. Select the resource node in the left pane. The checkpoints are listed in the right pane of the probe GUI.
2. Select the checkpoints you want to monitor.

Note: You can also add checkpoints to templates. See the section Using Templates for details.
Enabling the monitors for QoS and Alarm

You can see the current values for the monitors in the Values column in the zones GUI. To enable the probe to send QoS data and/or send alarms on threshold breaches:
1. Double-click a monitor in the right pane.
The monitor's properties dialog appears.
2. Update the field information as needed, then click Apply to activate the configuration.

The fields are:

Name
  The name of the checkpoint. The name is retrieved from the Solaris Zones server, but you are allowed to modify the name.
Key
  This is a read-only field and describes the checkpoint key.
Description
  The description of the checkpoint. The description is retrieved from the Solaris Zones server, but you are allowed to modify it.
Value Definition
  Select the value to be used, both for alarming and QoS. You have the following options:
  - The current value, meaning that the most current value measured will be used.
  - The delta value (current - previous). This means that the delta value calculated from the current and the previous measured sample will be used.
  - Delta per second. This means that the delta value calculated from the samples measured within a second will be used.
  - The average value of the last and current sample: (current + previous) / 2.
Active
  Select this to activate monitoring of the probe.
Enable Monitoring
  Select this option to enable the monitoring.
  Note: You can also enable/disable the monitoring from the right pane of the zones GUI.
Operator
  Select the operator to be used when setting the alarm threshold for the measured value.
  Example:
  > 90 means alarm condition if the measured value is above 90.
  = 90 means alarm condition if the measured value is exactly 90.
Threshold
  The alarm threshold value. An alarm message is sent if this threshold is exceeded.
Unit
  The unit of the monitored value (for example %, Mbytes). The field is read-only.
Message Token
  Select the alarm message to be sent if the specified threshold value is breached. These messages are kept in the message pool. The messages can be modified in the Message Pool Manager.
Publish Quality of Service
  Select this option if you want QoS messages to be sent.
QoS Name
  Select the name to be used on the QoS message sent.

Using Templates

Templates are a useful tool for defining checkpoints to be measured on the various global/non-global zone levels:
- Create a template and define a set of checkpoints belonging to that template.
- Drag and drop the template on the host/zone where you want to monitor the checkpoints defined for the template.

Right-clicking the Templates node lets you add a template. Right-clicking one of the templates defined lets you edit or delete the selected template.

Creating a template

To create a template:
1. Right-click the Templates node in the left pane and select New.
   Note: You can edit an existing template by selecting one of the templates defined (found by expanding the Templates node and selecting Edit).
The Template Properties dialog appears.
2. Type the Name and Description, then click OK.

Adding a Checkpoint to the Template

There are two different ways to define monitors for a template:
- Drag checkpoints from the right pane and drop them on the template in the left pane.
- Right-click a checkpoint and select Add to Template, then select the template to which you want to add the checkpoint.
Applying a Template to a Zone

Drag and drop the template on the zone where you want to monitor the checkpoints defined for the template. Note that you may also drop the template on a node containing multiple zones. You will then be asked if you want to apply the template to all zones under the node.
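The Value Definition options and the Operator and Threshold fields described under Enabling the monitors for QoS and Alarm reduce to a few lines of arithmetic. The sketch below is an illustration, not the probe's code: the function names and option keys are hypothetical, only the two operators shown in the guide's examples (> and =) are included, and "delta per second" is assumed to mean the delta divided by the seconds elapsed between the two samples.

```python
import operator

# Only the operators that appear in the guide's examples.
OPERATORS = {">": operator.gt, "=": operator.eq}


def select_value(definition, current, previous, elapsed_seconds=1.0):
    """Compute the value used for alarming and QoS from two samples."""
    if definition == "current":
        return current
    if definition == "delta":
        return current - previous
    if definition == "delta_per_second":
        # Assumption: delta normalized by the time between the samples.
        return (current - previous) / elapsed_seconds
    if definition == "average":
        return (current + previous) / 2
    raise ValueError("unknown value definition: " + definition)


def breaches(value, op, threshold):
    """Return True when the value meets the alarm condition."""
    return OPERATORS[op](value, threshold)


value = select_value("average", current=120, previous=100)
print(value, breaches(value, ">", 90))  # 110.0 True
```

Choosing delta or delta per second is mainly useful for counters that only grow, where the raw current value says little about recent activity.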
Chapter 4: zones Metrics

The following table describes the checkpoint metrics that can be configured using the zones probe.

Monitor Name             Units              Description
QOS_ACT_MSG_QUEUE        Number             Number of Active Message Queues
QOS_ACT_SEMAPHORES       Number             Number of Active Semaphores
QOS_CPU_AVG_TIME         Percent            CPU Average Time Across All Processors
QOS_CPU_MAJOR_FAULT      Faults             CPU Major Faults per Processor
QOS_CPU_PCT_TIME         Percent            CPU Time per Processor
QOS_CPU_UTIL_PCT         Percent            CPU Utilization
QOS_DISC_CHR             Percent            Cache Hit Percentage
QOS_DISK_KBPS            Kilobytes/Second   Disk Throughput
QOS_DISK_OPSPS           Operations/Second  Disk I/O Operations per Second
QOS_DISK_SPACE_MB        Megabytes          Disk Space
QOS_DISK_SPACE_PCT       Percent            Disk Space (%)
QOS_LOAD_AVG             Load               Load Average
QOS_MEM_PAGE_SIZE_MB     Megabytes          Size of a Memory Page
QOS_MEM_PAGES_SCANNED    Pages              Pages scanned
QOS_MEM_SHARED           Number             Number of Active Shared Memory Segments
QOS_MEM_SHM_SIZE         Megabytes          Size of Active Shared Memory Segments
QOS_MEM_SIZE_MB          Megabytes          Total Virtual Memory Size for All Processes in the Zone
QOS_MEM_USAGE_MB         Megabytes          Memory Usage
QOS_MEM_UTIL_PCT         Percent            Percentage of Memory Currently being used by the zone
QOS_MEMORY_SWAP_MB       Megabytes          Swap Memory Usage
QOS_NETWORK_ERR_PCT      Percent            Network Error Rate
QOS_NETWORK_PKT_IO       Packets            Network Packet I/O
QOS_PROCS                Processes          Running Processes
QOS_RESOURCE_POOL        Number             Resource Pool
QOS_ZONE_STATE           State              Execution State

This section contains the following topics:
- Alert Metrics Default Settings

Alert Metrics Default Settings

This probe does not have any alert metric defaults set.
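When thresholds are planned for many checkpoints, it can help to group the metrics from the table in this chapter by unit, since metrics with the same unit often share a threshold style. The sketch below holds a subset of the table as a dictionary; the metric names and units are copied from the table, while the variable and function names are hypothetical.

```python
# A subset of the metrics table: monitor name -> unit.
METRIC_UNITS = {
    "QOS_CPU_UTIL_PCT": "Percent",
    "QOS_DISK_SPACE_PCT": "Percent",
    "QOS_MEM_UTIL_PCT": "Percent",
    "QOS_NETWORK_ERR_PCT": "Percent",
    "QOS_DISK_KBPS": "Kilobytes/Second",
    "QOS_PROCS": "Processes",
    "QOS_ZONE_STATE": "State",
}


def metrics_with_unit(unit):
    """Return the monitor names that report in the given unit, sorted by name."""
    return sorted(name for name, u in METRIC_UNITS.items() if u == unit)


print(metrics_with_unit("Percent"))
```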