Best Practices for IP Node Monitoring

Reachability and responsiveness

Overview and concepts

Task
Configure the IP Node Monitor to automatically and regularly sample the PingRTT and NetStatus attributes of critical IP nodes or hosts.

Why do it?
To check that your important IP nodes (other z/OS LPARs, routers, gateways, remote FTP and web servers) are reachable and responsive, and to be notified if they aren't.

How does it work?
PING from NetMaster for TCP/IP on z/OS to remote IP nodes.

Results
- Alerts when an IP node is unavailable, and/or its round-trip time is abnormal. See these on the Alert Monitor, or have them forwarded.
- Icons that can represent one, multiple, or all IP nodes, or IP nodes combined with other types of resources. See these on the Graphical Monitor.
- At-a-glance comparison of all monitored IP nodes, and stored PingRTT data (up to the last 10 weeks) for each node. See this on the IP Node Monitor.

Step 1 Delete less important nodes

When you first set up a region, Express Setup automatically discovers IP nodes. IP node discovery uses the primary IP stack: it starts at the addresses in that stack's routing tables and goes out for 2 or 3 hops. This usually finds a lot of IP nodes very close to the MVS host. While this is all CA NetMaster Network Management for TCP/IP can reasonably do, there will certainly be plenty of nodes here that do not need monitoring, and important nodes more hops out may be missing.

- Delete all IP nodes that have been discovered but don't need monitoring. (You can easily add some back later.)
- Decide what additional critical IP nodes you need to add.
- Don't monitor large numbers of non-critical nodes, such as ordinary end-user workstations, just because you can.

To delete an IP node from the IP Node Monitor
- Use /IPNODE (M.IN) and enter DEL next to the unwanted node name
or
- Use /IPMONG (A.IP.N) and delete the node from its group
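
The hop-limited discovery described in Step 1 can be pictured with a small sketch. This is not NetMaster's discovery code: the `discover` function, the `NEIGHBOURS` map, and the node names are hypothetical, and the sketch only illustrates why a 2- or 3-hop limit finds many nodes close to the host while missing important nodes further out.

```python
from collections import deque


def discover(neighbours, start_nodes, max_hops):
    """Breadth-first walk; return {node: hops} for nodes within max_hops."""
    hops = {node: 0 for node in start_nodes}
    queue = deque(start_nodes)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not look beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


# Hypothetical topology: the critical FTP server sits 3 hops out.
NEIGHBOURS = {
    "gateway-a": ["router-b", "workstation-1"],
    "router-b": ["router-c"],
    "router-c": ["ftp-server"],
}

found = discover(NEIGHBOURS, ["gateway-a"], max_hops=2)
print(sorted(found))
```

In this made-up topology a nearby workstation is discovered while the FTP server is not, which is the situation Step 1 asks you to correct by deleting the former and adding the latter.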

Step 2 Decide how to group your IP nodes

IP nodes are defined as being members of an IP Node Monitor Group. Each node in a group shares the same monitoring characteristics, including:
- How often the nodes are checked.
- What SNMP performance attributes are also monitored, as well as PINGRTT and NETSTATUS.
- What alerts are raised, and what alert actions are also done.

IP Node Groups can be set up for different device types (e.g. CISCO routers have specific SNMP attributes of their own), but the same device type can also have multiple different groups. Some groups may do more performance monitoring, more frequent checking, or different actions when things go wrong.

Supplied IP Node Monitor Groups

Standard
  PING & NETSTATUS every 10 minutes, alert if status is TIMEOUT or ERROR
LowLevel
  PING & NETSTATUS every 60 minutes, no alerting
CiscoMonIntens
  CiscoavgBusy5, CiscobuffNoMem, CiscoifInPkts, CiscoifOutPkts, IfInDiscards, IfInErrors, IfOutDiscards, IfOutErrors, PING; alert if status is TIMEOUT or ERROR
CiscoPerfIntens
  CiscoavgBusy5, IfInDiscards, IfOutDiscards, PING; alert if status is TIMEOUT or ERROR

Express Setup places all IP nodes it finds in the supplied group Standard. If this is not suitable, you can change them to belong to other groups. You could set up separate IP Node Monitor groups based on function, hardware type, importance, the geographical location of the hosts, and so on.

Step 3 Add IP Node Monitor Groups

To add an IP Node Monitor group
- Use /IPMONG (A.IP.N)
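
When planning groups on paper before defining them, it can help to model the idea that every node inherits its group's settings. The sketch below is only a planning aid under that assumption: `MonitorGroup`, `move_node`, and the node names are hypothetical and are not part of NetMaster. Only the two supplied groups whose intervals are stated above are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class MonitorGroup:
    """Monitoring characteristics shared by every node in the group."""
    name: str
    interval_minutes: int
    alert_statuses: tuple = ()      # NETSTATUS values that raise an alert
    nodes: list = field(default_factory=list)


GROUPS = {
    "Standard": MonitorGroup("Standard", 10, ("TIMEOUT", "ERROR")),
    "LowLevel": MonitorGroup("LowLevel", 60),
}


def move_node(groups, node, target):
    """Remove the node from any group it is in, then add it to target."""
    for group in groups.values():
        if node in group.nodes:
            group.nodes.remove(node)
    groups[target].nodes.append(node)


# Express Setup places discovered nodes in Standard; demote a minor one.
GROUPS["Standard"].nodes += ["core-router", "test-printer"]
move_node(GROUPS, "test-printer", "LowLevel")
print({name: group.nodes for name, group in GROUPS.items()})
```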

Step 4 Set alert conditions for IP nodes

Setting alert conditions on performance attributes is always optional, but is recommended for IP nodes.

The state of an IP node is judged by the maximum severity of any open alerts it may have. This state determines the color of the IP node on the IP Node Monitor, and the color of any icon(s) representing this node on the Graphical Monitor.

Status of IP Node                                                           Color of Line
status OK, no alerts                                                        Green
Unknown or NoAttr error status                                              White
Timeout, Error or SNMPerror status                                          Turquoise
outstanding severity 1 alert or serious error during data sampling          Red
outstanding severity 2 alert                                                Yellow
outstanding severity 3 alert or actual state is INACTIVE                    Pink
outstanding severity 4 alert                                                Blue

Alert conditions are set separately for each IP Node Monitor Group. The same attribute can have different alert conditions in different groups.

For each attribute, you can optionally define alert conditions. Numeric attributes such as PINGRTT have four alert conditions:
- Above and below threshold (absolute value)
- Above and below baseline (moving average)

Since IP nodes are sampled regularly and often at short intervals, HourOfDay is usually a suitable baseline type for IP nodes.

Each single alert condition can raise one of up to 5 different alerts, of varying severity and text, based on the threshold value or baseline %, e.g.:
- When PINGRTT of a node in FTPSERVERS group is > 5000, raise Severity 1 alert with text "This node is extremely slow"
- When PINGRTT of a node in FTPSERVERS group is > 100, raise Severity 4 alert with text "This node is slightly slow"

You can also define an automatic alert action, performed when the alert condition is triggered, e.g. run a TRACEROUTE command when a NETSTATUS value is TIMEOUT, or send an email or automatic trouble ticket when a PINGRTT value is excessive.
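
The tiered threshold example and the "maximum severity decides the color" rule can be sketched as follows. This is a minimal illustration, not NetMaster's implementation: the function and variable names are hypothetical, the tiers are the two FTPSERVERS examples above, and the severity-to-color mapping assumes the status and color columns of the table pair up in the order listed.

```python
# (threshold, severity, alert text) tiers for one "above threshold" condition.
FTPSERVERS_PINGRTT_ABOVE = [
    (5000, 1, "This node is extremely slow"),
    (100, 4, "This node is slightly slow"),
]

# Assumed pairing of alert severity to line color (severity 1 is most severe).
SEVERITY_COLOR = {1: "Red", 2: "Yellow", 3: "Pink", 4: "Blue"}


def evaluate_above(value, tiers):
    """Return (severity, text) for the highest threshold exceeded, or None."""
    for threshold, severity, text in sorted(tiers, reverse=True):
        if value > threshold:
            return severity, text
    return None


def node_color(open_severities):
    """Color a node by its most severe open alert; Green if none are open."""
    if not open_severities:
        return "Green"
    return SEVERITY_COLOR[min(open_severities)]


print(evaluate_above(6000, FTPSERVERS_PINGRTT_ABOVE))
print(evaluate_above(250, FTPSERVERS_PINGRTT_ABOVE))
print(node_color([4, 2]))
```

Checking the highest threshold first means a very slow node raises only the Severity 1 alert rather than both alerts, which matches the "one of up to 5 different alerts" wording above.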

Step 5 Add IP nodes

To add an IP node to be monitored
- Use /IPMON (M.IN) and then PF4=Add
or
- Use /IPMONG (A.IP.N) and add the node to a group

Use descriptive and meaningful IP node names. The IP node name is used only by the IP Node Monitor, and does not need to be the same as the IP address or IP host name. Choose IP node names that everyone will recognize. Express Setup often uses the IP address as the IP node name; you can rename these.

Firewall changes

If any monitored IP nodes are outside of a firewall, changes may be required to allow NetMaster for TCP/IP on z/OS to ping these nodes. For ping, ICMP echo requests and replies must be allowed to pass through.
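
If you keep a list of your IP node names outside the product, a few lines of Python can flag the names that are still bare IP addresses and so are candidates for renaming. This is a hedged sketch: the list of names is hypothetical, and nothing here reads from or writes to NetMaster.

```python
import ipaddress


def is_ip_literal(name):
    """True if the node name is just an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def names_to_rename(node_names):
    """Return the node names that are bare IP addresses."""
    return [name for name in node_names if is_ip_literal(name)]


# Hypothetical node names, using documentation address ranges.
candidates = names_to_rename(["192.0.2.10", "NYC-FTP-GATEWAY", "2001:db8::1"])
print(candidates)
```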

SNMP Network, Protocol and Interface Activity

Overview and concepts

Task
- Configure the IP Node Monitor to poll recommended SNMP attributes from critical remote (non-z/OS) FTP and application Linux, UNIX & Windows servers. The SNMP attributes are supported by the standard MIBs implemented on most servers: IF-MIB, IP-MIB, TCP-MIB and UDP-MIB.
- A co-requisite task is to configure the remote server to respond and send SNMP values back to NetMaster for TCP/IP.

Why do it?
To extend the standard reachable and responsive checking of these servers to include possible network congestion and/or hardware failures, which threaten their FTP transfers and other application activity.

How does it work?
SNMP GET from NetMaster on z/OS to remote IP host.

Results
- Alerts when network health or application activity indicators are abnormal.
- Graphical Monitor icons that can represent one, multiple, or all FTP/application servers, or servers combined with other types of resources.
- Stored performance data (up to the last 10 weeks) for each monitored SNMP attribute of each server.

Windows, UNIX and Linux servers can of course be monitored by distributed platform SNMP Manager products such as CA eHealth or CA Spectrum. Why monitor them with NetMaster for TCP/IP?

Answer 1: to see the results integrated with the normal NetMaster displays, alongside the z/OS IP network information. For example, an abnormal value of an SNMP attribute on a Windows XP Web Server can:
- be seen as an alert on the NetMaster Alert Monitor, and optionally actioned/forwarded
- have its historical values and baselines displayed, using the NetMaster IP Node Monitor
- turn the server's icon(s) red on the NetMaster Graphical Monitor

Answer 2: to be able to use NetMaster MIBInsight functions:
- Ability to add any SNMP attribute implemented on the server as a 'user defined' IP Node Monitor attribute, to supplement the supplied attributes.
- Ability to do an ad-hoc SNMP GET to retrieve the current value of any MIB attribute implemented on the server, not just the MIBs and attributes that you are monitoring.

Setup is generally straightforward. For instructions on how to set up polling of SNMP attributes on a remote server, and the recommended attributes to poll, see the Technical Document NetMaster for TCP/IP Performance Data Reference, available from supportconnect.ca.com
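
To confirm that a server's SNMP agent answers at all (the co-requisite task above), one option is an ad-hoc SNMP GET from a workstation with the Net-SNMP `snmpget` command. The sketch below only builds the command line and makes several assumptions: Net-SNMP is installed, the agent accepts SNMP v2c, and the host name, community string, and interface index are placeholders. It checks the path from wherever it is run, not from z/OS, and it is not the recommended attribute list from the Performance Data Reference.

```python
# Standard IF-MIB ifTable column OIDs for four interface counters.
IF_MIB_OIDS = {
    "ifInDiscards": "1.3.6.1.2.1.2.2.1.13",
    "ifInErrors": "1.3.6.1.2.1.2.2.1.14",
    "ifOutDiscards": "1.3.6.1.2.1.2.2.1.19",
    "ifOutErrors": "1.3.6.1.2.1.2.2.1.20",
}


def build_snmpget_command(host, community, if_index):
    """Build an argv list for Net-SNMP's snmpget (SNMP v2c assumed)."""
    oids = [f"{oid}.{if_index}" for oid in IF_MIB_OIDS.values()]
    return ["snmpget", "-v", "2c", "-c", community, host, *oids]


# Placeholder host and community string; replace with your own values.
cmd = build_snmpget_command("server.example.com", "public", 1)
print(" ".join(cmd))

# To actually run it (requires Net-SNMP and network access):
#   import subprocess
#   subprocess.run(cmd, check=True)
```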