Simplifying Systems Management for Computer Associates (CA) NSM r11.1 and r11.2 for Dell Hardware

A Dell Technical White Paper

By Aruna Jayaprakash and Rakhee Joseph
Dell Product Group
September 2009
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

© 2009 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.

Dell, the DELL logo, OpenManage, PowerEdge, and PowerVault are trademarks of Dell Inc. Microsoft, Windows, and Windows Server are registered trademarks of Microsoft in the United States and/or other countries. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell disclaims proprietary interest in the marks and names of others.
Table of Contents

INTRODUCTION
OVERVIEW
    Intended Audience
CA NSM OVERVIEW
DELL OPENMANAGE CONNECTION OVERVIEW
    Support Matrix
WHAT CONNECTION VERSION 3.3 OFFERS?
    Install/Uninstall Connection
    Discovery and Grouping
    Health Monitoring
TASKS
    Application Launch
    Acknowledge
    View Node
    Traps
SUMMARY
REFERENCES AND LINKS
Introduction

A heterogeneous data center environment creates challenges for organizations seeking to streamline operations, control costs, and reduce operational complexity. Dell is focused on delivering open, flexible, and integrated solutions that help our customers increase productivity, save time and costs, and reduce the complexity of managing disparate IT assets. The OpenManage Connection offering from Dell simplifies management of Dell hardware for CA Network and Systems Management administrators, and is a significant step towards making Dell the easiest platform to manage.

Overview

The Dell OpenManage Connection (hereinafter referred to as the Connection) for CA Network and Systems Management (NSM) is designed to simplify monitoring and management of Dell hardware by integrating Dell systems management data and applications into CA NSM. Deploying the Connection enables administrators to monitor the health of Dell systems and agents by collecting information from widely dispersed enterprise networks and viewing it in real time through a single console. The Connection also enables administrators to launch and use Dell systems management applications, such as Dell OpenManage Server Administrator (OMSA), Dell OpenManage Server Administrator Storage Management (OMSS), and the Dell Remote Access Controller (DRAC) console. The Connection reformats events from Dell devices and enables administrators to view all the events in the enterprise management console of CA NSM.

The latest version of the Connection is 3.3. It supports CA NSM r11.1 and CA NSM r11.2, and is available as a free download from http://support.us.dell.com/support/downloads/format.aspx?releaseid=r233660

Intended Audience

This document is intended for administrators who discover and monitor Dell hardware using CA NSM r11.1 and r11.2. It is assumed that readers are familiar with CA Network and Systems Management and with the Dell management applications OMSA, OMSS, and DRAC.
CA NSM Overview

CA NSM provides a centralized and unified platform for the management of heterogeneous IT environments. It enables organizations to deploy and optimize a complex, secure, and reliable infrastructure that supports business objectives. The power of CA NSM comes from its flexible approach to systems management, enhanced through broad integration with other products in the CA portfolio, as well as with third-party applications.

NSM provides two main user interfaces: the Management Command Center (MCC) and the 2D Map. The MCC is the primary interface for managing the entire enterprise; it integrates all Unicenter enterprise and network monitoring functionality into a single console. The 2D Map is a two-dimensional, geographical representation of the structure of the enterprise.
For details on the architecture of CA NSM, its components, and the user interfaces supported in CA NSM r11.1 and CA NSM r11.2, refer to the CA NSM documentation at https://support.ca.com

Dell OpenManage Connection Overview

The Connection integrates Dell hardware discovery and monitoring with CA NSM by using CA NSM integration interfaces. There are three integration components:

    WorldView Connection component
    DSM Connection component
    Event Management Connection component

These components integrate with the WorldView, Agent Technology, and Event Management components of CA NSM. Integration is achieved using the World View Configuration files (*.wvc), Policy files (*.atp, *.atph), MIB files, Message Record and Actions (MRA), and others. Table 1 below lists some of the files used for integration.

Table 1: Integration Files

File Type                         File Name
Policy Files                      DellStorageManager.atph
                                  DellRemoteAccess.atph
                                  DellServerAdmin.atp
                                  DellOOBDevice.atp
                                  DellPetEvents.atp
World View Configuration Files    DellServerAdmin.wvc
                                  DellStorageManager.wvc
                                  DellRemoteAccess.wvc
                                  DellOOBDevice.wvc
                                  DellAgentPollScope.wvc
MIB Files, MRA                    ServerAdministrator.txt
                                  dcstorage.txt
                                  rac_host.txt
                                  RACevents.txt

Support Matrix

Table 2 below lists the Dell devices and agents supported in Dell OpenManage Connection v3.3 for CA NSM r11.1 and r11.2.
Table 2: Supported Devices and Agents

Supported Device/Agent              Version
PowerEdge/PowerVault Servers        Generation x8xx, x9xx, xx0x and xx1x
Dell OpenManage (OM)                5.3-6.1
Out-of-band iDRAC6 (Monolithic)     1.10
Out-of-band DRAC4                   All firmware up to OM 6.1
Out-of-band DRAC5                   1.48
In-band RAC (DRAC 5, DRAC 4)        All firmware versions up to OM 6.1
Out-of-band DRAC/MC                 All firmware versions up to OM 6.1
CMC                                 2.0

What Connection version 3.3 offers?

Install/Uninstall Connection

In a solution configuration where all the NSM components, such as MDB, DSM, EM Console, and WorldView, are installed on a single server, the Connection installer provides an option to select and install all the Connection components. In a solution configuration where the NSM components are distributed, the Connection installer provides an option to select and install only the relevant components. For example, on a system where the DSM and EM components of NSM are already installed, the Connection installer provides an option to select both the DSM and EM Connection components.

In a distributed environment, install the Connection first on the system where the WorldView (WV) component of NSM is installed. Then install the Connection on the system(s) where the DSM and EM components are installed. During uninstallation, first uninstall the Connection from the system where the WV component is installed, and then uninstall the DSM and EM components.

Discovery and Grouping

Administrators can discover Dell devices using classic discovery, continuous discovery, or manual discovery with the command:

    dscvrbe -7 <IP address>

For discovery and monitoring of Dell servers using the Connection, ensure that OMSA is installed on the managed systems. The DSM component of CA NSM uses the Dell-defined DSM policy to discover and auto-group the Dell device agents. The device agents that are monitored on a Dell server are OMSA, OMSS, and in-band
DRAC, and the resource monitored is their global status. Table 3 below lists the information on the Dell devices and their agents.

Table 3: Device and Agent Monitoring Information

Dell Device                        Dell Agents          Resource        Instance
PowerEdge/PowerVault Server        OMSA                 Global Status   SystemStateGlobalSytemStatus
                                   OMSS                 Global Status   Server Administrator (Storage Management)
                                   Dell Remote Access   Global Status   Dell Remote Access Controller 4/P
                                   Dell Remote Access   Global Status   Dell Remote Access Controller 5
Out-of-Band DRAC 4/I               Dell RAC             Global Status   Dell Remote Access Controller 4/I
Out-of-Band DRAC 4/P               Dell RAC             Global Status   Dell Remote Access Controller 4/P
Out-of-Band DRAC 5                 Dell RAC             Global Status   Dell Remote Access Controller 5
Out-of-Band DRAC/MC                Dell RAC             Global Status   Dell Remote Access Controller / Modular Chassis
Out-of-Band iDRAC6 (Monolithic)    Dell RAC             Global Status   Dell Remote Access Controller 6 Integrated
CMC                                Dell CMC             Global Status   Chassis Management Controller

When the Connection discovers any Dell device, it creates a Business Process View (BPV) called Dell Managed Systems. It classifies the discovered Dell devices into three groups under the Dell Managed Systems BPV:

    DellOOB RAC
    Modular Systems
    Monolithic Systems

DRAC devices (DRAC4, DRAC5, and Monolithic iDRAC6) that are discovered out-of-band are classified under the DellOOB RAC group. Under the Modular Systems group, an entry is created for each chassis, named with its chassis service tag, and it contains the modular systems of that chassis. DRAC/MC and Dell CMC are listed in this group under the service tag of the chassis to which they belong. All discovered Dell monolithic systems are grouped under the Monolithic Systems group. Figure 1 is a screenshot of the topology view available in the Management Command Center that lists the three groups.
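The grouping rules described above can be sketched in a few lines of Python. This is only an illustration of the classification logic as this paper describes it, not the Connection's actual DSM policy. The `Device` record, its field names, and the `kind` strings are hypothetical names introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical kind labels, loosely named after the device types in Table 3.
OOB_RAC_KINDS = {"DRAC4", "DRAC5", "iDRAC6-Monolithic"}
CHASSIS_MEMBER_KINDS = {"DRAC/MC", "CMC", "ModularServer"}


@dataclass
class Device:
    name: str
    kind: str
    # Only meaningful for devices that belong to a modular chassis.
    chassis_service_tag: Optional[str] = None


def group_devices(devices):
    """Arrange devices under the three groups of the Dell Managed Systems BPV."""
    bpv = {
        "DellOOB RAC": [],
        "Modular Systems": defaultdict(list),  # keyed by chassis service tag
        "Monolithic Systems": [],
    }
    for dev in devices:
        if dev.kind in OOB_RAC_KINDS:
            bpv["DellOOB RAC"].append(dev.name)
        elif dev.kind in CHASSIS_MEMBER_KINDS:
            bpv["Modular Systems"][dev.chassis_service_tag].append(dev.name)
        else:
            # Assumption of this sketch: anything else is a monolithic system.
            bpv["Monolithic Systems"].append(dev.name)
    return bpv
```

Keying the Modular Systems group by chassis service tag mirrors the description above, where DRAC/MC, CMC, and the modular systems all appear under the chassis they belong to.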
Figure 1: Grouping - Tree view of grouping in MCC

Health Monitoring

The Dell-defined DSM policy files allow the DSM component of NSM to monitor the Dell agents by polling their status at regular intervals and by receiving their traps. The traps are interpreted as defined in the Dell policy, and the status change is displayed in the MCC, Node View, and 2D Map. The DSM proactively polls each agent, and any status change is propagated to the MCC, Node View, and 2D Map. The DSM Discovery Pollset Values include poll set values for the Dell agent classes DellOOBDevice (for out-of-band DRAC) and DellServerAdmin (for all agents of the Dell server). Figure 2 displays an example of a health status for a monolithic server.
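The polling behavior described above, where a status change found during a poll is passed on to the views, can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not how the DSM is implemented: `poll_once`, the `read_status` callback, and the `last_seen` dictionary are hypothetical names, and the status strings simply reuse the Normal/Warning/Critical labels used in this paper.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# One detected change: (agent name, previous status, new status).
Change = Tuple[str, Optional[str], str]


def poll_once(
    agents: Iterable[str],
    read_status: Callable[[str], str],
    last_seen: Dict[str, str],
) -> List[Change]:
    """Poll every agent once and return only the status changes.

    `read_status` stands in for whatever query returns an agent's
    global status; `last_seen` remembers the result of the previous poll.
    """
    changes: List[Change] = []
    for agent in agents:
        new_status = read_status(agent)
        old_status = last_seen.get(agent)
        if new_status != old_status:
            changes.append((agent, old_status, new_status))
            last_seen[agent] = new_status
    return changes
```

In this model, a caller would run `poll_once` on a timer and forward each returned change to the views that display agent status. Polls that find no change return an empty list.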
Figure 2: Health Status - Critical status of OMSA in a Dell Monolithic Server

Tasks

The WorldView component of the Connection defines when and how the systems management applications and tasks such as Acknowledge, View Node, and Object View are launched from the context menu for Dell devices.

Application Launch

Application launch is available for the OMSA, OMSS, and DRAC (out-of-band and in-band) agents from the MCC, 2D Map, and Node View. Figure 3 displays an example of the application launch window.
Figure 3: Launch of DRAC Console as a menu from Dell RAC agent of OOB DRAC

Acknowledge

Acknowledge is a feature provided by CA NSM that the Connection adds as a menu item. It is available in the Node View, and in the MCC for the Global Status node and leaf node of all Dell agents. When the user acknowledges a node that is in a warning or critical status, the color and status of the node change in all the views to reflect that it is now in an acknowledged state. Figure 4 displays an example of the Acknowledge menu item for an OMSA agent.
Figure 4: Acknowledge Menu item for OMSA

View Node

The Connection adds View Node as a menu item that launches the Node View. The Node View is a DSM utility used by an operator to navigate down to a problem in the environment once it has been surfaced by one of the MCC BPVs. The Node View helps to visualize the environment from the node level down to leaf objects, or monitored resources and instances of those resources. The Node View also provides the functionality to acknowledge alerts, set resources to an unmanaged state, and view state change events on a resource-by-resource basis.
In the MCC, the Node View can be launched from the Dell agents, the Global Status node, and its leaf node; in the 2D Map, it can be launched from the agents. Figure 5 displays an example of the Node View launched from an OMSS agent.

Figure 5: Node View Launched from OMSS agent of Dell Monolithic Server

Traps

The Connection formats the OMSA, OMSS, DRAC, and Dell PET traps. These traps are displayed in the Enterprise Management Console (EMC) with the associated severity (Normal/Warning/Critical), and a change in status is propagated to the corresponding object in the MCC, 2D Map, and Node View. All critical traps are displayed in red in the EMC. Because Dell PET alerts come directly from the hardware and not through an agent, a status change is not propagated to the corresponding system node in the MCC and 2D Map. Table 4 lists examples of formatted OMSA, OMSS, DRAC, and PET traps. Figure 6 displays an example of the CA NSM EMC with traps generated.

Table 4: Formatted Traps

Scenario: The Server Administrator sends this message to the CA NSM Enterprise Management Console as a result of a system board fan threshold change from warning to normal.
Format:   [nodeclass, Operating System, previous state, Current state, event Message text, eventid]
Example:  Host:Windows2000_Server Windows2000_Server ServerAdministrator Trap Agent:ServerAdministrator Warning Up Fan sensor returned to a normal value Sensor location: ESM MB Fan1 RPM Chassis location: Main System Chassis Previous state was: Non-Critical (Warning) Fan sensor value (in RPM): 4740 Dell Event ID: 1102

Scenario: The OMSS agent sends this message to the CA NSM Enterprise Management Console when a Virtual Disk is created.
Format:   [nodeclass, Operating System, previous state, Current state, event Message text, eventid]
Example:  Host:WindowsServer_2008 WindowsServer_2008 DellStorageManager Trap Agent:DellStorageManager Unknown Unknown Virtual disk created Dell Event ID: 1201

Scenario: PET traps are sent to the CA NSM Enterprise Management Console when the System Temperature goes to Warning.
Format:   Dell:BMC BMC PET Trap Agent:BMC Unknown <SEVERITY> <TRAP DESCRIPTION> Dell Event ID:<TRAPID#> serverhostname: <serverhostname>
Example:  Dell:BMC BMC PET Trap Agent:BMC Unknown Warning Under-Temperature Warning (Lower non-critical, going low) Dell Event ID:65792 serverhostname:m710

Scenario: DellOOB devices such as DRAC4, DRAC5, DRAC/MC, and iDRAC send this message to the CA NSM Enterprise Management Console.
Format:   [nodeclass, Dell OOB, Previous state, Current state, event Message text, eventid]
Example:  OtherDevices:DellOOB Dell OOB DellOOBDevice Trap Agent:DellOOBDevice Unknown Unknown RAC Authentication failures during a time period have exceeded a threshold Dell Event ID: 1002

Figure 6: Example of CA NSM EMC with traps generated from Dell devices
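Because the formatted traps in Table 4 share a few recognizable markers (`Trap Agent:` and `Dell Event ID:`), an administrator who exports such messages could pull out the key fields with a short script. The Python sketch below is one hedged way to do that. The `parse_trap` function and its output keys are hypothetical, and the whitespace-separated layout is inferred only from the examples in Table 4, so messages in a real console may be laid out differently.

```python
import re

# Patterns inferred from the Table 4 examples only.
EVENT_ID_RE = re.compile(r"Dell Event ID:\s*(\d+)")
AGENT_RE = re.compile(r"Trap Agent:(\S+)")


def parse_trap(message: str) -> dict:
    """Extract the node class, agent name, and Dell Event ID from a trap message.

    Fields that cannot be found are returned as None.
    """
    parts = message.split()
    event = EVENT_ID_RE.search(message)
    agent = AGENT_RE.search(message)
    return {
        # The first whitespace-separated token, e.g. "Host:WindowsServer_2008".
        "node_class": parts[0] if parts else None,
        "agent": agent.group(1) if agent else None,
        "event_id": int(event.group(1)) if event else None,
    }
```

For instance, passing the OMSS example from Table 4 to `parse_trap` yields the agent `DellStorageManager` and the event ID 1201. The `\s*` in the event ID pattern covers both the `Dell Event ID: 1201` and `Dell Event ID:65792` spellings seen in the table.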
Summary

OM Connection can help administrators increase their efficiency and flexibility, particularly in large heterogeneous environments. This document provides information on using the Connection to help administrators easily and effectively discover and monitor Dell devices. Connection 3.3 introduces features that enhance the user experience, such as an updated installer, easier management through automated grouping of devices, support for Dell out-of-band devices, and improved performance through the consolidation of several policy files and a reduced number of trap listeners per host.

References and Links

1. Dell OpenManage Connection Documentation: http://support.dell.com/support/edocs/software/smconect/ca_connections/ca_3.3/index.htm
2. Dell OpenManage Documentation: http://support.dell.com/support/edocs/software/smsom/
3. Dell OpenManage Server Administrator Documentation: http://support.dell.com/support/edocs/software/svradmin/
4. Dell OpenManage Server Administrator Storage Management Documentation: http://support.dell.com/support/edocs/software/svradmin/
5. Dell Remote Access Controllers: http://support.dell.com/support/edocs/software/smdrac3/
6. CA NSM documentation: http://support.ca.com
Authors:

Aruna Jayaprakash is a Software Validation Lead Engineer in Dell Product Group Business, Software Validation in Bangalore and has worked at Dell for over five years, specializing in Enterprise Systems Management. She has a Master's degree in Software Engineering from Birla Institute of Technology & Science, Pilani, India.

Rakhee Joseph is a Software Validation Engineer in Dell Enterprise Software Validation. She has a Bachelor's degree in Computer Science and Engineering from Mahatma Gandhi University, Kerala, India and has worked at Dell for over three years.