opensm2 Enterprise Performance Monitoring December 2010
Agenda
- opensm2 Overview
- INSPECTOR
- ANALYZER
opensm2
- Provides a consistent solution for enterprise-wide performance management
- Supports the server systems Microsoft Windows, Linux, Solaris, VMware ESX Server, Xen and BS2000/OSD, the storage systems Symmetrix, CLARiiON and FibreCAT, and all SNMP-enabled systems
- Offers comprehensive online monitoring with configurable alarm management
- Enables detailed bottleneck analyses as well as forecasts of future performance requirements
- Gives you the capability to optimize the performance of your systems
Architecture
opensm2 comprises the components
- INSPECTOR for simultaneous online monitoring of several systems
- ANALYZER for the offline analysis of monitoring files
Monitored data collection
- opensm2 collects a large variety of monitored data from
  - Windows systems via the WMI interface
  - Linux and Xen systems with SSH
  - Solaris systems via different system interfaces (kstat, /proc, statvfs)
  - VMware ESX Server systems via the VI web service
  - BS2000/OSD systems with the SM2 monitor
  - Storage systems via StorMan
  - Other systems via SNMP
- The monitored data can be collected remotely ("agentless") from all systems except Solaris and BS2000/OSD
- The monitored data is available online and can be saved to a monitoring file for subsequent analysis
Windows monitored data
Windows systems via the WMI interface ("agentless")
opensm2 collects measurement variables from the following areas:
- Utilization of processors in user and privileged mode
- Utilization of logical partitions of disks
- Utilization of physical disks
- Data about the communication of the server with the network
- Length of server work queues
- Utilization of file system cache
- Utilization of physical and virtual memory
- Number of processes, threads, events, semaphores and mutexes
- Utilization of the paging file
- Number of system calls
- Number of inputs and outputs
- Process- and workload-specific data
- Accounting and processor utilization data of job objects
- Information about active processes of job objects
- Information about the print queue
- Transfer rates of TCP/IP network connections
- Information about Microsoft applications (IIS, Exchange, SQL Server)
Linux monitored data
opensm2 collects measurement variables from the following areas:
- CPU utilization in user and system mode and I/O waiting time
- Number of processes and context switches
- Number of waiting processes
- Number of interrupts
- Memory utilization
- Number of inputs and outputs
- Transfer rates of devices
- Page-in and page-out activities
- Swap-in and swap-out activities
- Free memory pages and disk blocks in the swap area
- Network data
- Process- and workload-specific data (e.g. CPU and memory utilization, I/Os, page fault rate, system calls)
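The slides do not say which Linux source opensm2 reads for CPU utilization. As a minimal sketch, assuming the counters come from the aggregate "cpu" line of /proc/stat (the function names below are illustrative, not part of opensm2), user, system and I/O wait percentages can be derived from the difference between two samples:

```python
# Field order of the aggregate "cpu" line in /proc/stat (values in clock ticks).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(x) for x in parts[1:1 + len(FIELDS)]]
    # Older kernels report fewer fields; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def utilization(before, after):
    """Return user, system and iowait shares (percent) between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {"user": 0.0, "system": 0.0, "iowait": 0.0}
    return {k: 100.0 * delta[k] / total for k in ("user", "system", "iowait")}


# Two hypothetical samples taken one monitoring cycle apart.
first = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
second = parse_cpu_line("cpu 150 0 70 820 60 0 0 0")
print(utilization(first, second))
```

In an agentless setup such as the one described for Linux, the two lines would be fetched over SSH at the start and end of a monitoring cycle; that transport is left out here.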
Solaris monitored data
opensm2 collects measurement variables from the following areas:
- CPU utilization in user and system mode and I/O waiting time
- Activities of block-oriented devices
- Utilization of system buffers in main memory
- Number of system calls
- Page-in and page-out activities
- Memory allocation by the system kernel
- Number of locks and the resulting waiting times
- Message and semaphore activities
- Length and utilization of waiting queues
- Swap-in and swap-out activities
- Free memory pages and disk blocks in the swap area
- Number of entries and size of system tables
- Utilization of file systems
- Process- and workload-specific data (e.g. CPU and memory utilization, I/Os, page fault rate, system calls)
- Network data
VMware ESX Server monitored data
opensm2 collects measurement variables of the entire server and the individual virtual machines from the following areas:
- Number of virtual machines
- Number of real and virtual processors
- Utilization of processors in total and by virtual machine
- Memory usage by VMkernel and virtual machines
- Size and usage of swap area
- Import and export activities (swapping)
- Accesses to and data transfer rates of disks
- Data transfer rates of network
Xen monitored data
opensm2 collects measurement variables of the entire server and the individual virtual machines from the following areas:
- Number of virtual machines
- Number of real and virtual processors
- Utilization of processors in total and by virtual machine
- Memory usage by virtual machines
- Data transfer rates of virtual block devices
- Data transfer rates of network
BS2000/OSD monitored data
opensm2 collects measurement variables from the following areas:
- CPU utilization
- Number of inputs and outputs
- Memory and address space utilization
- Channel utilization
- Disk utilization
- Number of tasks per task type
- Category- and task-specific resource consumption
- File access frequency
- Catalog access frequency
- Service times of devices
- Utilization of pubsets
- UTM applications
- BCAM connections
- ISAM pools
- DAB caches
- File migrations by HSMS
- Host communication with MSCF
- Partitions and emulated volumes of GS
- TCP/IP connections
- Hypervisor and guest system activities in VM2000 systems
- Symmetrix systems
- UDS/SQL and SESAM/SQL
Storage systems monitored data
opensm2 collects measurement variables from the following areas:
- Access rates (read/write)
- Data transfer rates (read/write)
- I/O response times (read/write)
SNMP-enabled systems
opensm2 collects measurement variables from the MIB-2 groups ip, tcp and udp
Monitoring of applications
- Applications can be monitored by means of workload classes
- Definition of workload classes
  - Processes of a selected application can be assigned to a workload class
    WORKLOAD_CLASS=<name>,UID=<uid-list>,CMD=<cmd-list>
- Workload class-specific data, e.g.
  - CPU consumption
  - I/Os
  - Memory utilization
  - Page fault rate
  - System calls
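The slide gives only the shape of a workload class definition (a name, a UID list and a command list). The following is a minimal sketch of how such a class could assign processes and accumulate class-specific CPU consumption. The matching semantics are assumptions, not documented opensm2 behavior: an empty list matches everything, both lists must match, and the first matching class wins. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadClass:
    """Illustrative model of WORKLOAD_CLASS=<name>,UID=<uid-list>,CMD=<cmd-list>."""
    name: str
    uids: frozenset = frozenset()
    cmds: frozenset = frozenset()

    def matches(self, uid, cmd):
        # Assumption: an empty list places no restriction on that attribute.
        uid_ok = not self.uids or uid in self.uids
        cmd_ok = not self.cmds or cmd in self.cmds
        return uid_ok and cmd_ok


def cpu_per_class(processes, classes):
    """Sum CPU seconds per workload class; the first matching class wins."""
    totals = {wc.name: 0.0 for wc in classes}
    for proc in processes:
        for wc in classes:
            if wc.matches(proc["uid"], proc["cmd"]):
                totals[wc.name] += proc["cpu_s"]
                break
    return totals


classes = [
    WorkloadClass("database", uids=frozenset({"oracle"})),
    WorkloadClass("web", cmds=frozenset({"httpd", "nginx"})),
]
processes = [
    {"uid": "oracle", "cmd": "ora_pmon", "cpu_s": 10.0},
    {"uid": "oracle", "cmd": "ora_dbw0", "cpu_s": 2.5},
    {"uid": "www", "cmd": "httpd", "cpu_s": 1.0},
    {"uid": "www", "cmd": "nginx", "cpu_s": 2.0},
    {"uid": "root", "cmd": "cron", "cpu_s": 0.5},  # belongs to no class
]
print(cpu_per_class(processes, classes))
```

The same accumulation would apply to the other class-specific values on the slide (I/Os, memory utilization, page fault rate, system calls).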
Process hit lists
- Generation of a list showing the processes with the highest CPU consumption for each monitoring cycle
- Identification of the processes that are involved in load peaks
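A hit list of this kind is a per-cycle top-N selection. The sketch below assumes the samples are already available as (process name, CPU percent) pairs grouped by monitoring cycle; the data layout and the function name are illustrative and not taken from opensm2.

```python
import heapq


def hit_list(samples, top_n=3):
    """For each monitoring cycle, return the top_n processes by CPU percent.

    samples maps a cycle identifier to a list of (process_name, cpu_percent).
    """
    return {
        cycle: heapq.nlargest(top_n, procs, key=lambda p: p[1])
        for cycle, procs in samples.items()
    }


samples = {
    "10:00": [("backup", 5.0), ("dbwriter", 50.0), ("httpd", 20.0)],
    "10:01": [("backup", 70.0), ("dbwriter", 15.0), ("httpd", 10.0)],
}
for cycle, top in hit_list(samples, top_n=2).items():
    print(cycle, top)
```

Comparing the lists across cycles shows which processes appear during a load peak, here the hypothetical "backup" process at 10:01.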
INSPECTOR Survey of functions
- Simultaneous online monitoring of several systems
- Presentation of monitored data in pre-defined reports with a graphical or tabular layout
- Definition of new reports and linking of variables by means of formulas
- Monitoring of performance data by means of user-defined rules
- Flexibly configurable, automatic reactions to exceptional situations
  - triggering an audible alarm
  - starting a procedure
  - notification by e-mail or SMS
  - sending an SNMP trap*
- Subagent for SNMP management*
* BS2000/OSD and Solaris
INSPECTOR Monitoring screen
INSPECTOR Rule-based Monitoring
Rule:
IF (lower limit < value(meas. variable 1) ≤ upper limit)
:
AND (lower limit < value(meas. variable n) ≤ upper limit)
THEN execute actions
[Diagram: monitoring data is checked against the rule; a matching rule triggers the actions SMS, e-mail, procedure and SNMP]
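The rule scheme above is a conjunction of range checks followed by a list of actions. The sketch below is a minimal model of that logic, not INSPECTOR's implementation. The class names are hypothetical, and because the comparison against the upper limit is not legible in the slide text, it is assumed here to be inclusive.

```python
from dataclasses import dataclass, field


@dataclass
class Condition:
    variable: str
    lower: float
    upper: float

    def holds(self, values):
        # Assumption: lower limit exclusive, upper limit inclusive.
        return self.lower < values[self.variable] <= self.upper


@dataclass
class Rule:
    conditions: list
    actions: list = field(default_factory=list)  # callables taking the values dict

    def evaluate(self, values):
        """Run all actions if every condition holds; report whether the rule fired."""
        if all(c.holds(values) for c in self.conditions):
            for action in self.actions:
                action(values)
            return True
        return False


fired = []
rule = Rule(
    conditions=[Condition("cpu_percent", 90, 100), Condition("run_queue", 4, 1000)],
    actions=[lambda v: fired.append(("mail", v["cpu_percent"]))],
)
print(rule.evaluate({"cpu_percent": 95, "run_queue": 8}))  # both conditions hold
print(rule.evaluate({"cpu_percent": 95, "run_queue": 2}))  # second condition fails
print(fired)
```

In a real deployment the action callables would stand in for the reactions named on the slides, such as sending an e-mail or SMS, starting a procedure, or sending an SNMP trap.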
ANALYZER Survey of functions
- Easy-to-use graphical interface for the analysis of monitoring files
- Generation of meaningful charts from the performance data
- Support of bottleneck and trend analyses
- Identification of the processes involved in high system load
- Identification of overloaded resources by automatic production of hit lists of the monitored objects with the highest values
- Automation of analyses by using macros
- Selection of different statistics (mean value, frequencies, minimum, maximum, quantile)
- Simple transfer of analysis results into Microsoft programs (Excel, Word, PowerPoint) and the Intranet/Internet
- Storage of compressed performance data in long-term files
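To illustrate the statistics named above (mean value, frequencies, minimum, maximum, quantile), here is a small sketch using Python's standard library on a hypothetical series of measured values. It is not ANALYZER's computation; in particular, the quantile method is the default of `statistics.quantiles`, and the bin width for the frequency count is an arbitrary choice.

```python
import statistics
from collections import Counter


def summarize(values, n=4):
    """Mean, minimum, maximum and the n-quantile cut points of a series."""
    return {
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
        "quantiles": statistics.quantiles(values, n=n),
    }


def frequencies(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower bound."""
    return dict(Counter(int(v // bin_width) * bin_width for v in values))


cpu_percent = [1, 2, 3, 4, 5, 6, 7]  # hypothetical measured values
print(summarize(cpu_percent))
print(frequencies(cpu_percent, bin_width=5))
```

With `n=4` the cut points are the quartiles; `n=100` would give percentiles, which are more typical for judging load peaks.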
ANALYZER Analysis results
opensm2: an easy-to-use, economical solution, tailored to customer requirements, for monitoring server and storage systems