DIAGNOSING PERFORMANCE ISSUES WITH PROSPHERE: AN INTRODUCTION TO USE CASES AND ARCHITECTURE

Similar documents
Lab Validation Report

EMC Data Protection Advisor 6.0

PROSPHERE: DEPLOYMENT IN A VITUALIZED ENVIRONMENT

vrealize Operations Manager Customization and Administration Guide

VCS Monitoring and Troubleshooting Using Brocade Network Advisor

IBM Tivoli Storage Productivity Center (TPC)

CA Virtual Assurance/ Systems Performance for IM r12 DACHSUG 2011

EMC Virtual Infrastructure for SAP Enabled by EMC Symmetrix with Auto-provisioning Groups, Symmetrix Management Console, and VMware vcenter Converter

can you improve service quality and availability while optimizing operations on VCE Vblock Systems?

EMC AVAMAR INTEGRATION WITH EMC DATA DOMAIN SYSTEMS

Aternity Desktop and Application Virtualization Monitoring. Complete Visibility Ensures Successful Outcomes

Greenplum Database (software-only environments): Greenplum Database (4.0 and higher supported, or higher recommended)

VMware Site Recovery Manager with EMC RecoverPoint

Dell SupportAssist Version 2.0 for Dell OpenManage Essentials Quick Start Guide

EMC DATA PROTECTION ADVISOR

Symantec Virtual Machine Management 7.1 User Guide

Proactive Performance Management for Enterprise Databases

EMC VFCACHE ACCELERATES ORACLE

EMC PERFORMANCE OPTIMIZATION FOR MICROSOFT FAST SEARCH SERVER 2010 FOR SHAREPOINT

Using VMware VMotion with Oracle Database and EMC CLARiiON Storage Systems

IMPROVING VMWARE DISASTER RECOVERY WITH EMC RECOVERPOINT Applied Technology

IBM Tivoli Storage Productivity Center

Server & Application Monitor

Simplified Management With Hitachi Command Suite. By Hitachi Data Systems

Spotlight on Messaging. Evaluator s Guide

VBLOCK SOLUTION FOR SAP APPLICATION SERVER ELASTICITY

AX4 5 Series Software Overview

are you helping your customers achieve their expectations for IT based service quality and availability?

eg Enterprise v5.2 Clariion SAN storage system eg Enterprise v5.6

EMC Documentum Interactive Delivery Services Accelerated: Step-by-Step Setup Guide

Configuring Load Balancing for EMC ViPR SRM

Using VMWare VAAI for storage integration with Infortrend EonStor DS G7i

VBLOCK SOLUTION FOR SAP: SIMPLIFIED PROVISIONING FOR OPERATIONAL EFFICIENCY

Heroix Longitude Quick Start Guide V7.1

EMC VIPR SRM: VAPP BACKUP AND RESTORE USING EMC NETWORKER

Using VMware ESX Server with IBM System Storage SAN Volume Controller ESX Server 3.0.2

BROCADE PERFORMANCE MANAGEMENT SOLUTIONS

EMC BACKUP-AS-A-SERVICE

EMC ViPR SRM. Alerting Guide. Version

EMC Data Domain Management Center

APPLICATION MANAGEMENT SUITE FOR SIEBEL APPLICATIONS

CONFIGURATION BEST PRACTICES FOR MICROSOFT SQL SERVER AND EMC SYMMETRIX VMAXe

Application Discovery Manager User s Guide vcenter Application Discovery Manager 6.2.1

EMC Smarts Integration Guide

Dell Enterprise Reporter 2.5. Configuration Manager User Guide

Monitoring IBM HMC Server. eg Enterprise v6

What s New in VMware vsphere 4.1 Storage. VMware vsphere 4.1

CA NSM System Monitoring Option for OpenVMS r3.2

IBM Tivoli Composite Application Manager for WebSphere

Overview. performance bottlenecks in the SAN,

SapphireIMS 4.0 BSM Feature Specification

EMC CLARiiON PRO Storage System Performance Management Pack Guide for Operations Manager Published: 04/14/2011

Solution Overview VMWARE PROTECTION WITH EMC NETWORKER 8.2. White Paper

Evaluation of Enterprise Data Protection using SEP Software

Navisphere Quality of Service Manager (NQM) Applied Technology

Diagnostics and Troubleshooting Using Event Policies and Actions

MRV EMPOWERS THE OPTICAL EDGE.

EMC Storage Management Overview

CA Spectrum. Virtual Host Manager Solution Guide. Release 9.3

OnCommand Insight 6.4

Load Manager Administrator s Guide For other guides in this document set, go to the Document Center

EMC ViPR Controller. User Interface Virtual Data Center Configuration Guide. Version REV 01

HP Operations Smart Plug-in for Virtualization Infrastructure

vcenter Operations Management Pack for SAP HANA Installation and Configuration Guide

NetApp SANtricity Management Pack for Microsoft System Center Operations Manager 3.0

Default Thresholds. Performance Advisor. Adaikkappan Arumugam, Nagendra Krishnappa

simplify monitoring Environment Prerequisites for Installation Simplify Monitoring 11.4 (v11.4) Document Date: January

EMC Unisphere for VMAX Database Storage Analyzer

IBM Information Server

Building the Virtual Information Infrastructure

Monitor the Cisco Unified Computing System

XpoLog Center Suite Log Management & Analysis platform

INTEGRATING CLOUD ORCHESTRATION WITH EMC SYMMETRIX VMAX CLOUD EDITION REST APIs

MRV EMPOWERS THE OPTICAL EDGE.

HP IMC User Behavior Auditor

Network Performance Management Solutions Architecture

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution

SapphireIMS Business Service Monitoring Feature Specification

NetIQ AppManager for Self Monitoring UNIX and Linux Servers (AMHealthUNIX) Management Guide

Vblock Systems hybrid-cloud with Cisco Intercloud Fabric

Distributed File System Replication Management Pack Guide for System Center Operations Manager 2007

IBM Tivoli Composite Application Manager for WebSphere

vrealize Operations Manager User Guide

How To Connect Virtual Fibre Channel To A Virtual Box On A Hyperv Virtual Machine

EView/400i Management Pack for Systems Center Operations Manager (SCOM)

EMC UNISPHERE FOR VNXe: NEXT-GENERATION STORAGE MANAGEMENT A Detailed Review

BMC Service Assurance. Proactive Availability and Performance Management Capacity Optimization

New Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN

XpoLog Center Suite Data Sheet

Installation Guide NetIQ AppManager

Evolution from the Traditional Data Center to Exalogic: An Operational Perspective

CA Database Performance

STEELCENTRAL APPINTERNALS

HP Operations Agent for NonStop Software Improves the Management of Large and Cross-platform Enterprise Solutions

technical brief Multiple Print Queues

VEEAM ONE 8 RELEASE NOTES

Virtualized Exchange 2007 Local Continuous Replication

Resonate Central Dispatch

Minder. simplifying IT. All-in-one solution to monitor Network, Server, Application & Log Data

Oracle Enterprise Manager

Transcription:

White Paper DIAGNOSING PERFORMANCE ISSUES WITH PROSPHERE: AN INTRODUCTION TO USE CASES AND ARCHITECTURE Abstract With advanced end-to-end troubleshooting capabilities, ProSphere performance management provides the user with the ability to resolve performance problems faster. Its proactive abilities allow users to monitor the performance of the components, using threshold alerting to call attention to situations that are indicative of problems. This document identifies use cases that are addressed and the architecture that enables the required functionality. September 2011

Copyright 2011 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. VMware is a registered trademark of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners. Part Number h8935 2

Executive Summary ProSphere provides the next generation of solutions for performance management for Storage Resource Management from physical to virtual to cloud deployments. Its primary goal is to deliver faster problem resolution than previous SAN performance management products. It provides application focused end-to-end performance troubleshooting, from the host through the fabric to the storage array; allow users quick identification of where problems reside and allows them to focus their efforts on fixing the underlying cause. The performance management monitoring capability of hosts, switches, and arrays proactively identifies performance troubles, using userdefined alerts with user-definable thresholds. These alerts allow ProSphere to identify problems before the user has the ability to log them. Because ProSphere continually collects SAN performance data and provides charting and trending capabilities, one can easily compare current and historical performance data. This allows the user to be able to readily consider trends and use this information to determine whether particular problem instances are aberrations, or part of a recurring pattern due to regular workload patterns. In addition to its own inherent capabilities, ProSphere also integrates with Symmetrix Performance Analyzer (SPA). This integration provides the user easy access to SPA s capabilities, so that array-based performance problems can be addressed. Audience This white paper is intended for those responsible for performance management of Storage Area Networks, so that they might understand how ProSphere allows for faster resolution of performance problems, through: end-to-end performance troubleshooting monitoring and proactive problem detection the ability to analyze and detect intermittent problems Goals of Performance Management Measurement, Alerting and Problem Resolution ProSphere helps the SAN Administrator acquire a comprehensive view of the performance of the data center infrastructure including hosts, switches, fabrics, and arrays. Analysis of key performance metrics allows for problem isolation and tuning, resolving performance problems quickly and efficiently. In order to get maximum value out of SAN investments, the performance of the components must be measured. Measurement of host, switch, fabric, and arrays provides objective information to determine the value of one particular configuration over another. 3

Measurement provides the information necessary to: Improve performance Allocate or re-allocate resources Determine how SAN configurations and topologies compare with each other Indicate performance trending (e.g. remaining the same, improving or declining) Identify whether or which hardware and topologies are producing results that are cost effective and efficient A performance problem is a gap between desired and actual results. Performance improvement is any effort that closes the gap between actual and desired results. This process of measuring performance often requires the use of statistical evidence to determine progress toward specific defined data center objectives. Defining performance thresholds provides the user with a basis to judge whether performance objectives are being obtained and is key to indicating what actions must be taken. Specifications of desired outcomes are necessary for meaningful evaluation. Defining performance in terms of desired results is how administrators make their configurations operational. Performance reporting and variance analyses must be accomplished regularly, as workloads in the SAN change frequently and reporting enables timely corrective action to be taken, based on those changes. Vigilance in reviewing results and attending to alerts are necessary for effective management control. ProSphere captures the performance data and processes it in such a way as to make it readily available to the user for consideration. This allows the user the ability to consider the correlation to system problems and eventually drives towards problem resolution. What ProSphere Provides ProSphere collects, correlates, and graphically presents performance information for the user s storage infrastructure (including VMware virtually provisioned servers and virtually provisioned storage). 4

Figure 1 - ProSphere Performance Charts Extensive analytics and built-in intelligence helps the user determine which Key Performance Indicators (KPIs) need to be examined, so one can quickly focus on the source of the problem. Data is collected continuously to provide historical context, so one can quickly determine if a slowdown is the normal product of workload patterns, and allow for the setting of thresholds that alert the administrator when performance strays from the norm. Traditional performance monitoring applications provide too many metrics and little or no information on the relationship of physical SAN components. Add in virtualization technologies at the server and storage levels and it makes it difficult for a storage administrator to determine the root cause of a performance issue. ProSphere provides an end-to-end relationship combined with key metrics to help the storage administrator quickly identify performance issues and the impact at the physical and virtual layers. This release is designed to provide end-to-end performance troubleshooting, from the virtual and physical host through the fabric to the array. Where a problem resides can be quickly identified and efforts made to fix it. 5

In addition to its inherent capabilities, ProSphere is also integrated with Symmetrix Performance Analyzer (SPA) to provide a performance management solution that is both broad in identifying performance problems across the whole SAN, as well as deep in allowing the user to drill into performance problems related to the Symmetrix. ProSphere performance management is designed to be proactive, allowing one to define alerts based on user-definable thresholds so that problems can be resolved before issues impact users. End-to-end performance troubleshooting ProSphere provides real-world performance problem solving capabilities that allow users easy movement from one stage of problem determination to the next. To help with that flow, ProSphere pinpoints a relatively small set of meaningful, key performance indicators, rather than every available performance metric. Monitoring and proactive problem detection Once discovery has completed, the user turns on performance data collection per host or group of hosts. In addition to collecting information on the hosts, the system intelligently enables collection for all of the elements in the IO paths that extend from the hosts. 6

Figure 2 - Discovered Hosts ProSphere can collect array data in 15-minute intervals and switch data in 5-minute intervals. Host data collection supports two tiers, the first in which performance data on as many as 1000 hosts can at collected in 5 minute intervals and the second in which an additional 1500 physical host, as well as, ESX servers can be collected at a 15 minute interval. In addition to the data collected by ProSphere itself, data from the past month of ControlCenter Performance Manager (daily collections) can be migrated. Simplicity as a Driving Principle In order to address the SAN administrators need for effective performance management, ProSphere has been designed from the bottom up to be simple to use. This simplicity starts with data collection and allows the user easy identification of the hosts from which the performance data is to be collected. Data collection can be specified for either individual hosts or for a group of hosts, and ProSphere will automatically collect performance data for the entire path, including switches, array ports, and the associated LUNs. 7

Figure 3 - SAN Topology View The simplicity provided extends to SAN configuration navigation. In this example, one can see how easy it is to find a host and view the end-to-end topology. First, the user searches for the host. A compact, high-level topology map is presented, showing just the SAN elements connected to the host. Irrelevant information is filtered out. If users wish to see more detail in the connectivity path, they can select the object of interest to get more information. The steps are few and simple: Type a few letters of the host name View the compact map that is presented showing the host and connected fabric and arrays Expand the host, fabric, or array as required. 8

Architecture Component Overview Figure 4 - ProSphere Performance Management Architecture The architectural diagram in Figure 4 shows how ProSphere components interact to provide a performance management solution. The Console UI interface displays topology maps and performance charts, with this data being surfaced by the ProSphere Application. ProSphere collected performance data is made available from 9

the Discovery Engine (though not shown, ControlCenter data is imported from the ControlCenter performance database). Data Collection and Sources The data orchestration sub-component of ProSphere is responsible for the scheduling, queue management and policy management for the data collection interface. The intention is to have two loosely coupled, cohesive physical services, which define this sub-system. The first service is responsible for the assignment of job scheduling, job policy management and job status. The second service is responsible for queue management and job assignment. Both leverage a common persistent data share, which contain job policy and job status information. Collected Data ProSphere data collection policies minimally contain: Collection Type Objective(s) Start time Interval End time This allows data to be collected based on a schedule (start and stop time), with a frequency defined by the need for a detailed collection. Based on the collection type, a number of data sources are used to provide a collection of time-series performance data. Host Agent-less Collection WMI on Windows SSH for Unix and Linux Virtual Center for VMware vsphere data Switch Collection SMI-S for Brocade SNMP for Cisco Array Collection SMI-S for Array data SPA for Symmetrix performance data (Element Manager Integration) The following shows information that is available for exposure to the user in charts and provides the basis for alerting. The following is a list of metrics collected by ProSphere. 10

Symmetrix Metrics FE Port Throughput in KB per second Percent busy Device Sampled average read time (ms) Sampled average write time (ms) KBs read per sec KBs written per sec HA KBs transferred per sec derived System I/O operations per second derived Reads per second Writes per second KBs read per second Kbytes written per second Sequential reads per second Host director I/O operations per second derived Reads per second Writes per second Slot collisions per second Disk read commands per second Write commands per second Detecting and analyzing intermittent problems ProSphere allows application problems to be related to host, switch, fabric, array bottlenecks, and helps in determining how storage resources can be rebalanced for utilization that is more effective. Alerts ProSphere alerting provides a default set of performance metrics alerts for discovered configuration items (CIs) in the SAN environment. The default threshold values (in percentage) of these performance metrics are 85% and 90% - for warning and critical alerts, respectively. The user can configure these threshold values to suit the requirements of the unique environment using the Set Performance Thresholds button in the All Alerts view. This allows the setting of a threshold to specify the level at which an alert should occur. 11

To set custom threshold values for performance metrics of discovered configuration items: select operations in the ProSphere area navigation bar to view the All Alerts view of the Alerts area. Then select Set Performance Thresholds, after which the Performance Thresholds dialog box appears. Figure 4 - Set Performance Thresholds Dialog Box The default threshold values of Warning (percentage) and Critical (percentage) can be modified as required (the permitted range is 1 to 99). Select OK to save the modified threshold values or select Cancel to close the dialog box without saving the modified threshold values. Selecting Reset reverts to the default values and disables the performance metrics. Integration with the Symmetrix Performance Analyzer ProSphere performance management utilizes EMC Symmetrix Performance Analyzer (SPA) to support detailed understanding of the backend performance of Symmetrix arrays. SPA is a graphical performance tool that builds upon Symmetrix performance monitoring activities. It helps in long-term planning and decisions, diagnostic drill down to identify root-cause of performance issues (providing detailed diagnostics at the device level) and enhanced real-time monitoring (5-10 second sample period) of select key performance indicators Once the ProSphere user has determined the potential for a problem in a Symmetrix, most likely the SPA application will prove useful in finding the root cause of the problem. ProSphere has the ability to launch directly into SPA in a context that allows the user to continue to investigate the problem. This allows the user to identify the correct SPA instance to continue the investigation (there could be multiple SPA instances in a data center, each handling a different set of Symmetrix). Because of a data center configured trust relationship between ProSphere and SPA, the user does 12

not have to resubmit credentials to SPA, as the user s identity is passed from ProSphere to SPA. Since these two products conform to the same GUI design conventions, they have a very similar look and feel, providing consistent behavior and design of functions. Performance Issues The following are some of the scenarios that ProSphere can help the SAN Administrator address. When a SAN Administrator: is assigned a help desk reported performance problem for a host, the administrator can use ProSphere to display graphs showing the performance of the host for the relevant time. In addition, the administrator can select storage attached to that host and view graphs showing storage performance. Then the administrator can compare the storage performance to that of the host to determine each components contribution to the performance issue. is assigned a help desk reported performance problem for an application and that problem seems to be related to storage the administrator can identify the host volumes being used and compare the performance of those volumes to other volumes that share specified storage resources, such as directors and ports, in order to understand which other applications could be impacted by a storage bottleneck believes that a SAN object might have a performance problem. In order to compare the current performance of an object with its prior performance for the same for the same hours of the day for a prior day the administrator can use the historical reports. is viewing the performance of a Symmetrix and quickly needs to understand how array backend performance influences overall performance. The administrator can use ProSphere to launch SPA with single sign-on. Conclusion ProSphere performance management provides the user with the ability to perform end-to-end diagnostic analysis of Symmetrix and CLARiiON SAN environments, based on simplified user task focused search and relationship based one-click navigation. A tier-based approach to the collection of data allows ProSphere to support selective 5-minute intervals, using an agent-less performance data collection approach for UNIX, Linux and Windows hosts, while still achieving scalability in large data centers. 13

Threshold-based alerting and near real-time display of performance data on all objects provides the user with the ability to take expeditious action on performance problems. Integration with SPA, allows for easily accessible investigation of Symmetrix related performance problems, which provides a deep analysis capability to augment the broad capability provided by ProSphere itself. 14