Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON. Ernie Gilman IBM. August 10, 2011: 1:30 PM-2:30 PM.



Similar documents
Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON Ernie Gilman

Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON Session 11899

Introduction to Mainframe (z/os) Network Management

IP Monitoring on z/os Requirements and Techniques

z/os V1R11 Communications Server System management and monitoring Network management interface enhancements

Building Effective Dashboard Views Using OMEGAMON and the Tivoli Enterprise Portal

How To Manage Performance On A Network (Networking) On A Server (Netware) On Your Computer Or Network (Computers) On An Offline) On The Netbook (Network) On Pc Or Mac (Netcom) On

Nalini Elkins' TCP/IP Performance Management, Security, Tuning, and Troubleshooting on z/os

Virtualization: TCP/IP Performance Management in a Virtualized Environment Orlando Share Session 9308

TCP Performance Management for Dummies

Performance Monitoring of Heterogeneous Network Protocols in a Mainframe Environment

Solving complex performance problems in TCP/IP and SNA environments.

NetView for z/os V6.1 Packet Trace Analysis

CA NetMaster Network Management for TCP/IP

z/vm and Linux on zseries Performance Monitoring An Update on How and With What Products

Transport Layer Protocols

Nalini Elkins Introduction to TCP/IP Diagnostics (Web-based Seminar)

Understanding The Impact Of The Network On z/os Performance

SmartCloud Analytics Log Analysis

SOA management challenges. After completing this topic, you should be able to: Explain the challenges of managing an SOA environment

Forecasting Performance Metrics using the IBM Tivoli Performance Analyzer

Get clear vision and an improvedoutlook. Monitor your entire mainframe network through a browser-based GUI.

TCP/IP Support Enhancements

Ethernet. Ethernet. Network Devices

Computer Networks. Chapter 5 Transport Protocols

IP SLAs Overview. Finding Feature Information. Information About IP SLAs. IP SLAs Technology Overview

Lecture 8 Performance Measurements and Metrics. Performance Metrics. Outline. Performance Metrics. Performance Metrics Performance Measurements

IT Analytics and Big Data - Making Your Life Easier

CA NetSpy Network Performance r12

VMWARE WHITE PAPER 1

Monitoring Load-Balancing Services

SNMP OIDs. Content Inspection Director (CID) Recommended counters And thresholds to monitor. Version January, 2011

Network performance and capacity planning: Techniques for an e-business world

Troubleshooting Tools

Linux MDS Firewall Supplement

ICOM : Computer Networks Chapter 6: The Transport Layer. By Dr Yi Qian Department of Electronic and Computer Engineering Fall 2006 UPRM

IBM Tivoli Monitoring for Network Performance

Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages

The Ecosystem of Computer Networks. Ripe 46 Amsterdam, The Netherlands

WHITE PAPER September CA Nimsoft For Network Monitoring

Distributed Denial of Service Attack Tools

Using IPM to Measure Network Performance

NMS300 Network Management System

Overview of the z/os Load Balancing Advisor: Making External IP Load Balancers Sysplex Aware

21 Things You Didn t Used to Know About RACF

Tivoli IBM Tivoli Web Response Monitor and IBM Tivoli Web Segment Analyzer

Monitoring Remote Access VPN Services

Performance Analytics with TDSz and TCR

IBM Tivoli Composite Application Manager for WebSphere

N02-IBM Managed File Transfer Technical Mastery Test v1

Managing Latency in IPS Networks

CA Insight Database Performance Monitor for DB2 for z/os

Application Note. Windows 2000/XP TCP Tuning for High Bandwidth Networks. mguard smart mguard PCI mguard blade

Minimal network traffic is the result of SiteAudit s design. The information below explains why network traffic is minimized.

WHITE PAPER OCTOBER CA Unified Infrastructure Management for Networks

CA TPX Session Management r5.3

LESSON Networking Fundamentals. Understand TCP/IP

Procedure: You can find the problem sheet on Drive D: of the lab PCs. 1. IP address for this host computer 2. Subnet mask 3. Default gateway address

z/os Firewall Technology Overview

PANDORA FMS NETWORK DEVICE MONITORING

Measuring IP Performance. Geoff Huston Telstra

Best of Breed of an ITIL based IT Monitoring. The System Management strategy of NetEye

Load Balance Router R258V

WhatsUpGold. v NetFlow Monitor User Guide

OS/390 Firewall Technology Overview

Application Note. Cell Janus Load Balancing Algorithms Technical Overview

CHAPTER 1 WhatsUp Flow Monitor Overview. CHAPTER 2 Configuring WhatsUp Flow Monitor. CHAPTER 3 Navigating WhatsUp Flow Monitor

AS/400e. TCP/IP routing and workload balancing

z/os V1R11 Communications Server system management and monitoring

PANDORA FMS NETWORK DEVICES MONITORING

Question: 3 When using Application Intelligence, Server Time may be defined as.

CA CPT CICS Programmers Toolkit for TCP/IP r6.1

Network and Services Discovery

Configuring SNA Frame Relay Access Support

Classic IOS Firewall using CBACs Cisco and/or its affiliates. All rights reserved. 1

11.1. Performance Monitoring

[Prof. Rupesh G Vaishnav] Page 1

Introduction to the new mainframe Chapter 12 - Network Communications on z/os

Outline. TCP connection setup/data transfer Computer Networking. TCP Reliability. Congestion sources and collapse. Congestion control basics

7750 SR OS System Management Guide

Sample Network Analysis Report

Pump Up Your Network Server Performance with HP- UX

Announcements. Midterms. Mt #1 Tuesday March 6 Mt #2 Tuesday April 15 Final project design due April 11. Chapters 1 & 2 Chapter 5 (to 5.

Abstract. Introduction. Section I. What is Denial of Service Attack?

High Availability Guide for Distributed Systems

Configuring Health Monitoring

Application Monitoring Maturity: The Road to End-to-End Monitoring

Programming Assignments for Graduate Students using GENI

TECHNICAL NOTE. FortiGate Traffic Shaping Version

Transaction Monitoring Version for AIX, Linux, and Windows. Reference IBM

OSBRiDGE 5XLi. Configuration Manual. Firmware 3.10R

IBM Tivoli Composite Application Manager for Microsoft Applications: Microsoft Internet Information Services Agent Version Fix Pack 2.

Technical Support Information Belkin internal use only

Network Management and Debugging. Jing Zhou

Lauraʹs Corner The CLEVER Solution: Working with Encrypted Data

! encor en etworks TM

TCP Offload Engines. As network interconnect speeds advance to Gigabit. Introduction to

Iperf Tutorial. Jon Dugan Summer JointTechs 2010, Columbus, OH

Accelerating Development and Troubleshooting of Data Center Bridging (DCB) Protocols Using Xgig

Firewall Defaults, Public Server Rule, and Secondary WAN IP Address

Transcription:

Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON Ernie Gilman IBM August 10, 2011: 1:30 PM-2:30 PM Session 9917

Agenda Overview of OMEGAMON for Mainframe Networks FP3 and z/os 1.12 1. OSA Express and Interface status and utilization 2. Show all IP Stacks in one view 3. TCP/IP Connection backlog and rejections 4. High TCP/IP Connection Rates 5. High TCP/IP Connection response times 6. Zombie Connections 7. Single Application Focused Network Monitor 8. TN3270 Response time analysis 9. FTP Logon and transfer failures 10.EE and HPR performance issues Page 2

Preface This presentation will highlight best practices from customers who use OMEGAMON XE for Mainframe Network to monitor their z/os Networking environment. Some of these best practices will include hundreds of new metrics that came in with OMEGAMON MFN FP3 and new interface details from IBM Communications Server for z/os 1.12 I created a custom Navigator view to highlight the best practices discusses in this presentation Page 3

Overview Applications Sockets Sockets TCP/ UDP Sockets IP ICMP IP Stack Gateway/Devices Interfaces OSA-Express OSA-Express Solutions that Monitors IBM z/os Network performance IBM OMEGAMON XE for Mainframe Networks V4R2 Performance monitoring FP3 added hundreds of new metrics, see slide at the end for details IBM NetView for z/os V6R1 TEP Feeds DVIPA Extensive monitoring TCP/IP Connection awareness Inactive Connections with termination code Real Time Packet and OSA Traces with on the fly analysis Page 4

z/os Network Performance Data Collection Points Collected using the TCP/IP NMI: Applications, Connections, TCP/IP Memory Statistics, IPSec FTP Sessions and Transfers, TN3270 Server Sessions, Interfaces (z/os 1.12) Collected using the VTAM API: VTAM Summary, CSM Buffer Pools Enterprise Extender (EE), High Performance Routing (HPR) Collected using SNMP: Interfaces (z/os 1.11 or before) OSA OMEGAMON XE for Mainframe Networks NLDM API TCP/IP NMI VTAM API z/os Communications Server SNMP Collected using the Session Awareness and trace API: SNA Session Awareness and Trace Move from SNMP and NETSTAT COMMANDS to the NMI API Less overhead Scalable 5 Page 5

Tivoli Enterprise Portal (TEP) Highlights Common user interface Manage z/os system and distributed resources - single user interface. Displays real time and historical,and alerts at the same time All customization and admin through the same interface Define thresholds and generate events Out of the box Best Practices Workspaces Situations - Problem Signatures and Expert Advice (ALERTS) Create your own views and situations To match responsibility and skill level 6 Page 6

Tivoli Enterprise Portal Situations and Thresholds View Thresholds TIP Highlight metrics in a table you are viewing Will bring out hidden problems Help collate metrics to probable cause of problem Situations Notification, though the TEP or OMNIBUS or even an email - 7 Page 7

Situation Expert Advice 16.16 16.16 Initial Situation Values: Captures metrics at timed triggers Current Situation Values: Compare current metrics with initial metrics Expert Advice: Provides suggested actions Take Action: Issue commands from TEP Reflex Automation: Automatically issue commands Event Forwarding: Can forwards alerts to OMNIBUS 8 Page 8

OSA-Express and Interfaces Interface Status (Enhancement with z/os 1.12) Situation on OSA Interface Status. Alert of OSA Interface down. Check Microcode level Adjust MTU size to reduce fragmentation zenterprise Monitor new OSA types Put all OSA and interfaces on one workspace Gateway/Devices IP Stack Interfaces OSA-Express OSA-Express Out of Box Expert Advice Interfaces Device Inactive High bandwidth Utilization No traffic QDIO is Accelerating Bytes Max Staging queue depth DLC Read deferrals DLC Read Exhausted Out of Box Expert Advice OSA Express High BUS Utilization Channel utilization Missed packets Not Stored Frames Next a customized OSA WORKSPACE Page 9

Tip#1 Put all OSA and Interfaces Cross LPAR View Next, DLC Queuing Details Page 10

Channel DLC Read/Write Workload Queue Priorities Tuning OSA Interfaces is now simpler with z/os 1.12 Next, Monitoring Multiple IP Stacks Page 11

Tip#2 Put all IP Stacks for all LPARs in one view Create Cross LPAR Stack view Number of Connections Errors Monitor DVIPA from NetView Next a Connection Backlogs Out of Box Expert Advice IP Stack Input Discards Output Discards Page 12

Exceeding Backlog Connection Limit Application will not be aware that connections are being rejected Backload Limit Backlog Connections Rejected Application Out of Box Expert Advice Application problems Backlog Connections Rejected Connections backlogged Not Accepting Connections Rejecting new connections Next a Connection Backlog WORKSPACE Page 13

Tip#3 Monitor Connection Backlog Rejections Next a high connect / disconnect rate Application TCP Listeners workspace TCP Listeners workspace Page 14

Tip#4 Monitor High Connection Rates A high connection rate with very few active connection may indicate a poor design and can cause excess z/os overhead. Requests Connections Application Page 15

TCP/IP Connections Response times Retransmissions Discards Response time Variance Response time Bytes Buffered Congestion Window size reset to 0 Next : Visualization correlation of response time issues Out of Box Expert Advice Connections Retransmissions Discards Out of order segments Round trip time Round trip time variance Retransmissions, discards and out of order segments Window Size set to zero (local or remote) Can indicate congestion Inbound and outbound bytes buffered (new) Can be used to avoid major outage! Create situation off TCP Connections Workspace 16 Page 16

Tip#5 Correlate Response Times with Errors View all connections with high response times across all LPARS Turn on history to see when problems are occurring See if poor response time correlates with any errors. Notice Inbound Bytes Buffered Next : Why filters can be so dramatic Page 17

Filter Data from all z/os LPARS z/os OMEGAMON MFN Agent TEPS FILTER Filter can be at TEPS Server or z/os Agent Filtered Query is dynamically pushed to z/os Agents Performance gain is dramatic and immediate Next: Finding missing connections Page 18

Where are my Connections? OMEGAMON for MFN default filters can hide problems It is simple is to override these filters. Next: Let's see filters can find zombie connections Page 19

Tip#6 Create custom query for zombie connections Connection that do not get dropped Connections could start to backlog Connections with no traffic for a long time. Filter out addresses like Loopback Also create situation on Bytes buffered Next: Filter for zombies Page 20

How to create a filter in Query Editor Connections with no activity for > 10 Mins (600 seconds) Filter out addresses such as loopback Filter will automatically be pushed out to the MFN Agents Page 21

Tip#7 Create Application Network Filtered Views Examples: Connect:Direct MQ CICS WebSphere Application Client Client Client Address Space CPU Storage Priority Application Listener Connection Rejections Backlog limit Number of Connections Connection Rate Next: Create filtered view on application TCP/IP Connections Response times Buffers queued Connection hangs Congestion Reset windows Retries, Out of order Listener and connection feeds from OMEGAMON MFN Inactive connections feeds from NetView for Termination reasons Address Space feeds from OMEGAMON on z/os Page 22

Application Monitor View Here is one filtered on MQ Next: TN3270 Page 23

TN3270 High Response Time Exception View TN3270 Response time SNA time is application IP time is the network Drill down to TCP connection to see errors Response time distribution by buckets Sliding window buckets set in Comm Server TN3270 Server status from NetView SNA Response Time (Application issue) IP Response Time Drill to connection Out of Box Expert Advice TN3270 Response times Average IP response times Average SNA response times Average total response times High number of long responses Next: Let s see a how this looks Page 24

Tip#8 TN3270 High Response Time Exception View Next: FTPs Page 25

Tip#9 FTP View of Logon and Transfer Problems USS Storage FTP TCP/IP FTP Sessions FTP Transfers FTP Transfers FTP Session logon failure reason codes FTP Transfers, How long it took Drill down to connection for performance issues 26 Page 26

Correlate FTP duration with bytes transmitted Out of Box Expert Advice FTP Session Errors: Network or socket errors Error reply from server Invalid sequence from client Resource shortage (storage ) FTP Transfer Errors File, system or network errors Page 27

Summary of Enterprise Extender and HPR CICS IMS WAS... UDP Enterprise Extender HPR APPN Sessions Expert Advice HPR Low initial throughput rate Path Switches ARB Mode Red, Persistent* EE High retransmissions Out of Sequence buffers UDP High Discard Rates Long Queues exceeding Queue Limit* Network Priority Mappings UDP Ports 12000 12001 12002 12003 12004 SNA LL2 Network High Medium Low Path Switch Timeout (LIVTIME 10 Seconds) 1 Minute TP(2) 2 Minutes TP(1) 3 Minutes TP(0) 8 Minutes *Simple to create Page 28

Tip#10 Show EE and HPR on one View for all LPARs Page 29

TCP/IP and VTAM Address Spaces, Storage and Buffers Out of Box Expert Advice VTAM and TCP Address Space CSA%, ECSA %, CPU %, Paging Rate, and Buffer Pool shortage 30 Page 30

Review 1. OSA Express and Interface status and utilization 2. Show all IP Stacks in one view 3. TCP/IP Connection backlog and rejections 4. High TCP/IP Connection Rates 5. High TCP/IP Connection response times 6. Zombie Connections 7. Single Application Focused Network Monitor 8. TN3270 Response time analysis 9. FTP Logon and transfer failures 10.EE and HPR performance issues Page 31

Additional Information: OMEGAMON Recommended Maintenance: https://www-304.ibm.com/support/docview.wss?uid=swg21290883 OMEGAMON XE for MFN V4R2 Fix pack 3 4.2.0.3-TIV-KN3-IF0001 Matching PTFs UA58835, UA59029, UA59138, UA59709 IBM Tivoli Monitoring V6.2.2 Fix Pack 6.2.2-TIV-ITM-FP0005 Matching PTFs UA60941, UA60942,UA60943, UA60944 Pulse 2011 Session 1254: Solving Application and Network issues using IBM s OMEGAMON Mainframe Networks by James T Sherpey II Page 32

Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON Ernie Gilman IBM Session 9917 Page 33