z/OS Unix System Services Dumps - Dump Debugging for Dummies




Maintenance and Technical Support, Technical Support Competence Center
z/OS Unix System Services Dumps - Dump Debugging for Dummies
Matthias Korn, z/OS Virtual Frontend / Unix System Services, EMEA Level 2
IBM Deutschland GmbH, korn@de.ibm.com
99th z/OS Guide, Lahnstein, 16 March 2011
2011 IBM Corporation

What are we talking about today?
- The two categories of dumps
- How to capture an unformatted dump
- IPCS: a powerful tool to read unformatted dumps
- IPCS: first steps to navigate
- IPCS: next steps to navigate
- Useful general commands to gather information
- BPXI070E at shutdown: finding the root cause using a SLIP dump
- HIPER APAR OA34226: what does a dump show in this case?
- OMVS Debug HTML update
z/OS Unix System Services Dump Debugging, 15 Mar 2011

The two categories of dumps
There are two categories of dumps:
- Formatted dumps: SYSABEND, SYSUDUMP, SNAP dumps
- Unformatted dumps: SVC dumps, SYSMDUMP abend dumps, stand-alone dumps

How to capture an unformatted dump
- System abends, e.g. ABENDEC6, ABEND0C4, ABEND878
  - dump captured by recovery routines
- SLIP, e.g. a reason code SLIP trap under USS
  - SLIP processing gets control when the defined conditions are met
  - SLIP schedules an SVC dump and captures trace records
- Dynamic dump, e.g. a console dump
  - dump captured via the DUMP command
  - no trigger necessary
  - used for permanent situations and comparisons

How to capture an unformatted dump (cont.)
- SADUMP program
  - stand-alone dump program loaded as part of a restart
  - SADUMP is captured in hang / loop situations
- SYSMDUMP DD card
  - dump captured in connection with LE runtime options such as TER(UADUMP), ABT(ABEND), TRAP(ON)
  - RECFM=FBS, LRECL=4160

IPCS - a powerful tool to read unformatted dumps
- problem state, key 8 program running in the TSO/E user's address space
- operates in interactive and batch environments
- a TSO/E command processor is the base of IPCS
- the TSO/E 'IPCS' command activates the IPCS command processor
- all commands that perform IPCS functions are subcommands of the IPCS command
- for interactive use, IPCS uses ISPF dialog support to run as a full-screen application
- this application uses the IPCS command processor

IPCS (cont.)
IPCS helps you to:
- format and read component traces and GTF traces
- format and analyze unformatted dumps
- format and display control blocks
IPCS is able to identify:
- jobs with error return codes
- resource contentions
- control block overlays

IPCS - First steps to navigate
- What kind of dump do we have?
- What was the dump written for?
- Which SLIP trap caused the dump to be captured?

IPCS - Primary Option Menu

IPCS - Selecting the source

IPCS - STATUS (IP ST)

IPCS - LIST SLIPTRAP (IP L SLIPTRAP)

IPCS - Next steps to navigate
- Which address spaces have been dumped?
- What are the corresponding job names?
- Has the dump been written completely, or is it partial?

IPCS - CBF RTCT (IP CBF RTCT), F ASTB

IPCS - SELECT ALL

IPCS - LIST E0. LENGTH(16) BLOCK(0)
Lists the SDRSN (SDUMP partial dump reason code) control block.
If all requested bytes are X'00', the dump is complete. Otherwise, review the SDRSN control block in z/OS MVS Data Areas, Volume 5 (MCSCSA - SNAPX), for the actual reason.
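The completeness check described above amounts to testing whether all 16 listed bytes are zero. A minimal Python sketch of that test follows; the function name and the sample hex strings are hypothetical and stand for values typed in from a LIST display, not for output of a real dump.

```python
def dump_is_complete(sdrsn_hex: str) -> bool:
    """Return True when every byte of the listed SDRSN area is X'00'."""
    data = bytes.fromhex(sdrsn_hex.replace(" ", ""))
    if len(data) != 16:
        raise ValueError("expected the 16 bytes requested by LENGTH(16)")
    # any() is False only if all bytes are zero
    return not any(data)


# Hypothetical sample values, not taken from a real dump
complete = dump_is_complete("00000000 00000000 00000000 00000000")
partial = dump_is_complete("00000000 40000000 00000000 00000000")
print(complete, partial)
```

A non-zero result only says that the dump is partial; the meaning of the individual bits still has to be looked up in the SDRSN mapping.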

IPCS - Useful general commands
- Which trace data are available?
- Does any resource contention exist?
- How much real storage is available / in use?
- Which events (abends) have been logged?
- What can be determined about OMVS?

IPCS - VERBX MTRACE
The MTRACE verb exit displays the master trace table, which corresponds to the syslog of your image. Its status can be determined via 'D TRACE' and changed via the 'TRACE MT' operator command.

IPCS - SYSTRACE ASID(1) TIME(LOCAL)
The SYSTRACE IPCS command displays the system trace table and formats the system trace entries for each address space. Its status can be determined via 'D TRACE' and changed via the 'TRACE ST' operator command.

IPCS - ANALYZE RESOURCE
Shows contention for system resources such as OMVS latches.

IPCS - RSMDATA SUMMARY
Shows real storage definitions and utilization.

IPCS - VERBX LOGDATA
Shows the in-storage logrec buffers. It invokes the EREP program to format the logrec records.

IPCS - OMVSDATA
Formats OMVS-relevant information about processes, threads, files and file systems managed by OMVS and serviced by HFS, ZFS, NFS, TFS. The dump needs to contain the OMVS address space and the OMVS data spaces.
Options:
- IP OMVSDATA
- IP OMVSDATA PROCESS
- IP OMVSDATA FILE
- IP OMVSDATA STORAGE
- IP OMVSDATA IPC
- IP OMVSDATA COMMUNICATION
Report types:
- SUMMARY
- DETAIL
- EXCEPTION

IPCS - OMVSDATA PROCESS
Displays a Unix System Services process summary report including PID, associated user ID, ASID, parent process ID and status (e.g. zombie).

IPCS - OMVSDATA PROCESS DETAIL
Displays a detailed report about each process dubbed to Unix System Services, including its threads (TCBs), active system calls, open file descriptors and sent / received sysplex work.

IPCS - OMVSDATA FILE
Displays a report of all mounted file systems known to the system on which the dump was taken, including file system name, mount point, latch number and the token to the internal control blocks that represent the file system.

IPCS - OMVSDATA FILE DETAIL
Displays a report of all active files in the system. An active file is either open or has recently been referenced. The 'File Serial Number' and the 'Device Number' uniquely identify a file (directory, regular file, character special, FIFO, symbolic link).

IPCS - OMVSDATA STORAGE
Displays a report of all active cell pools in use by z/OS Unix. The report contains information about cell pools resident in common storage and data spaces as well as cell pools resident in private storage.

IPCS - CTRACE COMP(SYSOMVS) FULL LOCAL
Formats the OMVS component trace. The trace data reside in the SYSZBPX1 data space, which makes it necessary to always include the OMVS data spaces in a dump. The trace is always active at least in MINIMUM mode; for OMVS-related problems it is always recommended to activate the trace. For details see the USS Diagnosis HTML file.

IPCS - CTRACE QUERY(SYSOMVS) FULL LOCAL
Displays the status of the OMVS ctrace at the time the dump was captured.

BPXI070E at shutdown - using a SLIP dump
Symptoms:
*BPXI066E OMVS SHUTDOWN COULD NOT MOVE OR UNMOUNT ALL FILE SYSTEMS
BPXM054I FILE SYSTEM OMVS.ETC.MSYX FAILED TO UNMOUNT. RET CODE = 00000072, RSN CODE = 058800AA
BPXM054I FILE SYSTEM SYS1.ROOT.MSYX.OMVSSIDA FAILED TO UNMOUNT. RET CODE = 00000072, RSN CODE = 058800AA
*195 BPXI070E USE SETOMVS ON ANOTHER SYSTEM TO MOVE NEEDED FILE SYSTEMS, THEN REPLY WITH ANY KEY TO CONTINUE SHUTDOWN

BPXI070E at shutdown (cont.)
TSO BPXMTEXT 058800AA
BPXFSUMT 03/05/08
JRFsParentFs: The file system has file systems mounted on it.
Action: An unmount request can be honored only if there are no file systems mounted anywhere on the requested file system. Use the F BPXOINIT,FILESYS=DISPLAY,ALL command for a shared file system configuration or the D OMVS,FILE command for a non-shared file system configuration to determine which file systems are mounted on the requested file system. Unmount them before retrying this request.
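The codes in the BPXM054I messages are printed in hexadecimal. The short Python sketch below splits the reason code into its two halfwords and converts the return code to decimal. It assumes the usual z/OS UNIX convention that the upper halfword is a qualifier identifying the issuing module and the lower halfword is the reason itself; the function name is hypothetical.

```python
from typing import Tuple


def split_reason_code(rsn_hex: str) -> Tuple[str, str]:
    """Split a 4-byte reason code into (qualifier, reason) halfwords."""
    value = int(rsn_hex, 16)
    qualifier = (value >> 16) & 0xFFFF  # assumed: identifies the issuing module
    reason = value & 0xFFFF             # assumed: the reason proper
    return f"{qualifier:04X}", f"{reason:04X}"


qualifier, reason = split_reason_code("058800AA")
ret_code_decimal = int("00000072", 16)
print(qualifier, reason, ret_code_decimal)
```

BPXMTEXT remains the authoritative way to interpret the code; the split only helps when comparing reason codes that share the same lower halfword.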

BPXI070E at shutdown (cont.)
SLIP SET,IF,A=SYNCSVCD,RANGE=(10?+8C?+F0?+1F4?),
DATA=(13R??+1B0,EQ,058800AA),DSPNAME=('OMVS'.*),
SDATA=(ALLNUC,PSA,CSA,LPA,TRT,SQA,LSQA,RGN,SUM),
JL=OMVS,AL=(H,P,S,CU),END

IPCS - LIST SLIPTRAP (IP L SLIPTRAP)

IPCS - OMVSDATA FILE, F OMVS.ETC.MSYX

IPCS - OMVSDATA FILE, F '/MSYX/etc'

BPXI070E at shutdown - Conclusions
- File system SYS1.ROOT.MSYX.OMVSSIDA, mounted at /MSYX, failed to unmount because OMVS.ETC.MSYX was still mounted at /MSYX/etc.
  - Both file systems are owned by system number 02.
- OMVS.ETC.MSYX failed to unmount because of:
  - OMVS.CRON.MSYX mounted at /MSYX/etc/cron
  - OMVS.SPOOL.CRONLOG.MSYX mounted at /MSYX/etc/spool/cron/cronlog
  - OMVS.SPOOL.MSYX mounted at /MSYX/etc/spool
  - All 3 file systems are remotely owned by system 04.
- Which systems are 02 and 04?
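The JRFsParentFs rule (no unmount while anything is mounted on the file system) can be replayed against the mount points listed above. The following Python sketch is an illustration only: the table is typed in from this case, and the `blockers` helper is a hypothetical name, not part of any z/OS interface.

```python
# Mount table taken from the OMVSDATA FILE findings of this case
MOUNTS = {
    "SYS1.ROOT.MSYX.OMVSSIDA": "/MSYX",
    "OMVS.ETC.MSYX": "/MSYX/etc",
    "OMVS.CRON.MSYX": "/MSYX/etc/cron",
    "OMVS.SPOOL.MSYX": "/MSYX/etc/spool",
    "OMVS.SPOOL.CRONLOG.MSYX": "/MSYX/etc/spool/cron/cronlog",
}


def blockers(file_system, mounts=MOUNTS):
    """Return the file systems mounted anywhere below the given one."""
    prefix = mounts[file_system].rstrip("/") + "/"
    return sorted(
        name
        for name, mount_point in mounts.items()
        if name != file_system and mount_point.startswith(prefix)
    )


for name in ("OMVS.ETC.MSYX", "SYS1.ROOT.MSYX.OMVSSIDA"):
    print(name, "is blocked by", blockers(name))
```

For OMVS.ETC.MSYX the helper returns the three file systems named in the conclusions; for the root it returns those three plus OMVS.ETC.MSYX itself.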

IPCS - BPXWNXMB
- Formats the NXMB control block, which represents the OMVS XCF group members table
- Checks whether the system is a member of a shared file system environment
- Returns information about all members, their state, system name and number, as well as the active BPXMCDS couple data set definitions

BPXI070E at shutdown - Conclusions
The file systems
- OMVS.CRON.MSYX mounted at /MSYX/etc/cron
- OMVS.SPOOL.CRONLOG.MSYX mounted at /MSYX/etc/spool/cron/cronlog
- OMVS.SPOOL.MSYX mounted at /MSYX/etc/spool
are remotely owned by system MSYS, while their parent file system is owned by system MSYX. For an unknown reason the ownership has changed.
Questions:
- When did the change occur?
- What are the AUTOMOVE settings for these 3 file systems?

BPXI070E at shutdown - Conclusions
Answers:
- An internal control block contains a time stamp of the last time the owner of the file system changed.
- The SLIP trap matched at shutdown at 06:57:05.980519 local time.
- The last owner change happened at 06:56:48.120832 local time on the same day.
- These file systems are mounted with AUTOMOVE=Y, while the parent is mounted with AUTOMOVE=U.
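The two time stamps above are close together, which ties the owner change to the shutdown itself. A small Python calculation of the gap, using only the two values quoted on the slide:

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Both time stamps are local time on the same day, as stated above
owner_change = datetime.strptime("06:56:48.120832", FMT)
slip_match = datetime.strptime("06:57:05.980519", FMT)

gap = slip_match - owner_change
print(f"owner changed {gap.total_seconds():.6f} seconds before the SLIP trap matched")
```

The ownership moved just under 18 seconds before the unmount failed.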

HIPER APAR OA34226
ORPHANED PPRA SIGNAL LATCHES *MASTER* MEMTERM ABEND0C4 BPXPRTRM SYS.BPX.AP00.PRTB1.PPRA.LSN
- Shutdown of a system (SYS1) in a shared file system environment
- Latch contention on a different system (SYS2)
- Reinitialization of SYS1 into the shared file system environment is impossible due to the latch contention on SYS2
- SYS2 performed MemberGoneRecovery for SYS1
- Contention on the mount latch due to an orphaned PPRA latch
- The 'D OMVS,W' command shows only mount latch activity, so a dump is necessary

HIPER APAR OA34226 - D OMVS,W
BPXO063I 01.46.02 DISPLAY OMVS 886
OMVS     0010 ACTIVE     OMVS=(A0,00,R0,A1)
MOUNT LATCH ACTIVITY:
    USER     ASID  TCB       REASON            AGE
    -----------------------------------------------------
  HOLDER:
    OMVS     0010  009FC3E8  MemberGone Rcvry  00.00.15
      IS DOING: BRLM Wait   <----------- misleading!
      FILE SYSTEM: OESYS.WILY.PRODPLEX.INTRO810.ZFS
  WAITER(S):
    OMVS     0010  009A0160  FileSys Unmount   00.00.03

HIPER APAR OA34226 - D GRS,C
D GRS,C
ISG343I 05.30.00 GRS STATUS
LATCH SET NAME: SYS.BPX.AP00.PRTB1.PPRA.LSN
CREATOR JOBNAME: OMVS   CREATOR ASID: 0010
LATCH NUMBER: 2056
  REQUESTOR  ASID  EXC/SHR    OWN/WAIT  WORKUNIT  TCB  ELAPSED
  *MASTER*   0001  EXCLUSIVE  OWN       009DBE88  Y    16:53:59
  OMVS       0010  SHARED     WAIT      009FC3E8  Y    03:44:12
LATCH SET NAME: SYS.BPX.A000.FSLIT.FILESYS.LSN
CREATOR JOBNAME: OMVS   CREATOR ASID: 0010
LATCH NUMBER: 2
  REQUESTOR  ASID  EXC/SHR    OWN/WAIT  WORKUNIT  TCB  ELAPSED
  OMVS       0010  EXCLUSIVE  OWN       009FC3E8  Y    03:44:12
  OMVS       0010  EXCLUSIVE  WAIT      009A0160  Y    03:44:00
  OMVS       0010  EXCLUSIVE  WAIT      009D04E0  Y    03:38:42

HIPER APAR OA34226 - What does the dump show?
IPCS ANALYZE RESOURCE
RESOURCE #0012:
  NAME=SYS.BPX.A000.FSLIT.FILESYS.LSN ASID=0010 Latch#=2
RESOURCE #0012 IS HELD BY:
  JOBNAME=OMVS ASID=0010 TCB=009FC3E8
  DATA=EXCLUSIVE RETADDR=BD24A324 REQID=001000003D011540
RESOURCE #0012 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009A0160
  DATA=EXCLUSIVE RETADDR=BD28CD70 REQID=001000001976B8D0

HIPER APAR OA34226 - What does the dump show?
IPCS ANALYZE RESOURCE (cont.)
RESOURCE #0011:
  NAME=SYS.BPX.AP00.PRTB1.PPRA.LSN ASID=0010 Latch#=2056
RESOURCE #0011 IS HELD BY:
  JOBNAME=*MASTER* ASID=0001 TCB=009DBE88
  DATA=EXCLUSIVE RETADDR=BD421A06 REQID=01E4080841AE2300
RESOURCE #0011 IS REQUIRED BY:
  JOBNAME=OMVS ASID=0010 TCB=009FC3E8
  DATA=SHARED RETADDR=BD421B3A REQID=001000003D011540
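The D GRS,C and ANALYZE RESOURCE displays describe a wait-for chain: each waiter points at a latch, and each latch points at its holder. The Python sketch below follows that chain to the task at its root. The holder and waiter tables are typed in from the displays of this case; the "ASID:TCB" key format and the function name are assumptions made for the illustration.

```python
# Latch holders and waiters as reported for this case, keyed by "ASID:TCB"
FILESYS_LATCH = "SYS.BPX.A000.FSLIT.FILESYS.LSN #2"
PPRA_LATCH = "SYS.BPX.AP00.PRTB1.PPRA.LSN #2056"

HOLDER_OF = {
    FILESYS_LATCH: "0010:009FC3E8",  # OMVS, MemberGone recovery
    PPRA_LATCH: "0001:009DBE88",     # *MASTER*
}
WAITS_FOR = {
    "0010:009A0160": FILESYS_LATCH,
    "0010:009D04E0": FILESYS_LATCH,
    "0010:009FC3E8": PPRA_LATCH,
}


def blocking_chain(task):
    """Follow waiter -> latch -> holder links until a task that waits for nothing."""
    chain = [task]
    while chain[-1] in WAITS_FOR:
        holder = HOLDER_OF[WAITS_FOR[chain[-1]]]
        if holder in chain:  # a cycle would be a deadlock, stop there
            break
        chain.append(holder)
    return chain


print(" -> ".join(blocking_chain("0010:009A0160")))
```

Every waiter ends at the *MASTER* TCB 009DBE88, which itself waits for nothing. That makes it the task to examine next, even though D OMVS,W only names the mount latch holder.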

HIPER APAR OA34226 - What does the dump show?
- A latch is represented by an LQE (Latch Queue Element) within a latch set (LSET).
- LSET and LQE live in the creator's private storage (OMVS).
- The LQE contains a time stamp of when the latch was obtained.

HIPER APAR OA34226 - What does the dump show?
IPCS LTOD formats TOD (time of day) stamps.
IPCS LTOD C7486B4077C11780
The output shows that the latch was obtained on 4 February 2011, while the contention was reported and the dump taken on the 20th.
- Why wasn't it released?
- What happened to the holder?
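The conversion that LTOD performs can be approximated outside IPCS. This Python sketch relies on the z/Architecture TOD clock format, in which bit 51 of the 64-bit value increments once per microsecond and the epoch is 1 January 1900. It ignores leap seconds and the local time zone offset, so it is a rough cross-check rather than a replacement for LTOD; the function name is hypothetical.

```python
from datetime import datetime, timedelta

TOD_EPOCH = datetime(1900, 1, 1)


def tod_to_datetime(tod_hex: str) -> datetime:
    """Convert a 64-bit TOD clock value (16 hex digits) to an approximate UTC datetime."""
    tod = int(tod_hex, 16)
    microseconds = tod >> 12  # drop the 12 bits below the microsecond bit
    return TOD_EPOCH + timedelta(microseconds=microseconds)


obtained = tod_to_datetime("C7486B4077C11780")
print(obtained.date())
```

For the stamp taken from the LQE the result falls on 4 February 2011, in line with the LTOD output described above.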

HIPER APAR OA34226 - What does the dump show?
CTRACE COMP(SYSOMVS) LOCAL FULL OPTIONS((EXCEPTION))
- Gathers exception information that is written to a separate ctrace buffer
- The OMVS ctrace does not need to be switched on
- Shows that the TCB in the *MASTER* address space abended at the time the latch was obtained
- The OMVS recovery routines did not release the latch
- The latch got into an orphaned state

HIPER APAR OA34226 - What does the dump show?
CTRACE COMP(SYSOMVS) LOCAL FULL OPTIONS((EXCEPTION))
F '02/04/2011'
F 9DBE88

HIPER APAR OA34226 - Conclusions
- The USS recovery routine BPXPRTRM was redesigned to ensure that latches are released if it abends itself during recovery / memory / process termination.
- A dump is always necessary to decide whether the latch is orphaned.
- A latch purge tool is available and can be sent out on demand.
  - It can avoid an IPL.
  - CALLRTM can be tried as well.
  - It cannot be made generally available for data integrity reasons.
- The new message BPXM123E is issued if a latch is held by a single task for more than 5 minutes (starting with z/OS 1.12).
  - In this special case it would point to the PPRA latch that was already held before the contention caused by MemberGoneRecovery at a scheduled IPL.

Almost done...
- Any wishes regarding topics for the next Guide?
- Any concerns / questions?
Thank you for your attention!