1 OS Thread Monitoring for DB2 Server
Minneapolis, March 1st, 2011
Mathias Hoffmann, ITGAIN GmbH
mathias.hoffmann@itgain.de
2 Mathias Hoffmann
Background
- Senior DB2 Consultant
- Product Manager for SPEEDGAIN for DB2, the DB2 LUW performance management software of ITGAIN
- Currently involved in performance tuning projects in Germany
- Worked extensively with DBMS and data warehouses in the distributed world
- Origins on z/OS
Personal Data
- Working for ITGAIN since 2002, Germany
- Department Manager for Operational and Professional Services
- Master of Mathematics
3 Agenda
- Motivation
- DB2 Internals
- The Process Model
- Implementation <V9.5 and V9.5
- OS Process and Thread Monitoring
- Matching of OS and DB2 Monitoring
5 Motivation
Q: Why bother about OS processes and threads?
A: Knowing how DB2 and its components interact helps to understand and solve many problem situations.
"Since V9.5 I can't see my processes anymore!"
6 Motivation
- "list applications" doesn't show which resources are used.
- Is an application waiting (e.g. for I/O) or doing work?
- We need a mapping from DB2 resources to the corresponding OS resources.
That's why we need OS monitoring.
7 Motivation
Real-life Scenarios
Scenario 1: Slowdown in query response time (Feb. 2011)
General slowdown in query response time, but "list applications" didn't show anything out of the ordinary running at the time. A deeper look at the running processes showed that db2loggw (the database log writer) was waiting for I/O almost all the time. That drew attention to the I/O subsystem, where it turned out that the storage team had made a configuration change earlier that morning.
Scenario 2*: Slowdown in query response time
Same situation as scenario 1, but a different root cause. In this case the db2rebal process was running, which rebalances the data when a DMS container is added, resulting in a huge load on the storage subsystem.
Scenario 3*: Infrequent cleaning of buffer pool pages
Again there was a problem with response times. At various times throughout the day there was a slowdown, but snapshots didn't show anything abnormal. By examining the CPU usage of the processes it was possible to determine that the I/O cleaners (db2pclnr) were consuming over 90% of the CPU time. By tuning the I/O cleaners they achieved over 50% higher throughput.
* www.ibm.com/developerworks/data/library/techarticle/0304chong/0304chong.html
9 The DB2 Process Model http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/topic/com.ibm.db2.luw.admin.perf.doc/doc/c0008930.html
10 The DB2 Process Model Alternate Presentation I http://www.ibm.com/developerworks/data/library/techarticle/0304chong/0304chong.html
11 The DB2 Process Model Alternate Presentation II
12 The DB2 Backup Process Model http://www.ibm.com/developerworks/data/library/techarticle/dm-0501zikopoulos/
14 Implementation of the DB2 Process Model <V9.5 and V9.5
- In general: Each database task is typically performed by one separate engine dispatchable unit (EDU).
- DB2 Version < 9.5: EDUs are implemented as separate processes on Linux and UNIX, and as operating system threads inside the main engine process on Windows.
- DB2 Version 9.5: EDUs are implemented as operating system threads inside the main engine process on Linux, UNIX and Windows.
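The practical consequence of the threaded model can be illustrated with a small generic sketch. This is not DB2 code: it is a plain Python program (names such as `collect_ids` are made up for the illustration) showing that threads inside one process share a single PID while each has its own kernel thread ID, which is why process-level tools no longer show individual EDUs.

```python
import os
import threading


def collect_ids(n_threads=3):
    """Start n_threads workers and return a (pid, kernel_tid) pair for each."""
    results = []
    lock = threading.Lock()
    # The barrier keeps all workers alive at the same time, so the OS cannot
    # hand a finished worker's thread ID to a later one.
    barrier = threading.Barrier(n_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


ids = collect_ids()
for pid, tid in ids:
    # Same pid on every line, a different kernel thread ID on each.
    print(f"pid={pid} kernel_tid={tid}")
```

`threading.get_native_id()` requires Python 3.8 or later. The kernel thread ID it returns is the same kind of value that thread-aware OS tools report per thread.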
15 Multiprocessing vs. Multithreading http://www.fmc-modeling.org/category/projects/apache/amp/a_1_unix_processes.html
16 Threading Advantages
Configuration
- Simplified memory configuration
- Automatic agent configuration
Performance
- Context switching between threads is generally faster than between processes
  - No need to switch address space
  - Less cache pollution
Resources
- Operating system threads can share resources and therefore require less context than processes
- Memory savings
  - Threads share the address space and context information (such as uid, file handle table, etc.)
- Significantly fewer system file descriptors used
  - All threads in a process can share the same file descriptors
  - No need to have each agent maintain its own file descriptor table
Disadvantage: Monitoring threads is more difficult than monitoring processes.
17 Threading Advantages
- Increased throughput by 14% on Linux x64 (internal OLTP workload)
  [Figure: Relative throughput on Linux x64]
- Savings of up to 1 MB per agent due to the new threaded architecture
  [Figure: Per-agent memory footprint (MB), lower is better]
Copyright 2011 ITGAIN GmbH
19 OS Process and Thread Monitoring
Command     Description
ps aux      Lists all processes
ps auxr     Lists all running processes (Linux)
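20 OS Process and Thread Monitoring
Command                              Platform   Description
ps -lLfp <pid>                       Linux      Outputs threads running inside a process
ps -m -o THREAD -p <pid>             UNIX       Outputs threads running inside a process
ps -eL -o THREAD,tid,status,state    Linux      Outputs all threads (and processes) with status information
ps -em -o THREAD                     UNIX       Outputs all threads (and processes) with status information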
21 Process / Kernel States (Extract)
Linux
State   Description
D       Uninterruptible sleep (usually I/O)
R       Running or runnable (on run queue)
S       Interruptible sleep (waiting for an event to complete)
T       Stopped (either by a job control signal or because it is being traced)
W       Paging (not valid since the 2.6.xx kernel)
X       Dead (should never be seen)
Z       Defunct ("zombie" process, terminated but not reaped by its parent)
AIX
State   Description
R       Running or runnable
S       Sleeping (waiting on I/O completion, a pipe, memory, etc.)
W       Swapped
Z       Canceled
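When scripting around ps output, the state tables above can be kept as simple lookup tables. The following is a minimal sketch; the names `LINUX_STATES`, `AIX_STATES` and `describe_state` are hypothetical helpers introduced only for this illustration.

```python
# State codes and descriptions as listed on the slide.
LINUX_STATES = {
    "D": "Uninterruptible sleep (usually I/O)",
    "R": "Running or runnable (on run queue)",
    "S": "Interruptible sleep (waiting for an event to complete)",
    "T": "Stopped (job control signal or being traced)",
    "W": "Paging (not valid since the 2.6.xx kernel)",
    "X": "Dead (should never be seen)",
    "Z": "Defunct (zombie, terminated but not reaped by its parent)",
}

AIX_STATES = {
    "R": "Running or runnable",
    "S": "Sleeping (waiting on I/O completion, a pipe, memory, etc.)",
    "W": "Swapped",
    "Z": "Canceled",
}


def describe_state(code, platform="linux"):
    """Return the description for a ps state code, or 'Unknown state'."""
    table = LINUX_STATES if platform == "linux" else AIX_STATES
    # The Linux "stat" column can append modifier characters (for example
    # "Ss" or "R+"), so only the first character is looked up.
    return table.get(code[:1], "Unknown state")


print(describe_state("D"))
print(describe_state("W", platform="aix"))
```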
22 OS Process and Thread Monitoring
Command                        Description
vmstat                         Memory utilization
iostat                         Disk utilization
sar, top, topas, nmon, etc.    System activity information
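Besides memory figures, Linux vmstat also reports a CPU breakdown (us, sy, id, wa), which gives a first hint whether a system is CPU or I/O bound. The sketch below parses one captured vmstat sample. The sample numbers are invented, and the thresholds (`wait_threshold`, `busy_threshold`) are arbitrary assumptions chosen for illustration, not recommended values.

```python
# Invented sample in the layout printed by Linux vmstat.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  6      0 812340  10240 402112    0    0  5120  9800  900 1500  5  3 32 60  0
"""


def classify_vmstat(text, wait_threshold=30, busy_threshold=80):
    """Classify the last sample line as 'I/O bound', 'CPU bound' or 'neither'."""
    rows = [line.split() for line in text.strip().splitlines()]
    # The column header is the row that contains the CPU column names.
    header = next(cols for cols in rows if "us" in cols and "wa" in cols)
    values = [int(v) for v in rows[-1]]
    sample = dict(zip(header, values))
    if sample["wa"] >= wait_threshold:
        return "I/O bound"
    if sample["us"] + sample["sy"] >= busy_threshold:
        return "CPU bound"
    return "neither"


print(classify_vmstat(SAMPLE_VMSTAT))
```

A single sample is only a snapshot; in practice several intervals would be compared before drawing a conclusion.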
23 Monitoring DB2 Processes
Command                                  Description
db2_local_ps, ps -fu <instance_owner>    Outputs all of the DB2 processes running under an instance on UNIX / Linux
db2ptree, /usr/ucb/ps                    Server-side processes on Solaris
Versions: <V9.5, V9.5
24 Monitoring DB2 Processes Get active threads for a running instance
25 Monitoring DB2 EDUs db2pd -edus Outputs all EDUs in the instance
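For scripted analysis, the EDU list can be turned into records. The sketch below parses text captured from db2pd -edus. The sample text and its column layout (EDU ID, TID, Kernel TID, EDU Name, USR, SYS) are an assumption reconstructed for illustration and may differ between DB2 versions; `parse_edus` is a hypothetical helper, not part of DB2.

```python
import re

# Illustrative sample only; real output may be laid out differently.
SAMPLE_DB2PD_EDUS = """\
List of all EDUs for database partition 0

db2sysc PID: 8900
db2wdog PID: 8898

EDU ID    TID            Kernel TID     EDU Name                   USR (s)      SYS (s)
=======================================================================================
40        47155322546496 9200           db2agent (SAMPLE)          0.020000     0.010000
23        47155326740800 9183           db2pclnr (SAMPLE)          0.000000     0.000000
12        47155360295232 9172           db2ipccm                   0.000000     0.000000
"""

# Three integers, a name that may contain spaces, then two decimal numbers.
EDU_ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(.+?)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$"
)


def parse_edus(text):
    """Return one dict per EDU row found in the given db2pd -edus text."""
    edus = []
    for line in text.splitlines():
        match = EDU_ROW.match(line)
        if match:
            edus.append({
                "edu_id": int(match.group(1)),
                "tid": int(match.group(2)),
                "kernel_tid": int(match.group(3)),
                "name": match.group(4),
                "usr_s": float(match.group(5)),
                "sys_s": float(match.group(6)),
            })
    return edus


for edu in parse_edus(SAMPLE_DB2PD_EDUS):
    print(edu["edu_id"], edu["kernel_tid"], edu["name"])
```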
26 Monitoring DB2 Processes Windows db2stat Outputs all of the DB2 processes running under an instance on Windows
27 Overview DB2 Processes since V9.5
Name       Description                                                                   Platform
db2acd     Autonomic computing daemon                                                    Linux, UNIX
db2ckpwd   Password checker (process owner is root)                                      Linux, UNIX
db2fmcd    Fault monitor coordinator daemon                                              UNIX
db2fmd     Fault monitor daemon                                                          UNIX
db2fmp     Fenced mode process for executing UDFs and fenced SPs                         All
db2sysc    System controller (db2syscs.exe on Windows)                                   All
db2vend    Fenced vendor process                                                         Linux, UNIX
db2wdog    Watchdog process that handles abnormal terminations (process owner is root)   Linux, UNIX
28 Overview DB2 EDUs (Extract) since V9.5
Name       Description                              Level
db2ipccm   IPC communication manager                Instance
db2tcpcm   TCP communication manager                Instance
db2dlock   Local deadlock detector                  Database
db2pclnr   Bufferpool page cleaner                  Database
db2pfchr   Bufferpool prefetcher                    Database
db2stmm    Self-tuning memory manager               Database
db2agent   Coordinator agent                        Application
db2agntp   Working subagent                         Application
db2agnta   Idle subagent                            Application
db2bm      Backup and restore buffer manipulator    Per-Request
db2med     Backup and restore media controller      Per-Request
30 Application Monitoring
- db2 list applications show detail
- The application status values are defined in sqllib\sql_mon.h
31 Process Monitoring before V9.5
list applications show detail -> pid/thread (= pid) -> ps aux | grep <pid>
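The pre-V9.5 lookup can be sketched as a small parser over captured ps aux output. The sample listing below (user, PIDs, command names) is made up for illustration, and `process_state` is a hypothetical helper; only the standard ps aux column order is relied on.

```python
# Made-up sample in the standard "ps aux" column order.
SAMPLE_PS_AUX = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
db2inst1  4711  2.5  1.2 512000 48000 ?        D    09:12   0:42 db2agent (SAMPLE)
db2inst1  4712  0.0  0.8 498000 30000 ?        S    09:12   0:01 db2agent (idle)
"""


def process_state(ps_aux_text, pid):
    """Return (STAT, COMMAND) for the given pid, or None if it is not listed."""
    for line in ps_aux_text.splitlines()[1:]:
        # COMMAND may contain spaces, so split into at most 11 columns.
        cols = line.split(None, 10)
        if len(cols) == 11 and cols[1] == str(pid):
            return cols[7], cols[10]
    return None


# The pid would come from the pid/thread column of "list applications show detail".
print(process_state(SAMPLE_PS_AUX, 4711))
```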
32 Process Monitoring before V9.5 Matching of applications in status UOW Executing to running processes:
33 Thread Monitoring since V9.5
list applications show detail -> pid/thread (= EDU ID) -> db2pd -edus -> Kernel TID (= TID) -> ps -eL -o THREAD,tid,status,state | grep <TID>
34 Thread Monitoring since V9.5
Identify the thread status corresponding to a DB2 application (Version 1, Linux)
1. Get pid/thread for the application: 40
2. Get Kernel Thread ID for the EDU: 9200
3. Get status for TID: S
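The three steps can be chained in a short sketch. It uses the values from the slide (pid/thread 40, Kernel TID 9200) and an illustrative thread listing in the shape that a command like `ps -eL -o tid,state` would print on Linux; that command choice, the sample listing and the function name `thread_state_for_application` are assumptions made for this example.

```python
def thread_state_for_application(pid_thread, edu_to_kernel_tid, ps_threads_text):
    """Follow pid/thread (EDU ID) -> Kernel TID -> thread state."""
    kernel_tid = edu_to_kernel_tid.get(pid_thread)
    if kernel_tid is None:
        return None
    for line in ps_threads_text.splitlines():
        cols = line.split()
        # Skip the header and anything that is not a "<tid> <state>" row.
        if len(cols) >= 2 and cols[0].isdigit() and int(cols[0]) == kernel_tid:
            return cols[1]
    return None


# Step 1: pid/thread from "list applications show detail" (value from the slide).
PID_THREAD = 40
# Step 2: EDU ID -> Kernel TID as shown by "db2pd -edus" (value from the slide).
EDU_TO_KERNEL_TID = {40: 9200}
# Step 3: illustrative thread listing with TID and state columns.
SAMPLE_PS_THREADS = """\
  TID S
 9199 R
 9200 S
"""

state = thread_state_for_application(PID_THREAD, EDU_TO_KERNEL_TID, SAMPLE_PS_THREADS)
print(state)
```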
35 Thread Monitoring since V9.5
Identify the thread status corresponding to a DB2 application (Version 2, AIX)
1. Get pid/thread for the application: 7033
2. Get Kernel Thread ID for the EDU: 2289675
3. Get PID for DB2 system controller process: 950320
4. Get threads running inside the system controller process
5. Get status for TID: S
36 CPU Parallelism
If intra-partition parallelism is activated (DBM CFG: INTRA_PARALLEL = YES), the coordinator agent uses subagents to do the work. Use the application snapshot to identify the subagents:
1. list applications show detail
   Application Handle: 15010
2. get snapshot for application agentid 15010
   EDUs: 82 (Coordinator Agent), 171 (Subagent 1), 166 (Subagent 2)
3. Get the status for all (sub-)agents
37 Linux DB2 Thread Monitor A handy script to monitor DB2 EDUs (Linux).
38 Linux DB2 Thread Monitor The script outputs EDUs which are waiting for I/O or using CPU
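The original script is not reproduced here. As a rough sketch of the same idea, the function below filters a captured thread listing for threads in state D (waiting for I/O) or R (running or runnable). The sample listing, the assumed command `ps -L -p <db2sysc pid> -o pid,tid,state,pcpu,comm`, the EDU name mapping and the name `active_threads` are all assumptions made for this illustration.

```python
# Illustrative listing; columns: PID, TID, state, %CPU, command.
SAMPLE_THREADS = """\
  PID   TID S %CPU COMMAND
 8900  9172 S  0.0 db2sysc
 8900  9183 D  1.5 db2sysc
 8900  9200 R 42.0 db2sysc
"""


def active_threads(ps_text, edu_names=None):
    """Return (tid, edu_name, reason, pcpu) for threads in state D or R."""
    edu_names = edu_names or {}
    hits = []
    for line in ps_text.splitlines():
        cols = line.split()
        # Skip the header line and malformed rows.
        if len(cols) < 4 or not cols[1].isdigit():
            continue
        tid, state, pcpu = int(cols[1]), cols[2], float(cols[3])
        if state in ("D", "R"):
            # R means running or runnable, so "using CPU" is an approximation.
            reason = "waiting for I/O" if state == "D" else "using CPU"
            hits.append((tid, edu_names.get(tid, "?"), reason, pcpu))
    return hits


# Kernel TID -> EDU name, as it could be derived from "db2pd -edus".
EDU_NAMES = {9172: "db2ipccm", 9183: "db2pclnr", 9200: "db2agent"}

for hit in active_threads(SAMPLE_THREADS, EDU_NAMES):
    print(hit)
```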
39 Summary
- The goal is to match the information DB2 provides with the information from OS thread monitoring.
- Use OS thread monitoring to answer questions like:
  - What resources is a DB2 application using?
  - Is my application currently doing work, or is it waiting on other resources?
  - What resources is the DB2 engine using that cannot be seen with "list applications"?
- Analyze the thread status to get deeper insight into an application's activity.
- Especially on AIX, use vmstat, iostat, etc. to analyze the system utilization and identify whether your system is CPU or I/O bound.
- Since DB2 V9.5 the analysis has to be done at thread level; before V9.5 it was done at process level.
40 Q & A
Q: Does the new process implementation impact the database's applications?
A: No. Apart from better performance, it is transparent to applications. ...
41 Contact Information
Mathias Hoffmann
Product Manager of SPEEDGAIN for DB2
Phone +49 (0) 175 263 3636
mathias.hoffmann@itgain.de

ITGAIN GmbH
Vahrenwalder Straße 269a
30179 Hannover / Germany
Phone +49 (0) 511 96 66 817
www.itgain.de

ITGAIN Inc.
1170 Howell Mill Road, Suite 300
Atlanta, GA 30318 / USA
Phone +1 800 618 1686
info@it-gain.com