Moving beyond hardware
These slides represent the work and opinions of the author and do not constitute official positions of any organization sponsoring the author's work. This material has not been peer reviewed and is presented here with the permission of the author. The author assumes no liability for any content or opinion expressed in this presentation and/or use of content herein.
Michael Medin I work with Integration Not system administration Currently at Oracle Corporation But do not worry! I am NOT here to sell you anything I am also A developer An avid fan of open source The author of NSClient++ Trying to convince everyone monitoring is good!
Introduction Integration Applications Monitoring and Application Servers JMX JMX Tools Nagios perspective Zenoss perspective Using JMX Exploring JMX Monitoring JMX Extending JMX
Who monitors their applications today? Who uses JMX Who uses WMI Who uses SNMP Who uses custom built things What else do you use?
What do you monitor? Application Errors? Memory? System Memory? Application Memory? Component memory? Threads? Database pools? Internal Errors? Messages? Non events? Performance? System Performance? Database Performance? Application Performance? Component Performance?
My field of expertise
In this case: FTP Web Services Databases Application servers IPC But often: Files Queues EDIFact EBXML Email messages Network VPN External components etc Metrics: Disks System Load Memory utilisation Database Utilization Performance Processes Business Process Messages
How would you approach something like this? A lot of custom scripts? Use the supplied GUI? Some basic standard and custom things? Maybe simpler to skip monitoring?
Nagios (op5) Grid Control Scripts Files Databases (oracle) Applications FTP Queues Web services Disks Database Utilization Database System Load Performance Application servers Memory utilization BPEL Console BAM (Business Application Monitor) B2B Home Brew Processes Services (some) Business Process EDIFact EBXML RosettaNet Email messages etc
The Tech Support Team Nagios Scripts The DBA Grid Control Scripts The ICC Operations Team Home brew (usually part of the application) BPEL Console BAM B2B
Many systems mean: Many places to look Many people to ask NOT many people who know it is broken (Often no one at all) Hard to: Correlate data Be proactive Find performance bottlenecks Find missing files Debug system wide applications
We spend millions of euro to build one integrated system for business messages. Why not save money using one integrated system for monitoring?
In other words, the boring stuff
Still not convinced?
A network administrator changes a firewall rule Performed during the service window The service window is 23:00 -> 02:00 (Friday) The network admin Checks the monitoring and checklists Everything is green! Goes home happy (grabs a beer on the way home) The integration staff is not on call (only 8-17) An integration service breaks (as a database pool gets corrupted) No one sees this (no one is checking the BPEL Console) This is discovered on Monday at 09:00 Why?
What can you monitor?
Application memory Make sure the application has available memory (not the same as the system memory usage) Threads Make sure we have active threads But not too many Sessions Make sure the application is used But not too much Response Time How long are requests taking HTTP, EJB, Database, etc etc
Database pool Make sure the application can access the database Make sure the database is being used Transactions How long did transactions take How many transactions are committed How many are rolled back Queues How many messages are waiting (are the queues filling up) And these are just a few examples Your application's needs will be different
What is an application?
It depends on who you ask 20 years ago I would have said: This assembler thing I wrote 10 years ago I would have said: This C++ thing we wrote. Now I usually answer: this J2EE system This means Java (and .Net) And they usually run in an application server Java: Inside a Java J2EE Application server JBoss, WLS, WebSphere, OC4J, Orion, etc etc... .Net: Inside the .Net framework Microsoft .Net Framework
Main tasks are to: Manage applications Provide resources Deliver utility functions Manage means to manage; State Resources Configuration Connections (database etc) Pools (thread, memory, database, etc) Caches etc Wait Is this not what we want to monitor?
JDBC Database connections (and pools) JMS Queues (and such) JTA Transactions <Random acronym here> Whatever buzzword is popular today And the best thing is: All this is provided and managed by the application server
Out of the box we can monitor: Database connections Database pools Queues Transactions Etc etc etc etc etc etc (yes, there is a lot more) Almost all resources are managed! So we can out of the box monitor most things Now if only there was a way to monitor this
The default and standard way is: JMX Other options; SNMP Web Services Scripts Various proprietary solutions (often provided by the Application Server) Custom Written things But JMX is the standard way in Java And it works everywhere!
.Net provides the same thing WMI (Windows Management Instrumentation) (but not as extensive)
Java Management Extensions
Java Management Extensions API for management communication Think of it as SNMP or WMI but for Java
JMX Consists of: Agent A repository of MBeans Client The user (in your case check_jmx) MBeans Providers (in your case server components) Which can be monitored
Managed Bean Accessed through the Agent You can: Probe information Change parameters Execute administrative functions Subscribe to events There is a lot more But this is all you really need to know
Attributes Set Write attributes Get Read attributes Operations Execute Run commands Notifications Subscribe Subscribe to events Nagios Check commands Read attributes Event handlers Execute operations Write attributes Passive Checking (NSCA) Notifications
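The get, set and execute primitives listed above can be sketched with the standard `javax.management` API. This is only a minimal illustration against the local platform MBean server; the class name `MBeanAccessSketch` and its helper methods are hypothetical names chosen for this example, and subscribing to notifications is left out to keep it short.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class MBeanAccessSketch {
    /** Get: read one attribute of an MBean. */
    static Object readAttribute(MBeanServer server, String objectName, String attribute)
            throws Exception {
        return server.getAttribute(new ObjectName(objectName), attribute);
    }

    /** Set: write one attribute of an MBean. */
    static void writeAttribute(MBeanServer server, String objectName, String attribute,
            Object value) throws Exception {
        server.setAttribute(new ObjectName(objectName), new Attribute(attribute, value));
    }

    /** Execute: run an operation that takes no arguments. */
    static Object runOperation(MBeanServer server, String objectName, String operation)
            throws Exception {
        return server.invoke(new ObjectName(objectName), operation,
                new Object[0], new String[0]);
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // What a check command would do: read a value.
        System.out.println("ThreadCount = "
                + readAttribute(server, "java.lang:type=Threading", "ThreadCount"));
        // What an event handler might do: run an administrative operation.
        runOperation(server, "java.lang:type=Memory", "gc");
    }
}
```

In Nagios terms, `readAttribute` corresponds to a check command, while `writeAttribute` and `runOperation` are the kind of calls an event handler would make.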
A big list of objects. Will change live as the application runs BUT not always the same repository For instance: WLS uses their own
A managed object Has metadata (descriptions) Identified by: A name Usually key/value pairs For instance: Name=Something, Type=Something, etc etc A part of your application (server) Delivering live metrics
The JMX specification does NOT support remote checking! Usually this is circumvented by using RMI The sad word here is Usually! Remember There are other options than JMX For instance SNMP agent
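As a rough sketch of the usual RMI route, the standard `javax.management.remote` classes can open a connection to a remote JVM and read an attribute. The class `RemoteJmxSketch` and its helpers are hypothetical names for this example; the URL layout is the common `jmxrmi` form, and it assumes the target JVM was started with remote JMX enabled on that host and port.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class RemoteJmxSketch {
    /** Builds the common RMI-based JMX service URL for a host and port. */
    static String rmiUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Reads the used heap in bytes through any MBean server connection. */
    static long usedHeap(MBeanServerConnection connection) throws Exception {
        CompositeData usage = (CompositeData) connection.getAttribute(
                new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
        return (Long) usage.get("used");
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 2) {
            // Remote: args are <host> <port> of a JVM with remote JMX enabled.
            JMXServiceURL url = new JMXServiceURL(rmiUrl(args[0], Integer.parseInt(args[1])));
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                System.out.println(usedHeap(connector.getMBeanServerConnection()));
            }
        } else {
            // Local fallback: the same code works against this JVM's own MBean server.
            System.out.println(usedHeap(ManagementFactory.getPlatformMBeanServer()));
        }
    }
}
```

The same `usedHeap` method serves both cases because a local `MBeanServer` is also an `MBeanServerConnection`.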
How to monitor Monitoring
Nagios is script driven Means you have a check_xxx for everything check_cpu check_mem check_nrpe check_... Application Server Exposes JMX (Java) WMI (.Net) So you need a check_wmi and a check_jmx!
Generic JMX Plugin Jmxquery/Syabru/* (check_jmx) check_jmx4perl JBoss Plugin JBossNagiosPlugin Monju Others (non JMX) JSend NSCA
Written in Python/Java Remote or local JMX connection Forks Java (once per check) Only > checks Does not support custom protocols Without modification
Notes: Always use -vvvv when testing, or no errors will be displayed
Command syntax:
check_jmx -U <URL> -O <Object> -A <Attribute> -w <warning> -c <critical>
Sample command (from -help):
check_jmx -U service:jmx:rmi:///jndi/rmi://localhost:8888/jmxrmi -O java.lang:type=Memory -A NonHeapMemoryUsage -K used -w 100000000 -c 128000000
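The "only > checks" behaviour of such a plugin boils down to comparing one number against two thresholds and mapping the result to the standard Nagios plugin states (0 = OK, 1 = WARNING, 2 = CRITICAL). The sketch below shows only that logic; `ThresholdCheckSketch` is a hypothetical name, and the output line is an illustration in the usual plugin style, not the actual output format of check_jmx.

```java
import java.lang.management.ManagementFactory;

class ThresholdCheckSketch {
    static final int OK = 0, WARNING = 1, CRITICAL = 2;
    private static final String[] NAMES = {"OK", "WARNING", "CRITICAL"};

    /** Greater-than check: critical wins over warning. */
    static int status(long value, long warning, long critical) {
        if (value > critical) return CRITICAL;
        if (value > warning) return WARNING;
        return OK;
    }

    /** One status line followed by performance data (label=value;warn;crit). */
    static String report(String label, long value, long warning, long critical) {
        return "JMX " + NAMES[status(value, warning, critical)] + " - "
                + label + "=" + value + " | "
                + label + "=" + value + ";" + warning + ";" + critical;
    }

    public static void main(String[] args) {
        // Same metric and thresholds as the sample command: non-heap memory in use.
        long used = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage().getUsed();
        System.out.println(report("used", used, 100000000L, 128000000L));
        // A real plugin would finish by passing the status to System.exit(...).
    }
}
```

A check where low values are bad (free memory, for instance) would need the comparison reversed, which is exactly what a plugin limited to > checks cannot express.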
Command: jbossjmx_plugin Written in Perl Only local checks (i.e. NRPE/SSH) Forks Java (once per check) Only > checks Presumably works with other servers but requires JBoss
Notes: The version is only used to find JBoss
Command syntax:
jbossjmx_plugin <JBoss Server URL> <JBoss Version 3 4> <JBoss MBean Object Name> <JBoss MBean Attribute> <Warn Threshold> <Critical Threshold>
Sample command:
jbossjmx_plugin 127.0.0.1 3 jboss.jca:name=oracledsn,service=managedconnectionpool InUseConnectionCount 20 50
Written in Python (check) and Java (servlet) Based on a servlet (agent) No forking (requires deploy on a webserver) Only > checks Pretty hard to write queries Presumably works with non-JBoss servers
Notes: Very complex and hard to use
Command syntax:
./check_jboss.sh 'nagios/jmx?obj_name=<object name>&attribute=$<attribute name>&is_int=<integer>&warn=<warn value>&crit=<crit value>'
Sample command:
check_jboss.sh 'nagios/jmx?obj_name=jboss.system:type=ServerInfo&attribute=FreeMemory&is_int=true&warn=100000000&crit=50000000'
Written in Perl (check) and Java (servlet) Based on a servlet (agent) No forking (requires deploy on a webserver)
Notes: The URL is NOT a JMX URL! (it points to the j4p Java component)
Command syntax:
check_jmx4perl --url <j4p URL> --name <alias> --mbean <MBean> --attribute <attribute> --path <part> --critical <crit> --warning <warn>
Sample command:
check_jmx4perl --url http://localhost:8888/j4p --name memory_used --mbean java.lang:type=Memory --attribute HeapMemoryUsage --path used --critical 10000000 --warning 5000000
Jmx4perl The good: Powerful syntax Nagios standard No forking The bad: Requires a custom component on the server No security model check_jmx The good: Java Standard Standard security model The bad: Very basic usage Requires Java fork on check
Check_perfect_jmx Local Java agent (one process) With a simple socket/nrpe/* interface Which connects to Java agents via RMI/* And checks remotely
The Zenoss perspective
First off I have only looked into this briefly! Never managed to figure out how Zenoss works So I could be wrong Relies on Java Runs in the background (as a daemon locally) One daemon per target server Does not support events
Finding things to monitor
JConsole Comes with Java (1.5 and later) Can attach to local processes Without any configuration Can attach to remote processes Given a connection string Third party browsers Some Application Servers have them built in
default repository (18) WLS repository (4500)
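What a browser such as JConsole shows can also be listed programmatically, which is handy when hunting for things to monitor. The following is a minimal sketch against the local platform MBean server; `MBeanBrowserSketch` and its method names are hypothetical, and the number of MBeans printed will vary with the JVM and application server.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class MBeanBrowserSketch {
    /** Returns the names of every MBean registered in the repository, sorted. */
    static Set<String> listNames(MBeanServer server) {
        Set<String> names = new TreeSet<>();
        // A null name and a null query mean "everything".
        for (ObjectName name : server.queryNames(null, null)) {
            names.add(name.getCanonicalName());
        }
        return names;
    }

    /** Returns the attribute names one MBean exposes, taken from its metadata. */
    static Set<String> listAttributes(MBeanServer server, String objectName) throws Exception {
        Set<String> attributes = new TreeSet<>();
        for (MBeanAttributeInfo info
                : server.getMBeanInfo(new ObjectName(objectName)).getAttributes()) {
            attributes.add(info.getName());
        }
        return attributes;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Set<String> names = listNames(server);
        System.out.println(names.size() + " MBeans in the default repository");
        for (String name : names) {
            System.out.println(name + " -> " + listAttributes(server, name));
        }
    }
}
```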
MBean name: com.bea:Name=soa_server1,Type=ServerLifeCycleRuntime Breaking it down we get com.bea Namespace Name=soa_server1 Property (name) Type=ServerLifeCycleRuntime Property (type) This looks like a query but it is not a query! com.bea:Type=ServerLifeCycleRuntime (does not work)
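The difference between a full name and a query can be tried out with `javax.management.ObjectName`. In this sketch the WLS name from the breakdown above is only parsed, never looked up, so no WebLogic server is needed; `ObjectNameSketch` and its `matches` helper are hypothetical names. A partial name only starts to behave like a query once it is written as a pattern, for example with a trailing `,*`.

```java
import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

class ObjectNameSketch {
    /** True if the given name is selected by the given name or pattern. */
    static boolean matches(String nameOrPattern, String fullName) throws Exception {
        return new ObjectName(nameOrPattern).apply(new ObjectName(fullName));
    }

    public static void main(String[] args) throws Exception {
        String full = "com.bea:Name=soa_server1,Type=ServerLifeCycleRuntime";
        ObjectName name = new ObjectName(full);
        System.out.println("namespace = " + name.getDomain());
        System.out.println("Name      = " + name.getKeyProperty("Name"));
        System.out.println("Type      = " + name.getKeyProperty("Type"));

        // Leaving a property out gives a different name, not a query.
        System.out.println(matches("com.bea:Type=ServerLifeCycleRuntime", full));
        // A pattern with a wildcard does select the full name.
        System.out.println(matches("com.bea:Type=ServerLifeCycleRuntime,*", full));

        // Patterns are also what queryNames expects, here against the local JVM.
        System.out.println(ManagementFactory.getPlatformMBeanServer()
                .queryNames(new ObjectName("java.lang:type=MemoryPool,*"), null));
    }
}
```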
A quick demo
Some useful things to monitor
Heap space (what usually fills up) java.lang:type=Memory getHeapMemoryUsage() Memory Pools java.lang:type=MemoryPool,name=<pool's name> Eden Space (Heap Memory) Pool from which memory is initially allocated for most objects Survivor Space (Heap Memory) Pool containing objects that have survived GC of eden space. Tenured Generation (Heap Memory) Pool containing objects that have existed for some time in the survivor space. Permanent Generation (Non-Heap) Holds all the reflective data of the virtual machine itself, such as class and method objects. With JVMs that use class data sharing, this generation is divided into read-only and read-write areas.
Thread Pools java.lang:type=Threading ThreadCount Number of current threads PeakThreadCount Maximum thread count
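Inside the JVM itself, the same memory and thread figures are available through the typed platform MXBeans in `java.lang.management`. The sketch below collects them into a map; `JvmMetricsSketch` and the metric key names are hypothetical, and the memory pool names that show up depend on the JVM version and garbage collector, so they will not necessarily match the pool names listed above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

class JvmMetricsSketch {
    /** Collects heap, memory pool and thread figures from the running JVM. */
    static Map<String, Long> collect() {
        Map<String, Long> metrics = new LinkedHashMap<>();

        // java.lang:type=Memory
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        metrics.put("heap.used", heap.getUsed());
        metrics.put("heap.max", heap.getMax()); // -1 when no maximum is defined

        // java.lang:type=MemoryPool,name=...
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                metrics.put("pool." + pool.getName() + ".used", usage.getUsed());
            }
        }

        // java.lang:type=Threading
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        metrics.put("threads.current", (long) threads.getThreadCount());
        metrics.put("threads.peak", (long) threads.getPeakThreadCount());
        return metrics;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```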
Will be different depending on your application server as will most other things unfortunately
What your applications rely on is nothing I know; you need to talk to your application developers! But the way to do it is JMX! (or WMI for Windows)
Monitoring YOUR applications
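Writing custom MBeans is very simple Always better than a home brew solution Ask your developers
Simple example:
// HelloMBean.java: the management interface (must be named <class>MBean)
public interface HelloMBean {
    String getName();
}
// Hello.java: the managed object, exposes a read-only attribute "Name"
public class Hello implements HelloMBean {
    public String getName() { return "Hello World"; }
}
// Sample.java: registers the MBean in the platform MBean server
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class Sample {
    public static void main(String[] args) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example:type=Hello");
        mbs.registerMBean(new Hello(), name);
        Thread.sleep(Long.MAX_VALUE); // keep the JVM alive so JConsole can attach
    }
}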
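A slightly more monitoring-oriented variation might expose counters and a reset operation, which is the kind of thing a check command could read and an event handler could call. Everything here is hypothetical and chosen only for illustration: the `OrderStats` bean, its attributes and the `com.example:type=OrderStats` name. The MBean interface is kept public because the JMX introspector expects that.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class OrderStatsSketch {
    /** Management interface: getters become attributes, other methods operations. */
    public interface OrderStatsMBean {
        long getProcessedCount();
        long getFailedCount();
        void reset();
    }

    /** The application component, counting its own work in a thread-safe way. */
    public static class OrderStats implements OrderStatsMBean {
        private final AtomicLong processed = new AtomicLong();
        private final AtomicLong failed = new AtomicLong();

        void recordSuccess() { processed.incrementAndGet(); }
        void recordFailure() { processed.incrementAndGet(); failed.incrementAndGet(); }

        @Override public long getProcessedCount() { return processed.get(); }
        @Override public long getFailedCount() { return failed.get(); }
        @Override public void reset() { processed.set(0); failed.set(0); }
    }

    /** Creates the bean and registers it under the given object name. */
    static OrderStats register(String objectName) throws Exception {
        OrderStats stats = new OrderStats();
        ManagementFactory.getPlatformMBeanServer()
                .registerMBean(stats, new ObjectName(objectName));
        return stats;
    }

    public static void main(String[] args) throws Exception {
        OrderStats stats = register("com.example:type=OrderStats");
        stats.recordSuccess();
        stats.recordFailure();

        // Read the values back the way a monitoring client would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example:type=OrderStats");
        System.out.println("ProcessedCount = " + server.getAttribute(name, "ProcessedCount"));
        System.out.println("FailedCount = " + server.getAttribute(name, "FailedCount"));
    }
}
```

With this in place, a threshold on `FailedCount` is just another attribute check, with no home brew log parsing required.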
Wrapping up
Application Monitoring is fun JMX is pretty simple JMX is very powerful JMX is better JMX is the standard For Windows replace JMX with WMI above! Talk to your developers!
Questions/Thoughts/Ideas?
michael@medin.name http://www.linkedin.com/in/mickem http://www.medin.name Information about NSClient++ http://nsclient.org Slides, and examples at: http://nsclient.org/nscp/conferances/omc-2009/