Java Memory Leaks: Detecting Memory Leaks across Multiple VMs
Albert Mavashev, CTO, Nastel Technologies, Inc.
amavashev@nastel.com
Memory Perspectives
  OS memory perspective
    - OS memory footprint: memory used, threads, handles
    - May expand independently of the JVM heap
  VM memory perspective
    - VM memory footprint: Java heap, GC, heap used %, max heap
    - Subject to GC activity
  How the two relate
    - Expansion of the heap causes expansion of the OS footprint
    - Expansion of the OS footprint does not always mean expansion of the heap
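The VM-side numbers on this slide can be read from inside the JVM with the standard java.lang.management API. The sketch below is illustrative only: the class name VmSnapshot is hypothetical, and it covers heap and threads but not handles or the OS-level footprint, which the standard API does not expose portably.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: a point-in-time view of the VM-side resource perimeter. */
final class VmSnapshot {
    final long heapUsed;       // bytes currently occupied in the heap
    final long heapCommitted;  // bytes the JVM has obtained from the OS for the heap
    final long heapMax;        // upper bound on the heap, or -1 if undefined
    final int liveThreads;     // live threads, daemon and non-daemon

    private VmSnapshot(long used, long committed, long max, int threads) {
        this.heapUsed = used;
        this.heapCommitted = committed;
        this.heapMax = max;
        this.liveThreads = threads;
    }

    /** Reads the current values from the platform MXBeans. */
    static VmSnapshot take() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return new VmSnapshot(heap.getUsed(), heap.getCommitted(),
                heap.getMax(), threads.getThreadCount());
    }
}
```

Comparing heapCommitted over time with the process size reported by the OS is one way to see the case where the OS footprint grows while the heap does not.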
Example: Max Heap Allocated Upfront
  OS memory perspective (memory used, threads, handles)
    - OS memory footprint: constant, no growth
  VM memory perspective (90% heap free, 10% heap used)
    - VM memory footprint: heap used and heap free are the only fluctuating parameters
Example: Floating Heap
  OS memory perspective (memory used, threads, handles)
    - OS memory footprint: expands or contracts with heap usage
  VM memory perspective (45% heap allocated, 45% heap available for expansion, 10% heap used)
    - VM memory footprint: heap used and heap free are the only fluctuating parameters
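The percentage splits in the two examples can be derived from three heap figures: used, committed (allocated from the OS), and max. The sketch below is an assumption about how the slide's numbers are composed: it reads the 45/45/10 split as 45% allocated but free, 45% not yet allocated, and 10% used. The class name HeapBreakdown is hypothetical.

```java
/** Hypothetical helper: splits the maximum heap into used, free, and expandable shares. */
final class HeapBreakdown {
    final double usedPct;        // share of max heap currently in use
    final double freePct;        // share allocated from the OS but not in use
    final double expandablePct;  // share not yet allocated from the OS

    HeapBreakdown(long used, long committed, long max) {
        if (max <= 0 || used < 0 || committed < used || max < committed) {
            throw new IllegalArgumentException("expected 0 <= used <= committed <= max, max > 0");
        }
        usedPct = 100.0 * used / max;
        freePct = 100.0 * (committed - used) / max;
        expandablePct = 100.0 * (max - committed) / max;
    }
}
```

With the heap allocated upfront, expandablePct stays at 0 and only usedPct and freePct move. With a floating heap, expandablePct shrinks as the heap grows into it.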
Resource Leaks: Typical Symptoms
  - OutOfMemoryError
      - Not always a leak: could be due to heap sizing issues
      - The threads reporting the error are not always the ones leaking memory
  - Increasing GC activity
      - Watch GC frequency and GC duration
  - Obvious: increasing heap usage
  - Not so obvious: thread leaks, handle leaks, JDBC statement leaks, ClassLoader leaks
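GC frequency and duration can be watched by sampling the cumulative counters that the JVM exposes through GarbageCollectorMXBean and comparing two samples. This is a minimal sketch, not a full monitor; the class name GcSample and its methods are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: cumulative GC counters at one point in time. */
final class GcSample {
    final long collections;  // total collections so far, summed over all collectors
    final long totalMillis;  // approximate accumulated collection time in ms

    GcSample(long collections, long totalMillis) {
        this.collections = collections;
        this.totalMillis = totalMillis;
    }

    /** Sums the counters of every collector; a collector reporting -1 (undefined) counts as 0. */
    static GcSample take() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new GcSample(count, millis);
    }

    /** Number of collections between an earlier sample and this one (GC frequency). */
    long collectionsSince(GcSample earlier) {
        return collections - earlier.collections;
    }

    /** Average collection time in ms over the interval, or 0 if no collection ran. */
    double avgDurationMillisSince(GcSample earlier) {
        long n = collectionsSince(earlier);
        return n == 0 ? 0.0 : (double) (totalMillis - earlier.totalMillis) / n;
    }
}
```

Taking a sample at a fixed interval and tracking both deltas gives the two series named on the slide: collections per interval and average duration per collection.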
Typical Causes of Java Leaks
  - Programming errors, bugs (within the VM)
      - Unchecked array, list, and hash map growth
      - Not closing JDBC prepared statements
      - Not closing sockets and file handles
      - Thread leaks, handle leaks
      - Class loader leaks
  - Outside of the VM perimeter
      - Resources allocated outside the JVM: handles, threads, semaphores, memory
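Two of the in-VM causes above have well-known countermeasures: capping the size of a collection, and closing resources with try-with-resources. The sketch below shows one possible form of each. All class names are hypothetical, and TrackedResource is only a stand-in for a real socket, file handle, or JDBC statement.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Bounded map: evicts the least recently accessed entry once the cap is exceeded. */
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU eviction
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

/** Hypothetical stand-in for a socket, file handle, or JDBC statement. */
final class TrackedResource implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger(); // resources not yet closed

    TrackedResource() {
        OPEN.incrementAndGet();
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

final class LeakFixes {
    /** try-with-resources closes the resource even if the body throws. */
    static void useAndRelease() {
        try (TrackedResource resource = new TrackedResource()) {
            // work with the resource here
        }
    }
}
```

The same try-with-resources pattern applies to java.sql.PreparedStatement and java.net.Socket, both of which implement AutoCloseable.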
Class Loader Leaks
  - Typically caused by application redeployment
  - The old class loader is discarded but still holds on to class definitions and static fields
  - A new class loader is instantiated and a new set of classes is reloaded
  [Diagram: the old class loader (leak) and the new class loader each hold UserObject.class and its object instances]
Typical Remedies
  - Find the root cause and fix the code
      - Often difficult to troubleshoot
      - Hard to trace leaks within third-party libraries
  - Restart JVM(s) to reset resource usage
      - Fail over to the secondary instance
      - Application server clustering/failover
      - Viable only if leaks are non-aggressive (slow growing)
  - Early detection/warning of possible leaks
      - Time buffer to decide on the proper remedy
      - Allows time for diagnostics with minimal downtime
      - Avoids crisis situations and the performance degradation associated with resource exhaustion
Detecting Resource Leaks across Multiple VMs
  - Determine a detection model for a single VM
  - Apply the model across multiple VMs (by induction)
  - Monitor the OS resource perimeter
      - Memory footprint, handles, threads
  - Monitor the VM resource perimeter
      - Heap allocated, max, used
      - GC activity: frequency, duration
  - For each resource metric (memory, GC, handles, threads, etc.)
      - Measure momentum (rate and size of advances vs. declines)
      - Momentum oscillator
  - Resource leaks cannot be determined simply by exceeding some predefined threshold
  [Diagram: four VM heap breakdowns: 45% / 45% / 10%; 40% / 40% / 20%; 35% / 60% / 5%; 30% / 20% / 50%]
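Measuring advances and declines requires a short history of readings for every metric on every VM. One way to hold that history is a fixed-size window per metric, filled by a periodic sampler. The sketch below assumes this design; the class name MetricHistory and the use of a DoubleSupplier as the reading source are illustrative choices, not part of the original model.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.DoubleSupplier;

/** Hypothetical per-metric history: keeps the most recent readings of one resource metric. */
final class MetricHistory {
    private final String name;            // e.g. "vm1.heapUsedPct" (illustrative naming)
    private final DoubleSupplier source;  // where a reading comes from
    private final int capacity;           // number of readings to retain
    private final Deque<Double> samples = new ArrayDeque<>();

    MetricHistory(String name, DoubleSupplier source, int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.name = name;
        this.source = source;
        this.capacity = capacity;
    }

    /** Takes one reading, dropping the oldest when full; meant to be called at a fixed interval. */
    synchronized void sample() {
        if (samples.size() == capacity) {
            samples.removeFirst();
        }
        samples.addLast(source.getAsDouble());
    }

    /** Returns the retained readings, oldest first. */
    synchronized double[] values() {
        double[] out = new double[samples.size()];
        int i = 0;
        for (double v : samples) {
            out[i++] = v;
        }
        return out;
    }

    String name() {
        return name;
    }
}
```

Applying the single-VM model across many VMs then amounts to keeping one MetricHistory per VM and metric and evaluating each window the same way.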
Leaking Chart Pattern: Detecting Resource Accumulation
  [Chart: VM heap usage %]
Leaking Chart Pattern: Detecting GC Duration (ms)
  [Chart: GC duration]
Detecting Resource Leaks Using a Momentum Oscillator
  - Momentum oscillator: a value between 0 and 100, derived from the difference between the sum of all recent gains and the sum of all recent losses in the underlying metric
  - A value of 50 means that the net difference of gains and losses is zero
  [Chart annotations: heap not yet exhausted; leak pattern detected; momentum oscillator trending higher]
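The slide does not give the exact formula, so the sketch below uses one formula that matches the description: the net of gains minus losses, divided by their total, rescaled from the range -100..100 to 0..100 (a rescaled form of the Chande Momentum Oscillator). Treat the formula, the class name, and the choice to return 50 for flat or too-short input as assumptions.

```java
/** Hypothetical momentum oscillator over a window of metric readings. */
final class MomentumOscillator {
    /**
     * Returns a value from 0 to 100: 50 when gains and losses cancel out,
     * above 50 when advances outweigh declines, below 50 otherwise.
     */
    static double compute(double[] samples) {
        double gains = 0.0;   // sum of all increases between consecutive readings
        double losses = 0.0;  // sum of all decreases, as a positive number
        for (int i = 1; i < samples.length; i++) {
            double delta = samples[i] - samples[i - 1];
            if (delta > 0) {
                gains += delta;
            } else {
                losses -= delta;
            }
        }
        double total = gains + losses;
        if (total == 0.0) {
            return 50.0; // flat or too-short series: no net movement (assumption)
        }
        return 50.0 + 50.0 * (gains - losses) / total;
    }
}
```

A healthy heap that rises and is fully reclaimed by GC, such as 10, 12, 10, 12, 10, scores 50. A series whose declines never give back the full advance, such as 10, 12, 11, 13, 12, scores about 66.7, even though absolute usage is still low.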
Example: Memory Diagnostics
Early Warning System for JVMs: Summary
  - Monitor critical resource indicators
      - Memory usage, GC activity, handles, threads
  - For each one, measure a momentum oscillator
      - Fluctuates between 0 and 100; anything above 60 indicates advances outpacing declines by a significant margin
      - A higher number indicates a more aggressive resource leak; 80 indicates very aggressive resource growth, regardless of the actual usage numbers for each metric
  - Alert, notify, or act when one or more momentum oscillators breach a threshold
      - Time to avoid downtime, provision, diagnose
      - Time to avoid crisis mode and performance degradation
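The alerting rule in this summary can be sketched as a simple classification over the latest oscillator value of each metric. The thresholds 60 and 80 come from the slide. The level names, the class name LeakAlerts, and the treatment of exactly 80 as the aggressive level are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical alert rule over per-metric momentum oscillator values. */
final class LeakAlerts {
    enum Level { NORMAL, SUSPECTED, AGGRESSIVE }

    /** Maps one oscillator value (0 to 100) to an alert level. */
    static Level classify(double oscillator) {
        if (oscillator >= 80.0) {
            return Level.AGGRESSIVE; // very aggressive resource growth
        }
        if (oscillator > 60.0) {
            return Level.SUSPECTED;  // advances outpacing declines by a significant margin
        }
        return Level.NORMAL;
    }

    /** Returns only the metrics whose oscillator breaches a threshold, in input order. */
    static Map<String, Level> breaches(Map<String, Double> oscillators) {
        Map<String, Level> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : oscillators.entrySet()) {
            Level level = classify(entry.getValue());
            if (level != Level.NORMAL) {
                out.put(entry.getKey(), level);
            }
        }
        return out;
    }
}
```

A non-empty result from breaches is the trigger to alert, notify, or act. The absolute usage of each metric plays no part in the decision.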
For More Information
  Visit us at: www.nastel.com
  Questions: info@nastel.com
  Twitter: twitter.com/nastel
  Facebook: facebook.com/nasteltechnologies
  LinkedIn: linkedin.com/companies/nastel technologies
  Phone: +1.800.580.2344