A Practical Method to Diagnose Memory Leaks in Java Application
Alan Yu




1. Introduction

The Java virtual machine's heap stores all objects created by a running Java application. Objects are created by the program but are never freed explicitly by the code. Garbage collection is the process of automatically freeing objects that are no longer referenced by the program. The garbage collector eliminates memory-related errors such as dangling pointers and leaks caused by lost pointers. However, a memory leak may still occur when a Java program maintains references to objects that are no longer needed, preventing the garbage collector from reclaiming the space. In the worst case, the unnecessary references refer to a growing data structure, parts of which are no longer in use. Leaks of this type can eventually cause the program to run out of memory and crash. In long-running programs, even small leaks can cause significant performance problems after days or weeks.

With automatic garbage collection, memory leaks are relatively difficult to diagnose because programmers have less control over memory allocation. Many garbage collectors also move objects and change their addresses to avoid heap fragmentation, which makes it harder to track a particular object instance.

As we will see in Section 2, a number of tools exist that help the user look inside the black box to determine the root cause of a leak. However, using these tools to solve memory leaks in large Java applications is tricky. The authors of [1] summarized the difficulties they encountered when diagnosing leaks in large Java applications:

Perturbation: Acquiring a full heap dump can cause a system with a large heap to pause for tens of seconds. Tracking the call stack of every allocation introduces unacceptable overhead, reducing the throughput of the application by a factor of five to ten. For servers, these slowdowns or pauses can cause timeouts, significantly changing the behavior of the application.

Noise: Given a persisting object, it is difficult to determine whether it has a legitimate reason for persisting. For example, caches and resource pools intentionally retain objects for long periods of time, even though the objects may no longer be needed.

Data Structure Complexity: Knowing the type of the predominant leaking object, often a low-level type such as String, does not help explain why the leak occurs. Presented only with the context of low-level leaking objects, it is easy to get lost while trying to extract a reason for the leak.

This article proposes a method to address these three problems. It provides a step-by-step guide to diagnosing memory leaks with commonly available tools.

2. Tools

There are a number of tools available to diagnose memory issues in Java applications. Here are a few that we find useful.

2.1 JDK Troubleshooting Tool

Various diagnostic and monitoring tools are shipped with the Java Platform, Standard Edition Development Kit (JDK) [2]. The jmap and jhat utilities are generally used to analyze memory issues. The jmap command-line utility prints memory-related statistics for a running VM or a core file. It can also dump the Java heap in binary HPROF format to a specified file. The jhat tool provides a convenient means to browse the object topology in a heap snapshot. It parses a heap dump in binary format, for example one produced by jmap, and provides a number of standard queries to find unnecessary object retention.

2.2 YourKit Java Profiler

YourKit is a tool for CPU and memory profiling [3]. It integrates several useful functions to gather and analyze heap information. YourKit is a commercial product and can be purchased from www.yourkit.com.

2.3 IBM HeapAnalyzer

IBM HeapAnalyzer [4] is a graphical tool for discovering possible Java heap leaks. It finds possible leak areas by analyzing a Java heap dump with a heuristic search engine: it parses the heap dump, creates directional graphs, transforms them into directional trees, and runs the heuristic search on the result.

2.4 Memory Analyzer Tool (MAT)

The Eclipse Memory Analyzer is a fast and feature-rich Java heap analyzer that helps you find memory leaks and reduce memory consumption [5]. Like IBM HeapAnalyzer, MAT is a convenient GUI tool for examining a heap dump.

3. Method

3.1 Steps

A typical procedure for diagnosing memory leaks includes the following steps:

a) Spot the memory leak signature.
b) Look for a set of candidate data structures or objects that are likely to have problems.
c) Identify the root cause in the code.

Memory leaks are relatively easy to spot. Turning on the GC log and monitoring the heap size after each round of GC is the most common way to do so. Normally, with a memory leak, a downward-sawtooth pattern of free space (every collection frees less and less) is observed [1, 6] until the application runs into out-of-memory errors. It is also very helpful if the pattern can be reproduced by a particular set of operations (such as a few test cases).

Finding the leak candidates is often a little more difficult. Section 3.2 discusses several methods in more detail.
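
The downward-sawtooth check described above can be automated once the free-heap value after each collection has been extracted from the GC log (for example, a log produced with -verbose:gc). The following is a minimal sketch, not part of the original method. The class name GcTrend, the method looksLikeLeak, and the minFraction parameter are hypothetical names chosen for illustration, and the rule itself (most collections free less than the previous one, with a net loss overall) is only one simple way to express the pattern.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: flags a leak signature from post-GC free-heap readings. */
class GcTrend {

    /**
     * Returns true when the free space measured after each GC shrank in at
     * least minFraction of the consecutive collections and the last reading
     * is lower than the first one.
     */
    static boolean looksLikeLeak(List<Long> freeAfterGcBytes, double minFraction) {
        int n = freeAfterGcBytes.size();
        if (n < 3) {
            return false; // too few collections to call it a trend
        }
        int drops = 0;
        for (int i = 1; i < n; i++) {
            if (freeAfterGcBytes.get(i) < freeAfterGcBytes.get(i - 1)) {
                drops++;
            }
        }
        boolean netLoss = freeAfterGcBytes.get(n - 1) < freeAfterGcBytes.get(0);
        return netLoss && drops >= minFraction * (n - 1);
    }

    public static void main(String[] args) {
        // Illustrative readings in MB, not taken from a real log.
        List<Long> leaking = Arrays.asList(900L, 850L, 860L, 800L, 760L, 700L);
        List<Long> healthy = Arrays.asList(900L, 910L, 895L, 905L, 900L, 908L);
        System.out.println(looksLikeLeak(leaking, 0.7));
        System.out.println(looksLikeLeak(healthy, 0.7));
    }
}
```

A fraction below 1.0 tolerates the occasional collection that frees slightly more than the previous one, which is common even in a leaking application.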

The last task, locating the bug in the code, is much harder, especially for anyone other than the developers who own the code. A heap dump usually does not contain allocation information for each object unless it is explicitly enabled. Enabling allocation tracking introduces heavy overhead, especially for a large application. Moreover, it swamps the user with low-level detail about individual objects, so analyzing the heap dump requires a lot of expertise and time. A more lightweight approach is introduced in Section 3.3; it provides sufficient information for the developers to find the root cause of a leak.

3.2 Finding Leak Candidates

There are several ways to determine the leaking objects. One of the most common is to acquire a heap dump, manually or automatically, when an out-of-memory error occurs. The engineer can then use offline tools [3, 4, 5] to analyze the dump file and find the object(s) with the biggest retained size. The retained heap of an object X is the sum of the sizes of all objects kept alive by X. The main disadvantage of this method is that it can produce false-negative results because of the noise described in Section 1.

Another approach is based on heap differencing [3, 8]. The basic idea is to take two snapshots of the heap, one before and one after the problem operation. The user can then distinguish the old objects, which existed before the operation, from the new objects, which were created during the operation and were not released at the end. The drawback of this method is that a large number of objects (false positives) may be created during the period, and examining them takes a lot of time.

This article proposes a new method. It relies on the simple assumption that the number of leaking objects increases continuously in the long term [7]. To find these objects, a series of heap histograms (more than three) is captured at run time.
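
One way to turn such a series of histograms into a per-class growth score is a least-squares slope of the instance count against the snapshot index. The sketch below is an illustration under stated assumptions, not the article's exact formula: it assumes the rank of a class is simply that slope (instances gained per snapshot), that the per-class counts have already been extracted from the histograms (for example from the output of jmap -histo), and that the names GrowthRank, slope, and candidates are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: scores classes by the growth of their instance counts. */
class GrowthRank {

    /** Least-squares slope of counts against the snapshot index 0..n-1. */
    static double slope(long[] counts) {
        int n = counts.length;
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (long c : counts) {
            meanY += c;
        }
        meanY /= n;
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (i - meanX) * (counts[i] - meanY);
            sxx += (i - meanX) * (i - meanX);
        }
        return sxx == 0 ? 0 : sxy / sxx; // fewer than two snapshots: no trend
    }

    /** Returns the classes with at least three snapshots whose slope exceeds the threshold. */
    static List<String> candidates(Map<String, long[]> countsByClass, double threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, long[]> e : countsByClass.entrySet()) {
            long[] counts = e.getValue();
            if (counts.length >= 3 && slope(counts) > threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative counts for two made-up classes over four snapshots.
        Map<String, long[]> series = new LinkedHashMap<>();
        series.put("com.example.Session", new long[] {100, 200, 300, 400});
        series.put("com.example.Request", new long[] {500, 480, 510, 495});
        System.out.println(candidates(series, 50.0));
    }
}
```

Raising the threshold reports fewer classes and lowers the false-positive rate; lowering it does the opposite.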
Acquiring a heap histogram is usually a much more lightweight operation than acquiring a full heap dump. The histogram contains the number of live instances of each class. The growth trend of each class is calculated as a rank using the least squares method [9]. Classes above a rank threshold (R_thres) are reported as leaking. The advantage of this method is that the false-positive and false-negative rates can be controlled by choosing a moderate threshold. In addition, a set of leak-related classes is reported, not just one or two dominating objects, which makes further analysis easier. Lastly, acquiring heap histograms and calculating the rank is quite easy and can be done automatically in real time.

3.3 Finding Allocation Sites

The goal is to find the head of a data structure that is leaking in one or more ways. For this step a heap dump is necessary, because the reference relationships between objects must be known. The key idea of the algorithm is to traverse the object graph until it finds the boundary of the leak candidates detected in the previous step. The process can start with any class from the candidate set. If the classes that reference that leak candidate are not inside the candidate set, they are the boundary classes. Otherwise, mark these classes as visited and repeat the process on them until the boundary classes are reached. Sometimes the boundary classes are low-level types, and the search should backtrack further until it encounters a high-level type with a meaningful package prefix. For example, many memory leak bugs are caused by misuse of containers. However, a detected boundary class such as java.util.HashMap tells the programmer little about where the bug is. In contrast, classes with the com.documentum package prefix are more informative, because the programmer can go directly to the class source code.

4. Example

The proposed method is evaluated with a real bug from JIRA. From the GC log, one can easily spot the memory leak, as shown in the following figure. To diagnose the problem with the proposed technique, ten heap histogram snapshots were collected while running the leaking operations for a few hours. The following table lists the number of detected leak candidates with a moderate threshold (R_thres = 100). In this case, even a small number of snapshots generates a quite selective report. With six snapshots, 42 classes in total are reported as leak candidates.

Number of snapshots     3    4    5    6    10
Number of candidates    47   45   45   42   42

The next step is to identify the leaking data structure. Starting from a randomly selected candidate and backtracking through the reference graph, the boundary class java.util.concurrent.ConcurrentHashMap$HashEntry[] was detected within a few minutes. The class name indicates little about the cause of the leak, so the search continued further up the reference graph. In a short time, a class with a meaningful name, com.emc.documentum.fs.rt.context.impl.simplefixeddelaycleanupcache, was found to hold these unnecessary objects. This information gives the developer the details needed to identify the root cause and fix the leak.
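
The boundary search of Section 3.3 can be sketched as a breadth-first walk over a "referenced by" relation. This is a simplified illustration under several assumptions: the heap dump has already been summarized into a map from each class name to the class names that hold references to its instances, the names BoundarySearch, boundary, and liftToPrefix are hypothetical, and a string prefix test stands in for the notion of a meaningful package prefix.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the boundary search over a class-level reference graph. */
class BoundarySearch {

    /** Walks referrers from a start candidate and collects the first classes outside the candidate set. */
    static Set<String> boundary(String start, Set<String> candidates,
                                Map<String, Set<String>> referrers) {
        Set<String> boundary = new LinkedHashSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        visited.add(start);
        while (!queue.isEmpty()) {
            String cls = queue.remove();
            for (String ref : referrers.getOrDefault(cls, Collections.emptySet())) {
                if (!candidates.contains(ref)) {
                    boundary.add(ref);          // referrer outside the set: a boundary class
                } else if (visited.add(ref)) {
                    queue.add(ref);             // unvisited candidate: keep walking
                }
            }
        }
        return boundary;
    }

    /** Backtracks from low-level boundary classes until classes with the given prefix are reached. */
    static Set<String> liftToPrefix(Set<String> boundary, String prefix,
                                    Map<String, Set<String>> referrers) {
        Set<String> owners = new LinkedHashSet<>();
        Set<String> visited = new HashSet<>(boundary);
        Deque<String> queue = new ArrayDeque<>(boundary);
        while (!queue.isEmpty()) {
            String cls = queue.remove();
            if (cls.startsWith(prefix)) {
                owners.add(cls);                // informative class: stop here
                continue;
            }
            for (String ref : referrers.getOrDefault(cls, Collections.emptySet())) {
                if (visited.add(ref)) {
                    queue.add(ref);
                }
            }
        }
        return owners;
    }
}
```

With a made-up graph in which com.example.Payload is referenced by com.example.Entry (both candidates), Entry by java.util.HashMap, and HashMap by com.example.SessionCache, the first method returns java.util.HashMap and the second lifts it to com.example.SessionCache for the prefix "com.example.".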

5. Conclusion

Memory leaks in Java are a serious issue that degrades system performance and may even crash the server. Detecting and diagnosing memory leaks is one of the duties in the performance test cycle, especially when running longevity tests. This article introduces an easy and repeatable method. The method generally has very low overhead and is demonstrated to be effective on an example from a real product. I think this approach is useful when diagnosing memory issues that cause performance problems.

6. References

[1] Nick Mitchell and Gary Sevitsky. LeakBot: An Automated and Lightweight Tool for Diagnosing Memory Leaks in Large Java Applications.
[2] Sun JDK Troubleshooting Tool. http://docs.oracle.com/javase/
[3] YourKit. http://www.yourkit.com/
[4] IBM HeapAnalyzer. https://www.ibm.com/developerworks/mydeveloperworks/groups/service/html/communityview?communityuuid=4544bafe-c7a2-455f-9d43-eb866ea60091
[5] Memory Analyzer (MAT). http://eclipse.org/mat/
[6] How to Fix Memory Leaks in Java. http://olex.openlogic.com/wazi/2009/how-to-fix-memory-leaks-in-java/
[7] Maria Jump and Kathryn S. McKinley. Cork: Dynamic Memory Leak Detection for Java.
[8] Wim De Pauw and Gary Sevitsky. Visualizing Reference Patterns for Solving Memory Leaks in Java.
[9] Least squares, regression analysis and statistics. http://en.wikipedia.org/wiki/least_squares#cite_note-brertscher-0