IBM Software Group Java Garbage Collection Best Practices for Sizing and Tuning the Java Heap Chris Bailey WebSphere Support Technical Exchange
Objectives
  Overview
  Selecting the Correct GC Policy
  Sizing the Java heap
  Questions/Answers
Garbage Collection Performance
  GC performance issues can take many forms
  The definition of a performance problem is user centric
  The user requirement may be for:
    Very short GC pause times
    Maximum throughput
    A balance of both
  First step is to ensure that the correct GC policy has been selected for the workload type
    It is helpful to have an understanding of GC mechanisms
  Second step is to ensure heap sizing is correct
  Third step is to look for specific performance issues
Selecting the Correct GC Policy
Understanding Garbage Collection
  Responsible for allocation and freeing of:
    Java objects, array objects and Java classes
  Allocates objects using a contiguous section of Java heap
  Ensures the object remains as long as it is in use or live
    Determination is based on a reference from another live object or from outside of the heap
  Reclaims objects that are no longer referenced
  Ensures that any finalize method is run before the object is reclaimed
Object Allocation
  Requires a contiguous area of Java heap
  Driven by requests from:
    The Java application
    JNI code
  Most allocations take place in Thread Local Heaps (TLHs)
    Threads reserve a chunk of free heap to allocate from
    Reduces contention on the allocation lock
    Keeps code running in a straight line (fewer failures)
    Meant to be fast
    Available for objects < 512 bytes in size
  Larger allocations take place under a global heap lock
    These allocations are one-time costs (out-of-line allocate)
    Multiple threads allocating larger objects at the same time will contend
Object Reclamation (Garbage Collection)
  Occurs under two scenarios:
    An allocation failure
      An object allocation is requested and not enough contiguous memory is available
    A programmatically requested garbage collection cycle
      A call is made to System.gc() or Runtime.gc()
      The Distributed Garbage Collector is running
      A call to JVMPI/JVMTI is made
  Two main technologies are used to remove the garbage:
    Mark Sweep Collector
    Copy Collector
  IBM uses a mark sweep collector, or a combination for generational
Global Collection Policies
  Garbage Collection can be broken down into 2 (3) steps
    Mark: Find all live objects in the system
    Sweep: Reclaim unused heap memory to the free list
    Compact: Reduce fragmentation within the free list
  All steps are in a single stop-the-world (STW) phase
    Application pauses whilst garbage collection is done
  Each step is performed as a parallel task within itself
  Four GC policies, optimized for different scenarios
    -Xgcpolicy:optthruput   optimized for batch type applications
    -Xgcpolicy:optavgpause  optimized for applications with responsiveness criteria
    -Xgcpolicy:gencon       optimized for highly transactional workloads
    -Xgcpolicy:subpool      optimized for large systems with allocation contention
Parallel GC (optthruput)
  Parallel Mark Sweep Collector, with compaction avoidance
  Created to make use of additional processors on server systems
    Designed to increase performance for SMP and not degrade performance for uni-processor systems
  Optimized for throughput
    Best policy for batch type applications
  Consists of a single flat Java heap:
    [Figure: flat heap spanning 0 GB to 2 GB and containing an LOA; markers for Heap Base, Heap Size and Heap Limit]
GC Helper Threads
  Parallelism is achieved through the use of GC helper threads
    Parked set of threads that wake to share GC work
    Main GC thread generates the root set of objects
    Helper threads share the work for the rest of the phases
  Number of helpers is one less than the number of processing units
    So helper threads plus the main GC thread equal the number of processing units
  Configurable using -Xgcthreads
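As a minimal sketch of the helper-thread arithmetic above, the default helper count can be estimated from the processor count the JVM reports. The class and method names are hypothetical and are not part of any IBM API; the actual count is chosen by the JVM and can be overridden with -Xgcthreads.

```java
/** Hypothetical helper illustrating the default GC helper thread count described above. */
final class GcHelperThreads {

    /** One helper fewer than the processing units, so helpers plus the main GC thread match the units. */
    static int defaultHelperCount(int processingUnits) {
        if (processingUnits < 1) {
            throw new IllegalArgumentException("processingUnits must be at least 1");
        }
        return processingUnits - 1;
    }

    public static void main(String[] args) {
        int units = Runtime.getRuntime().availableProcessors();
        System.out.println("Processing units reported by the JVM: " + units);
        System.out.println("Estimated default GC helper threads: " + defaultHelperCount(units));
    }
}
```

On a uni-processor system the estimate is zero helpers, which matches the statement that the main GC thread does all of the work there.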
[Figure: Parallel Mark/Parallel Sweep view of GC]
Concurrent GC (optavgpause)
  Reduces, and makes more consistent, the time spent inside Stop the World GC
    Reduction usually between 90 and 95%
  Achieved by carrying out some of the STW work whilst the application is running
    1.4.2: Concurrent Marking
    5.0: Concurrent Marking and Concurrent Sweeping
  Slight overhead on throughput for greatly reduced STW times
  Policy is ideal for systems with responsiveness criteria
    e.g. Portal applications
[Figure: Parallel and Concurrent Mark/Sweep, with the Concurrent Kickoff point marked]
Concurrent Mark hidden object issue
  [Figure: object graph during concurrent mark, annotated "Dangling pointer!"]
  Higher heap usage because not all garbage is removed
Generational and Concurrent GC (gencon)
  Similar in concept to that used by Sun and HP
    Parallel copy and concurrent global collects by default
  Motivation: objects die young, so focus collection efforts on recently created objects
    Divide the heap up into two areas: new and old
    Perform allocates from the new area
    Collections focus on the new area
    Objects that survive a number of collects in the new area are promoted to the old area (tenured)
  [Figure: heap spanning 0 GB to 2 GB, divided into Nursery (new) Space, made up of Allocate and Survivor, and Tenured (old) Space with an LOA; markers for Heap Base, Heap Size and Heap Limit]
  Ideal for transactional and high data throughput workloads
Nursery (new) Space Copy Collection
  [Figure: Nursery/Young Generation shown before and after a collection, with the Allocate Space and Survivor Space swapping roles]
  Nursery is split into two spaces (semi-spaces)
    Only one contains live objects and is available for allocation
    Minor collections (scavenges) move objects between spaces
    Role of spaces is reversed
  Movement results in implicit compaction
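The semi-space behaviour described above can be sketched with a toy model. This is an illustration only, under loud assumptions: liveness is a plain flag set by the caller (a real collector traces references from roots), spaces are lists rather than memory ranges, and the class name, method names and `tenureAge` threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a semi-space nursery; not how the JVM is implemented. */
final class SemiSpaceSketch {

    static final class Obj {
        final String name;
        boolean live = true; // assumption: liveness is given, not traced
        int age = 0;         // number of scavenges survived
        Obj(String name) { this.name = name; }
    }

    private List<Obj> allocateSpace = new ArrayList<>();
    private List<Obj> survivorSpace = new ArrayList<>();
    private final List<Obj> tenured = new ArrayList<>();
    private final int tenureAge; // hypothetical promotion threshold

    SemiSpaceSketch(int tenureAge) { this.tenureAge = tenureAge; }

    /** New objects are always created in the allocate space. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        allocateSpace.add(o);
        return o;
    }

    /** Minor collection: copy live objects out of the allocate space, then reverse the roles. */
    void scavenge() {
        for (Obj o : allocateSpace) {
            if (!o.live) {
                continue;              // garbage is never copied
            }
            o.age++;
            if (o.age >= tenureAge) {
                tenured.add(o);        // promoted to the old area
            } else {
                survivorSpace.add(o);  // copied side by side: implicit compaction
            }
        }
        allocateSpace.clear();         // the whole old allocate space is now free
        List<Obj> tmp = allocateSpace; // swap the roles of the two spaces
        allocateSpace = survivorSpace;
        survivorSpace = tmp;
    }

    int nurseryCount() { return allocateSpace.size(); }
    int tenuredCount() { return tenured.size(); }
}
```

Note that the cost of a scavenge in this model depends on the number of live objects copied, not on the amount of garbage, which is why low survival rates keep nursery collections cheap.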
Subpooling (subpool)
  Goals:
    Reduce allocation lock contention by distributing free memory into multiple lists
    Reduce allocation contention through use of atomic operations instead of a heap lock
    Prevent premature garbage collections by using a best fit (or closer to best fit) policy instead of address ordered
  Ideal for very large SMP systems where large amounts of data are being allocated and where there is heap lock contention
Looking for Heap Lock Contention
  All locks can be profiled using Java Lock Analyzer (JLA)
    http://www.alphaworks.ibm.com/tech/jla (AlphaWorks)
  Provides time accounting and contention statistics for Java and JVM locks
  Functionality includes:
    Counters associated with contended locks
      Total number of successful acquires
      Recursive acquires: the number of times a thread acquires a lock it already owns
      Number of times a thread blocks because a monitor is already owned
      Cumulative time the monitor was held
JLA Sample Report

System (Registered) Monitors
%MISS   GETS NONREC  SLOW    REC  TIER2 TIER3 %UTIL AVER-HTM MON-NAME
   87   5273   5273  4572      0 710708 18487     1    95408 JITC Global_Compile lock
    9   6870   6869   631      1 113420  2976     0    11807 Heap lock
    5   1123   1123    51      0  11098   286     1   248385 Binclass lock
    0   1153   1147     5      6   1307    33     0    47974 Monitor Cache lock
    0  46149  45877   134    272  36961   877     1     6558 JITC CHA lock
    0  33734  23483    19  10251   6544   150     1    17083 Thread queue lock
    0      5      5     0      0      0     0     0  9309689 JNI Global Reference lock
    0      5      5     0      0      0     0     0  9283000 JNI Pinning lock
    0      5      5     0      0      0     0     0  9442968 Sleep lock
    0      1      1     0      0      0     0     0        0 Monitor Registry lock
    0      0      0     0      0      0     0     0        0 Evacuation Region lock
    0      0      0     0      0      0     0     0        0 Method trace lock
    0      0      0     0      0      0     0     0        0 Classloader lock
    0      0      0     0      0      0     0     0        0 Heap Promotion lock

Java (Inflated) Monitors
%MISS   GETS NONREC  SLOW    REC  TIER2 TIER3 %UTIL AVER-HTM MON-NAME
   15     68     68    10      0   2204    56     2 11936405 test.lock.testlock1@a09410/a09418
    2     42     42     1      0    186     5     0   300478 test.lock.testlock2@d31358/d31360
    0     70     70     0      0     41     1     0     7617 java.lang.ref.referencequeue$lock@920628/920630
JLA: Fields in the report
Choosing the Right GC Policy
  Four GC policies, optimized for different scenarios
    -Xgcpolicy:optthruput   optimized for batch type applications
    -Xgcpolicy:optavgpause  optimized for applications with responsiveness criteria
    -Xgcpolicy:gencon       optimized for highly transactional workloads
    -Xgcpolicy:subpool      optimized for large systems with allocation contention
  How do I know whether to use optavgpause or gencon?
    Monitor GC activity
    Look for certain characteristics
Monitoring GC Activity
  Verbose GC logging provides the only data that is required for GC performance tuning
    Graph verbose GC output using GC and Memory Visualizer (GCMV) from ISA
  Activated using command line options
    -verbose:gc
    -Xverbosegclog:[DIR_PATH][FILE_NAME],X,Y
      where:
        [DIR_PATH] is the directory where the file should be written
        [FILE_NAME] is the name of the file to write the logging to
        X is the number of files
        Y is the number of GC cycles a file should contain
  Performance cost:
    (Very) basic testing shows a 2% overhead for a GC duration of 200ms
    e.g. if application GC overhead is 5%, it would become 5.1%
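The 5% to 5.1% example follows from applying the 2% logging cost to the time spent in GC rather than to the whole application. A minimal sketch of that arithmetic, with hypothetical class and method names:

```java
/** Hypothetical illustration of the verbose GC logging cost quoted above. */
final class VerboseGcCost {

    /**
     * @param gcOverheadPercent   share of run time spent in GC without logging, in percent
     * @param relativeLoggingCost extra cost added to each GC, as a fraction (0.02 for 2%)
     * @return share of run time spent in GC with logging enabled, in percent
     */
    static double overheadWithLogging(double gcOverheadPercent, double relativeLoggingCost) {
        return gcOverheadPercent * (1.0 + relativeLoggingCost);
    }

    public static void main(String[] args) {
        // 2% of a 5% GC overhead adds 0.1 percentage points.
        System.out.println(overheadWithLogging(5.0, 0.02));
    }
}
```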
Important Characteristics for Choosing GC Policy
  Rate of garbage collection
    High rates of object burn point to large numbers of transitional objects, and therefore the application may well benefit from the use of gencon
  Large object allocations?
    The allocation of very large objects adversely affects gencon unless the nursery is sufficiently large. The application may well benefit from optavgpause
  Large heap usage variations
    The optavgpause algorithms are best suited to consistent allocation profiles
    Where large variations occur, gencon may be better suited
  Rule of thumb: if GC overhead is > 10%, you've most likely chosen the wrong one
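One way to get a rough reading of GC overhead from inside a running JVM is the standard java.lang.management API. The sketch below is an approximation, not the method used in this presentation: it divides accumulated collection time by JVM uptime, and what each collector counts as collection time varies by JVM and policy, so verbose GC output remains the more reliable source. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Rough GC overhead check based on the 10% rule of thumb above. */
final class GcOverheadCheck {

    /** Sums the approximate accumulated collection time across all collectors, in milliseconds. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time as a percentage of elapsed time. */
    static double overheadPercent(long gcTimeMs, long uptimeMs) {
        if (uptimeMs <= 0) {
            throw new IllegalArgumentException("uptimeMs must be positive");
        }
        return 100.0 * gcTimeMs / uptimeMs;
    }

    /** Rule of thumb from the slide: above 10% suggests the wrong policy. */
    static boolean policyLooksWrong(double overheadPercent) {
        return overheadPercent > 10.0;
    }

    public static void main(String[] args) {
        long uptimeMs = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        double overhead = overheadPercent(totalGcTimeMs(), uptimeMs);
        System.out.println("Approximate GC overhead (%): " + overhead);
        System.out.println("Over the 10% rule of thumb: " + policyLooksWrong(overhead));
    }
}
```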
Rate of Garbage Collection
  [Figure: GC activity graphs for optavgpause and gencon]
  Gencon could handle a higher rate of garbage collection
    Completing the test quicker
  Gencon had a smaller percentage of time in garbage collection
  Gencon had a shorter maximum pause time
Rate of Garbage Collection
  Gencon provides less frequent long garbage collection cycles
  Gencon provides a shorter longest garbage collection cycle
Large Object Allocations
  (Very) large object allocations affect the gencon GC policy
    If an object is larger than the nursery size, the object is immediately tenured
      Removes the benefit of generational heaps
      Still has the additional overhead of running generational
    If an object fits in the nursery but fills it, frequent nursery collects will have to occur
      Too frequent nursery collects mean objects are likely to survive and need copying
      Copying is an expensive process
  If (very) large objects are being used, a sufficiently large nursery is required
Sizing the Java Heap
Sizing the Java Heap
  Maximum possible Java heap sizes
  The correct Java heap size
  Fixed heap sizes vs. variable heap sizes
  Heap sizing for generational GC
Maximum Possible Heap Size
  32 bit Java processes have a maximum possible heap size
    Varies according to the OS and platform used
    Determined by the process memory layout
  64 bit processes do not have this limit
    A limit exists, but is so large it can be effectively ignored
    Addressability usually between 2^44 and 2^64
      Which is 16+ terabytes
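The address-space figures can be checked directly: 2^32 bytes is 4 GB and 2^44 bytes is 16 TB (using binary units). A small sketch with hypothetical names, using BigInteger because 2^64 does not fit in a Java long:

```java
import java.math.BigInteger;

/** Hypothetical helper converting an address width into an addressable size. */
final class AddressSpace {

    /** Number of addressable bytes for the given address width in bits. */
    static BigInteger bytes(int addressBits) {
        return BigInteger.ONE.shiftLeft(addressBits);
    }

    /** Addressable size in binary gigabytes (2^30 bytes). */
    static long gigabytes(int addressBits) {
        return bytes(addressBits).shiftRight(30).longValueExact();
    }

    /** Addressable size in binary terabytes (2^40 bytes). */
    static long terabytes(int addressBits) {
        return bytes(addressBits).shiftRight(40).longValueExact();
    }

    public static void main(String[] args) {
        System.out.println("32 bit: " + gigabytes(32) + " GB");
        System.out.println("44 bit: " + terabytes(44) + " TB");
        System.out.println("64 bit: " + terabytes(64) + " TB");
    }
}
```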
Java Process Memory Layout
  An operating system process like any other application:
    Subject to OS and architecture restrictions
    32bit architecture has an addressable range of:
      2^32, which is 0x00000000 to 0xFFFFFFFF
      which is 4GB
  Not all addressable space is available to the application
    The operating system needs memory for:
      The kernel
      The runtime support libraries
    Varies according to operating system
      How much memory is needed and where that memory is located
  [Figure: 4 GB address space marked at 0 GB (0x0), 1 GB (0x40000000), 2 GB (0x80000000), 3 GB (0xC0000000) and 4 GB (0xFFFFFFFF)]
Memory Available to the Java Process
  [Figure: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 0x40000000, 0x80000000 and 0xC0000000]
    On Windows: regions labelled Operating System Space and Libraries
    On AIX: regions labelled Kernel and Libraries
Java Process Restrictions
  Not all Java process space is available to the Java application
    The Java runtime needs memory for:
      The Java Virtual Machine
      Backing resources for some Java objects
  This memory area, as well as some other allocations, is part of the native heap
  Memory not allocated to the Java heap is available to the native heap
    Available memory space - Java heap = native heap
  Effectively, the Java process maintains two memory pools
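The two-pool relationship is a simple subtraction. In the sketch below the class name is hypothetical, and the figures used in `main` (2048 MB of usable process space and a 1536 MB Java heap) are assumptions chosen for illustration, not values taken from this presentation.

```java
/** Hypothetical illustration: space left for the native heap once the Java heap is reserved. */
final class NativeHeapBudget {

    static long nativeHeapMb(long availableProcessSpaceMb, long javaHeapMb) {
        if (javaHeapMb < 0 || availableProcessSpaceMb < 0) {
            throw new IllegalArgumentException("sizes must not be negative");
        }
        if (javaHeapMb > availableProcessSpaceMb) {
            throw new IllegalArgumentException("Java heap does not fit in the available space");
        }
        return availableProcessSpaceMb - javaHeapMb;
    }

    public static void main(String[] args) {
        // Assumed example: every MB added to the Java heap is taken from the native heap.
        System.out.println(nativeHeapMb(2048, 1536) + " MB left for the native heap");
    }
}
```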
The Native Heap
  Allocated using malloc() and therefore subject to memory management by the OS
  Used for Virtual Machine resources, e.g.:
    Execution engine
    Class loader
    Garbage collector infrastructure
  Used to underpin Java objects:
    Threads, classes, AWT objects, ZipFiles
  Used for allocations by JNI code
Native Heap available to Application
  [Figure: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 0x40000000, 0x80000000 and 0xC0000000]
    On Windows: regions labelled Java Heap, Native Heap, VM Resources, Libraries and Operating System Space
    On AIX (1.4.2 with small heaps): regions labelled Kernel, Java Heap, Native Heap, VM Resources and Libraries
Layout with Large Java Heaps on AIX
  Applies to heaps > 1GB in size and Java 5.0
  Java heap becomes allocated using mmap()
    Segments used start at 0xC and work downwards
    Understanding the memory layout is important for monitoring
  [Figure: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), with segments 0x3, 0x7 and 0xD marked; regions labelled Kernel, Native Heap, Java Heap, VM Resources and Libraries]
Memory Layout for Linux
  [Figure, Linux: address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 0x40000000, 0x80000000 and 0xC0000000; regions labelled Java Heap, VM Resources, Native Heap and Kernel; TASK_SIZE and PAGE_OFFSET marked]
  [Figure, z/OS: address space from 0 GB (0x0) to 2 GB (0x7FFFFFFF), marked at 0x40000000; regions labelled Java Heap and VM Resources]
Theoretical and Advised Max Heap Sizes
  The larger the Java heap, the more constrained the native heap
  Advised limits to prevent the native heap from becoming overly restricted, leading to OutOfMemoryErrors

  Platform  Additional Options  Maximum Possible  Advised Maximum
  AIX       automatic           3.25 GB           2.5 GB
  Linux     (none)              2 GB              1.5 GB
  Linux     Hugemem Kernel      3 GB              2.5 GB
  Windows   (none)              1.8 GB            1.5 GB
  Windows   /3GB                1.8 GB            1.8 GB
  z/OS      (none)              1.7 GB            1.3 GB

  Exceeding advised limits is possible, but should be done only when native heap usage is understood
  Native heap usage can be measured using OS tools:
    svmon (AIX), PerfMon (Windows), RMF (z/OS) etc
Moving to 64bit
  Moving to 64bit removes the Java heap size limit
  However, the ability to use more memory is not free
    64bit applications perform slower
      More data has to be manipulated
      Cache performance is reduced
    64bit applications require more memory
      Java object references are larger
      Internal pointers are larger
  Major improvements to this in Java 6.0 due to compressed pointers
The correct Java heap size
  GC will adapt heap size to keep occupancy between 40% and 70%
    Heap occupancy over 70% causes frequent GC cycles
      Which generally means reduced performance
    Heap occupancy below 40% means infrequent GC cycles, but cycles longer than they need to be
      Which means longer pause times than necessary
      Which generally means reduced performance
  The maximum heap size setting should therefore be 43% larger than the maximum occupancy of the application
    Maximum occupancy + 43% means occupancy at 70% of total heap
    e.g. for 70MB occupancy, a 100MB max heap is required, which is 70MB + 43% of 70MB
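The 43% figure comes from dividing the peak occupancy by 0.70, since 1 / 0.70 is roughly 1.43. A minimal sketch of the calculation, with hypothetical names and integer arithmetic so the result rounds up to a whole MB:

```java
/** Hypothetical helper: smallest max heap that keeps peak occupancy at or below 70%. */
final class HeapSizing {

    /** Returns ceil(peakOccupancyMb / 0.70) using integer arithmetic. */
    static long requiredMaxHeapMb(long peakOccupancyMb) {
        if (peakOccupancyMb < 0) {
            throw new IllegalArgumentException("occupancy must not be negative");
        }
        return (peakOccupancyMb * 100 + 69) / 70;
    }

    public static void main(String[] args) {
        // The example from the slide: 70MB occupancy needs a 100MB maximum heap.
        System.out.println(requiredMaxHeapMb(70) + " MB");
    }
}
```

The peak occupancy to feed in is the amount of live data after a global collection at the application's busiest point, which is what verbose GC output reports.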
The correct Java heap size
  [Figure: heap occupancy plotted as memory against time, relative to the heap size; occupancy above 70% is labelled "Too Frequent Garbage Collection" and occupancy below 40% is labelled "Long Garbage Collection Cycles"]
Fixed heap sizes vs. Variable heap sizes
  Should the heap size be fixed?
    i.e. minimum heap size (-Xms) = maximum heap size (-Xmx)?
  Each option has advantages and disadvantages
    As for most performance tuning, you must select which is right for the particular application
  Variable heap sizes
    GC will adapt heap size to keep occupancy between 40% and 70%
    Expands and shrinks the Java heap
    Allows for scenarios where usage varies over time
      Where variations would take usage outside of the 40-70% window
  Fixed heap sizes
    Does not expand or shrink the Java heap
Heap Expansion and Shrinkage
  The act of heap expansion and shrinkage is relatively cheap
  However, a compaction of the Java heap is sometimes required
    Expansion: for some expansions, GC may have already compacted to try to allocate the object before expansion
    Shrinkage: GC may need to compact to move objects from the area of the heap being shrunk
  Whilst expansion and shrinkage optimize heap occupancy, they (usually) do so at the cost of compaction cycles
Conditions for Heap Expansion
  Not enough free space available for object allocation after GC has completed
    Occurs after a compaction cycle
    Typically occurs where there is fragmentation or during rapid occupancy growth (e.g. application startup)
  Heap occupancy is over 70%
    Compaction unlikely
  More than 13% of time is spent in GC
    Compaction unlikely
Conditions for Heap Shrinkage
  Heap occupancy is under 40%
  And the following is not true:
    Heap has been recently expanded (last 3 cycles)
    GC is a result of a System.gc() call
  Compaction occurs if:
    An object exists in the area being shrunk
    GC did not shrink on the previous cycle
  Compaction is therefore likely to occur
Introduction to -Xmaxf and -Xminf
  The -Xmaxf and -Xminf settings control the 40% and 70% occupancy bounds
    -Xmaxf: the maximum heap space free before shrinkage (default is 0.6 for 40%)
    -Xminf: the minimum heap space free before expansion (default is 0.3 for 70%)
  Can be used to move the optimum occupancy window if required by the application
    e.g. lower heap utilization required for more infrequent GC cycles
  Can be used to prevent shrinkage
    -Xmaxf1.0 would mean shrinkage only when the heap is 100% free
    Would completely remove shrinkage capability
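The way the two free-space bounds define the occupancy window can be sketched as below. This is a simplified model with hypothetical names: it looks only at the free fraction and ignores the other conditions covered earlier (time spent in GC, recent expansions, System.gc() calls), so it is not the JVM's actual resizing logic.

```java
/** Simplified model of the -Xminf / -Xmaxf occupancy window. */
final class HeapResizeSketch {

    enum Action { EXPAND, SHRINK, KEEP }

    /**
     * @param minf minimum free fraction before expansion (-Xminf, default 0.3)
     * @param maxf maximum free fraction before shrinkage (-Xmaxf, default 0.6)
     */
    static Action decide(long heapBytes, long usedBytes, double minf, double maxf) {
        if (heapBytes <= 0 || usedBytes < 0 || usedBytes > heapBytes) {
            throw new IllegalArgumentException("invalid heap figures");
        }
        double freeFraction = (double) (heapBytes - usedBytes) / heapBytes;
        if (freeFraction < minf) {
            return Action.EXPAND; // occupancy above 70% with the default
        }
        if (freeFraction > maxf) {
            return Action.SHRINK; // occupancy below 40% with the default
        }
        return Action.KEEP;
    }

    public static void main(String[] args) {
        System.out.println(decide(100, 80, 0.3, 0.6)); // 20% free
        System.out.println(decide(100, 30, 0.3, 0.6)); // 70% free
        System.out.println(decide(100, 30, 0.3, 1.0)); // -Xmaxf1.0: free space can never exceed 100%
    }
}
```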
Introduction to -Xmaxe and -Xmine
  The -Xmaxe and -Xmine settings control the bounds of the size of each expansion step
    -Xmaxe: the maximum amount of memory to add to the heap size in the case of expansion (default is unlimited)
    -Xmine: the minimum amount of memory to add to the heap size in the case of expansion (default is 1MB)
  Can be used to reduce/prevent compaction due to expansion
    Reduce expansions by setting a large -Xmine
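The effect of the two bounds on a single expansion step can be pictured as a clamp. How the JVM arrives at the amount it wants to add is not described here, so `wantedBytes` is simply an input in this sketch, and the class and method names are hypothetical.

```java
/** Hypothetical model of how -Xmine and -Xmaxe bound one expansion step. */
final class ExpansionStep {

    static final long DEFAULT_MINE = 1L << 20;       // 1MB default minimum
    static final long DEFAULT_MAXE = Long.MAX_VALUE; // stands in for "unlimited"

    /** Clamps the amount the collector would like to add into the range [mine, maxe]. */
    static long clamp(long wantedBytes, long mine, long maxe) {
        if (mine < 0 || mine > maxe) {
            throw new IllegalArgumentException("require 0 <= mine <= maxe");
        }
        return Math.max(mine, Math.min(wantedBytes, maxe));
    }

    public static void main(String[] args) {
        // A small request is rounded up to the minimum step.
        System.out.println(clamp(100, DEFAULT_MINE, DEFAULT_MAXE));
    }
}
```

A large -Xmine makes each step bigger, so fewer steps (and fewer of the compactions that can precede them) are needed to reach a given heap size.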
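GC Managed Heap Sizing
  [Figure: heap occupancy plotted as memory against time, with the heap size growing in expansion steps (>= -Xmine); occupancy above the -Xminf bound is labelled "Too Frequent Garbage Collection" and occupancy below the -Xmaxf bound is labelled "Long Garbage Collection Cycles"]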
Fixed or Variable?
  Again, dependent on application
    For flat memory usage, use fixed
    For widely varying memory usage, consider variable
  Variable provides more flexibility and the ability to avoid OutOfMemoryErrors
  Some of the disadvantages can be avoided:
    -Xms set to lowest steady state memory usage prevents expansion at startup
    -Xmaxf1 will remove shrinkage
    -Xminf can be used to prevent compaction before expansion
    -Xmine can be used to reduce expansions
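As an illustration of how these choices translate into launch options, the sketch below builds the argument lists that could be passed to a child JVM (for example through ProcessBuilder). The class and method names are hypothetical, the sizes are the caller's, and the `m` size suffix on -Xmine is an assumption about the value syntax that should be checked against the documentation for the JVM in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical builder for the fixed and variable heap option sets discussed above. */
final class HeapOptions {

    /** Fixed heap: minimum equals maximum, so the heap never expands or shrinks. */
    static List<String> fixedHeap(int sizeMb) {
        List<String> opts = new ArrayList<>();
        opts.add("-Xms" + sizeMb + "m");
        opts.add("-Xmx" + sizeMb + "m");
        return opts;
    }

    /** Variable heap with the mitigations listed above. */
    static List<String> variableHeap(int steadyStateMb, int maxMb, int minExpansionMb) {
        if (steadyStateMb > maxMb) {
            throw new IllegalArgumentException("-Xms must not exceed -Xmx");
        }
        List<String> opts = new ArrayList<>();
        opts.add("-Xms" + steadyStateMb + "m");    // start at steady state: no expansion at startup
        opts.add("-Xmx" + maxMb + "m");
        opts.add("-Xmaxf1");                       // remove shrinkage
        opts.add("-Xmine" + minExpansionMb + "m"); // larger steps: fewer expansions
        return opts;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", fixedHeap(1024)));
        System.out.println(String.join(" ", variableHeap(512, 1024, 64)));
    }
}
```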
Heap Sizing for Generational GC
  Options are:
    Fix both nursery and tenured space
    Allow them to expand/contract
  [Figure: heap divided into Nursery and Tenured]
  General advice:
    Fix the new space size
    Size the tenured space as you would for a flat heap
Sizing the Nursery
  Copying from Allocate to Survivor or to Tenured space is expensive
    Physical data is copied (similar to compaction, which is also expensive)
  Ideally survival rates should be as low as possible
    Less data needs to be copied
    Fewer tenured/global collects will occur
  The larger the nursery:
    the greater the time between collects
    the fewer objects that should survive
  However, the longer a copy can potentially take
  Recommendation is to have a nursery as large as possible
    Whilst not being so large that nursery collect times affect the application responsiveness
Summary
  GC policy should be chosen according to application scenario
  Java heap should ideally be sized for between 40 and 70% occupancy
  Min=Max heap size is right for some applications, but not for others
Additional Java Product Resources
  Obtain Java documentation: https://www.ibm.com/developerworks/java/jdk/docs.html
  Download the IBM Java SDKs: https://www.ibm.com/developerworks/java/jdk/index.html
  Find and download Java tooling: http://www.ibm.com/software/websphere/events_1.html
  Troubleshoot Java with the IBM Guided Activity Assistant: http://www-01.ibm.com/support/docview.wss?uid=swg27010135
  Troubleshoot Java with the Guided Troubleshooting InfoCenter: http://publib.boulder.ibm.com/infocenter/javasdk/tools/topic/com.ibm.java.doc.tools.welcome/tools/welcome/welcome.html
  Discuss IBM Java: http://www.ibm.com/developerworks/forums/forum.jspa?forumid=367
Questions and Answers