Garbage Collection in NonStop Server for Java
Technical white paper

Table of contents
1. Introduction
2. Garbage Collection Concepts
3. Garbage Collection in NSJ
4. NSJ Garbage Collection Performance
   Frequency of minor GC
   Frequency of full GC
   Collection duration
   Test Run #1: Effect of heap size on GC duration
   Test Run #2: Effect of object size on GC duration
   Test Run #3: Effect of object lifetime on GC duration
   Key Take-away
5. NSJ Garbage Collection Options
6. Guidelines for Tuning Garbage Collection
7. References

1. Introduction

This paper provides an overview of how garbage collection works in NonStop Server for Java (NSJ) and presents representative data on garbage collection performance in NSJ. The paper is organized as follows:

- Section 2 introduces some basic terms and concepts associated with garbage collection.
- Section 3 provides an overview of the garbage collection mechanism in NSJ.
- Section 4 discusses garbage collection performance on NSJ based on test results.
- Section 5 enumerates the main NSJ runtime options relevant to garbage collection.
- Section 6 discusses some high level recommendations to optimize garbage collection performance on NSJ.
- Section 7 provides references for further reading.

2. Garbage Collection Concepts

Unlike languages such as C++, where memory management is the programmer's responsibility, Java automates memory management with the help of a program called a garbage collector that runs within the Java Virtual Machine (JVM). A garbage collector is responsible for:

- Allocating memory from the Java heap [1] during object creation
- Ensuring that any objects that are reachable from references in executing code remain in the heap
- Reclaiming memory associated with objects that are no longer referenced

As discussed later, garbage collection on NSJ is a so-called stop-the-world activity. This means that application execution is briefly suspended while garbage collection is taking place. It is, therefore, desirable that the garbage collector performs its function efficiently without introducing long pauses during which application execution is suspended.

Objects in the heap that are reachable from references in executing code are said to be live. Objects that are no longer referenced are considered dead and are termed garbage.

Although programmers generally do not need detailed knowledge of how the garbage collector organizes the heap and how it allocates and frees memory within the heap, it is helpful to have a high level functional understanding of how the garbage collector operates. The next section provides such an understanding.

[1] The Java heap is part of the larger NSJ process heap. In this paper, the term heap, unless explicitly qualified, refers to the Java heap.

3. Garbage Collection in NSJ

NSJ employs a garbage collection technique known as generational collection, in which the heap is divided into generations, where each generation is a separate memory pool that holds objects of different ages. In NSJ, the heap is divided into two generations as shown in Figure 1.

Figure 1: Heap Layout

The young generation, also known as the new generation, is further subdivided into three spaces:

- Eden space: This is the area from which most [2] new objects are allocated.
- Two survivor spaces, labeled From and To: After a garbage collection of the young generation is completed, one of the two survivor spaces holds all live objects (that is, those objects that have survived this round of young generation garbage collection), while the other survivor space is empty. The role of these survivor spaces will become clear when we discuss the behavior of garbage collection in the young generation.

The old generation, also known as the tenured generation, is that part of the heap where longer lived objects are located. When an object has survived a certain number of young generation collections, that object is deemed to be long-lived and is said to be promoted from the young generation to the old generation.

For completeness, it should be mentioned that there is another generation known as the permanent generation, which is separate from the young and old generations. The permanent generation is used by the JVM to hold the classes and methods loaded by an application. Because the permanent generation is not used to hold regular application objects, its usage is not discussed in this paper.

[2] There are corner cases where an object is allocated directly from the old generation, but to keep the discussion simple, such corner cases are not discussed in this paper.

The rationale for segmenting the heap into a young and an old generation is based on the hypothesis of infant mortality. This hypothesis, which has been empirically observed, states that most Java objects die young; that is to say, most objects become garbage soon after their creation. Based on this assumption, different garbage collection algorithms are applied to the new and old generations. For example, the new generation garbage collector is presumed to operate on a space that has relatively fewer live objects than garbage objects, and so its algorithm places a premium on speed to quickly evict garbage objects. The old generation garbage collector, on the other hand, is presumed to operate on a space that has a higher proportion of live objects relative to garbage objects, and so its algorithm puts a premium on space efficiency to compact live objects after evicting the relatively few garbage objects.

Garbage collection from the new generation is called a minor collection or Scavenge, while garbage collection from the entire heap, that is, both the new and old generations, is called a major collection or full collection. (The remainder of this paper uses the term minor GC to refer to garbage collections from the young generation, and full GC to refer to collections from the entire heap.)

We now discuss the conditions under which minor GC and full GC are triggered, and the actions taken in each of these collections. Please note that this discussion does not describe the algorithms that the NSJ garbage collectors implement, but rather describes how the garbage collectors work from a functional perspective.

We start with an empty heap. As new objects are created, memory is allocated from the Eden space. When Eden fills up, and an object cannot be allocated because of lack of free memory, a minor GC is triggered.

During a minor GC the following activities occur (refer to Figure 2):

- All live objects in Eden are marked (light green arrows in Figure 2).
- All live objects in Eden are moved to one of the survivor spaces, say, the To survivor space.
- All non-marked objects in Eden are, by definition, garbage, and are removed. Note that at this time Eden is completely empty.
- Objects that have survived this minor GC and are now in the To survivor space are marked with a survival count, that is, a count of how many minor GCs a particular object has survived. In this case, all objects in the To space are marked with a survival count of 1. When the survival count exceeds a user configurable threshold value, the object is moved to the old generation. In this example, we assume that the threshold value has been set to 2.

Figure 2: During Minor GC

Figure 3: After Minor GC

Note that at the end of this minor GC, Eden and the From survivor space are empty, and all live objects are located in the To survivor space as shown in Figure 3.

In NSJ, all application threads are suspended throughout the brief duration of a minor GC and they are resumed only when the minor GC is complete. This behavior is known as stop-the-world collection.

After the minor GC completes, application threads are resumed and object creation begins again, causing Eden to fill up over time and eventually triggering another minor GC. During this second minor GC, the following activities occur (refer to Figure 4):

- All live objects in Eden and in the To survivor space are marked. (Recall that there are objects in the To space as a result of the first minor GC.)
- All live objects in the To survivor space have their survival count incremented to 2 (because they have survived 2 minor collections).
- Because no live object in the To survivor space has exceeded the threshold survival value (2 in this example), all live objects in the To space are copied to the empty From survivor space.
- All live objects in Eden are moved to the From survivor space and their survival count is set to 1.
- All non-marked objects in Eden and the To survivor space are considered garbage and are removed.

Figure 4: During Minor GC

Figure 5: After Minor GC

Note that at the end of this minor GC, Eden and the To survivor space are empty, and all live objects are located in the From survivor space as shown in Figure 5. To generalize: at the end of each minor GC, Eden and one of the survivor spaces are always empty, with all the live objects located in the other survivor space.

To continue the example, the next time a minor GC occurs, the following activities occur (refer to Figure 6):

- All live objects in Eden and in the From survivor space are marked.
- All live objects in the From survivor space have their survival count incremented.
- Those objects in the From space whose survival count equals 3 (that is, exceeds the threshold of 2) are moved to the old generation. The objects that are moved to the old generation are said to be promoted.
- The remaining live objects in the From space are moved to the To survivor space.
- All live objects in Eden are moved to the To survivor space and their survival count is set to 1.
- All non-marked objects in Eden and the From space are considered garbage and are removed.

Figure 6: During Minor GC
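
The three minor GCs walked through above can be mimicked with a small toy model. The following is only an illustrative sketch of the functional behavior described in this section, not the algorithm NSJ implements; the class and method names (MinorGcSketch, allocate, minorGc) are hypothetical, and liveness is simulated with a simple flag rather than by tracing references.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the functional minor GC behavior described in this section.
// All names are hypothetical; this is not NSJ code.
public class MinorGcSketch {

    static final class Obj {
        final String name;
        boolean live = true;      // stands in for "reachable from executing code"
        int survivalCount = 0;    // number of minor GCs survived so far
        Obj(String name) { this.name = name; }
    }

    final int threshold;                                // survival threshold (2 in the text)
    final List<Obj> eden = new ArrayList<Obj>();
    List<Obj> occupiedSurvivor = new ArrayList<Obj>();  // survivor space holding live objects
    List<Obj> emptySurvivor = new ArrayList<Obj>();     // survivor space that is currently empty
    final List<Obj> old = new ArrayList<Obj>();

    MinorGcSketch(int threshold) { this.threshold = threshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    void minorGc() {
        // Live objects from the occupied survivor space and from Eden are copied out.
        evacuate(occupiedSurvivor);
        evacuate(eden);
        // Everything left behind is garbage; both source spaces end up empty.
        eden.clear();
        occupiedSurvivor.clear();
        // The two survivor spaces swap roles for the next collection.
        List<Obj> tmp = occupiedSurvivor;
        occupiedSurvivor = emptySurvivor;
        emptySurvivor = tmp;
    }

    private void evacuate(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) {
                continue;                 // dead objects are simply not copied
            }
            o.survivalCount++;
            if (o.survivalCount > threshold) {
                old.add(o);               // promoted to the old generation
            } else {
                emptySurvivor.add(o);     // copied to the empty survivor space
            }
        }
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch(2);
        heap.allocate("a");
        Obj b = heap.allocate("b");
        b.live = false;                   // b becomes garbage before the first minor GC
        heap.minorGc();                   // a survives with count 1
        heap.allocate("c");
        heap.minorGc();                   // a has count 2, c has count 1
        heap.minorGc();                   // a has count 3 and is promoted; c has count 2
        System.out.println("old=" + heap.old.size()
                + " survivor=" + heap.occupiedSurvivor.size()
                + " eden=" + heap.eden.size());
    }
}
```

With a threshold of 2, object a is promoted on its third minor GC, which matches the walkthrough in the text.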

Figure 7: After Minor GC

As shown in Figure 7, at the end of this minor GC, the old generation contains some live objects that have been promoted, in this case, from the From survivor space.

The minor collections continue in this manner, with live objects that have exceeded the survival threshold being promoted to the old generation, and the remaining live objects being moved to whichever survivor space is empty at the time.

The continual promotion of objects to the old generation causes the old generation to become full over time, which, in turn, triggers a full GC. [3]

When a full GC occurs, the following activities occur (refer to Figure 8):

- All live objects in the old generation and young generation are marked.
- The non-marked objects in the young and old generations are, by definition, garbage objects, and are removed.
- The live objects in the old generation are then compacted to one end of the old generation.
- All live objects in the young generation are moved to the old generation.

Figure 8: During Full GC

[3] In NSJ, the sequence of events that most often triggers a full GC is as follows: An object allocation fails (because of lack of space in Eden), which triggers a minor GC, which, in turn, may cause one or more objects to be promoted to the old generation. If an object cannot be promoted because of lack of space in the old generation, a full GC is triggered.

Figure 9: After Full GC

As shown in Figure 9, a full GC removes garbage objects not only from the old generation, but also from the young generation. In addition, all live objects from the young generation are moved to the old generation, which leaves the young generation empty after a full GC.

Given that a full GC operates over the entire heap, in contrast to a minor GC which operates only over the young generation, the duration of a full GC is typically longer than that of a minor GC. On NSJ, a full GC, like a minor GC, is a stop-the-world collection, which means that for the duration of a full GC, all application threads are suspended.

This section has so far described how the garbage collector frees memory allocated to garbage objects. We now briefly describe how objects are allocated from Eden when they are first created. Recall that at the end of each minor collection, Eden is always empty, so there is a large contiguous block of memory available from which to allocate objects. Allocations from such blocks are extremely efficient using a simple bump-the-pointer technique. In this technique, the collector keeps track of the end of the last allocated object. When a new allocation request needs to be satisfied, all that needs to be done is to check whether the object will fit in Eden and, if so, return the current pointer value and advance the pointer to the end of this allocation.

4. NSJ Garbage Collection Performance

As noted, both minor GC and full GC in NSJ are stop-the-world activities. This means that collection time has the potential to impact application performance, given that application execution is suspended for the duration of a collection. This section discusses collection times that have been observed on NSJ while running the tests described later in this section.

There are two primary measures of garbage collection performance: throughput and pause time.
Throughput is the percentage of total time not spent in garbage collection, considered over long periods of time. Pause time is the time taken by an individual collection during which an application may appear unresponsive. In general, server-side applications are tuned to minimize the aggregate time spent on collections to improve application throughput, while client-side GUI applications are tuned to minimize individual collection duration to reduce application pauses.

Let us start with the observation that the aggregate time spent on collections depends on:

- The frequency with which collections (both minor GC and full GC) occur, and
- The duration of each collection.

Let us look at each determinant of aggregate collection time, starting with the frequency with which minor GC and full GC occur.

Frequency of minor GC

A minor GC occurs when Eden is full. So, the frequency of minor GC is determined by how rapidly Eden gets filled. The larger the size of Eden, the longer the time interval between minor GCs (that is, the lower the frequency). For a given Eden size, the time taken for Eden to fill up depends on the rate of object creation and the size of the created objects. Therefore, the frequency of minor GC depends on a) the application, which determines the rate at which objects are created and their size, and b) the size of Eden, which is configurable.

Frequency of full GC

A full GC occurs when the old generation is full. So, the frequency of full GC is determined by how quickly the old generation gets filled. The larger the size of the old generation, the longer the time interval between full GCs. For a given old generation size, the time taken to fill up the old generation depends on how many objects get promoted during a minor GC (which, in turn, depends on object longevity and the configurable survival threshold parameter) and the size of the promoted objects. Therefore, the frequency of full GC depends on a) the application, which determines object longevity and the size of promoted objects, and b) the size of the old generation and the survival threshold value, both of which are configurable.

Collection duration

The collection duration is largely dependent on:

- Size of generations
- Object size
- Object lifetime

Generation size: That generation size is a determinant of collection duration is obvious: the larger the size, the greater the number of objects likely to be in a generation, and so, the greater the collection duration.

Object size: Object size is a determinant of collection duration because during each collection, live objects are copied, which carries a cost. During a minor GC, live objects from one of the survivor spaces are either copied (promoted) to the old generation, or are copied to the other survivor space.
During a full GC, live objects in the old generation are copied to fill the holes left by evicted dead objects (compacted), and live objects from the young generation are copied to the old generation. So, for a given number of live objects, the collection duration is likely to increase with an increase in object size because of the copying cost involved.

Object lifetime: Object lifetime is a determinant of collection duration because live objects carry a copying cost during a collection. Take the extreme case where all objects become garbage between successive minor collections. In this situation, minor GCs would be faster than they otherwise would be because no objects need to be copied between survivor spaces, and a full collection would never occur because no object gets promoted, leaving the old generation always empty.

To show how these determinants (generation size, object size, and object lifetime) affect collection duration, the results of tests conducted specifically for this purpose are presented below. Three tests were run, each observing the effect of one determinant while holding the other two constant. All the tests were run on NSJ 6.0 on Integrity NonStop servers.

Note: In each of the test runs, the heap size, which is the sum of the young and old generation sizes, was specified by setting both the initial size and the maximum size to the same value, which prevented the heap from resizing itself at runtime. This is one of the recommendations stated in section 6. Also, in all these tests, the young generation size was configured to be one-third of the heap size (that is, half the old generation size) by setting the ratio of young generation size to old generation size to the default value of 1:2.

Test Run #1: Effect of heap size on GC duration

In this test, collection duration was observed by varying the heap size, holding the size of objects created and their lifetime constant. This test consisted of 3 separate runs with heap size set to 64 MB, 256 MB and 512 MB respectively. All objects that were created in this test were 400 bytes in size, and 1 in every 3 objects was long-lived, making them likely to be promoted to the old generation.

| Parameter | Run 1 | Run 2 | Run 3 |
|---|---|---|---|
| Heap size (young generation to old generation ratio = 1:2) | 64 MB | 256 MB | 512 MB |
| Object size | 400 bytes | 400 bytes | 400 bytes |
| Object lifetime (ratio of long-lived objects to short-lived objects) | 1:2 | 1:2 | 1:2 |
| Mean minor GC duration (seconds) | 0.02 | 0.13 | 0.20 |
| Mean full GC duration (seconds) | 0.35 | 0.53 | 0.73 |
| Ratio of minor GC frequency [4] | 2 | 1.5 | 1 |
| Ratio of full GC frequency [4] | 20 | 3 | 1 |

The observed GC duration and GC frequency results confirm the expected outcome: a larger generation size leads to a longer collection duration, but also lowers the collection frequency.

Test Run #2: Effect of object size on GC duration

In this test, collection duration was observed by varying the object size, holding heap size and object lifetime constant. This test consisted of 2 separate runs with the size of each object created set to 400 bytes and 200 KB respectively. In both runs, the heap size was set to 256 MB and 1 in every 3 objects created was long-lived.

| Parameter | Run 1 | Run 2 |
|---|---|---|
| Object size | 0.4 KB | 200 KB |
| Heap size (young generation to old generation ratio = 1:2) | 256 MB | 256 MB |
| Object lifetime (ratio of long-lived objects to short-lived objects) | 1:2 | 1:2 |
| Mean minor GC duration (seconds) | 0.13 | 0.14 |
| Mean full GC duration (seconds) | 0.53 | 1.10 |

The observed GC duration results confirm the expected positive correlation between object size and collection duration. Note the higher sensitivity of full GC duration to object size. This is a result of the cost of copying large objects, which is necessitated by the need to compact the old generation space during each full collection.

[4] This test was designed to create objects in a tight for loop, and the objects themselves performed no work. This rapid creation of objects causes the young and old generations to fill up quickly, which, in turn, causes minor and full GC to occur significantly more frequently than is typically observed in a real application. Given that the actual frequency data is not very meaningful in the context of this test, we are instead providing relative GC frequency data.
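
For readers who want to try a similar experiment, the kind of allocation loop described in footnote 4 might look roughly like the sketch below. This is not the test program used to produce the results in this paper; the class name AllocationLoopSketch, the run method, and all of its parameters are hypothetical, and a byte array is used only as an approximation of an object of a given size (the JVM adds a small header to each array). The retained objects are capped so that the sketch does not exhaust the heap.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical allocation loop in the spirit of footnote 4; not the authors' test.
public class AllocationLoopSketch {

    /**
     * Allocates totalObjects byte arrays of objectSizeBytes each in a tight loop.
     * Every longLivedEvery-th object is kept reachable (long-lived); all others
     * become garbage immediately. At most maxRetained long-lived objects are kept,
     * the oldest being released first.
     * Returns the number of objects still retained when the loop ends.
     */
    static int run(int totalObjects, int objectSizeBytes, int longLivedEvery, int maxRetained) {
        Deque<byte[]> retained = new ArrayDeque<byte[]>();
        for (int i = 0; i < totalObjects; i++) {
            byte[] obj = new byte[objectSizeBytes];   // the object itself performs no work
            if (i % longLivedEvery == 0) {
                retained.addLast(obj);                // long-lived: stays reachable
                if (retained.size() > maxRetained) {
                    retained.removeFirst();           // release the oldest retained object
                }
            }
            // Otherwise obj goes out of scope here and becomes garbage.
        }
        return retained.size();
    }

    public static void main(String[] args) {
        // 400-byte objects, 1 in every 3 long-lived, as in Test Run #1 and #2.
        int stillRetained = run(30000, 400, 3, 5000);
        System.out.println("Objects still retained: " + stillRetained);
    }
}
```

Running such a loop with the GC statistics options listed in section 5 enabled would show minor and full collections occurring far more often than in a typical application, which is why the paper reports relative rather than absolute frequencies.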

Test Run #3: Effect of object lifetime on GC duration

In this test, collection duration was observed by varying the object lifetime, holding heap size and object size constant. This test consisted of 2 separate runs where the ratio of long-lived objects to short-lived objects was set to 2:1 and 8:1 respectively. In the context of this test, short-lived objects are defined as objects that are garbage collected by the next minor GC that occurs after their creation, and long-lived objects are defined as objects that survive a minor GC and are immediately promoted to the old generation. Across both runs, the heap size was held constant at 64 KB and all objects were 256 bytes in size.

Note: To ensure that long-lived objects were immediately promoted to the old generation after surviving a minor GC, the survival threshold parameter was set to 0.

| Parameter | Run 1 | Run 2 |
|---|---|---|
| Object lifetime (ratio of long-lived to short-lived objects) | 2:1 | 8:1 |
| Heap size (young generation to old generation ratio = 1:2) | 64 KB | 64 KB |
| Object size | 256 bytes | 256 bytes |
| Mean minor GC duration (seconds) | 0.03 | 0.12 |
| Mean full GC duration (seconds) | 0.11 | 0.11 |

The 4-times increase in minor GC duration when the percentage of long-lived objects rose from 66% to 89% confirms the role of object longevity in collection time. Object lifetime did not affect the full GC duration because, in this test, the long-lived objects were mostly live for the duration of the test. As a result, a full GC had very few garbage objects to evict from the old generation, and so avoided the copying cost associated with compaction.

The main observations from the three tests are:

- Minor GC duration was less than full GC duration, in most runs significantly so. The one exception was the 8:1 run of Test Run #3 (0.12 seconds versus 0.11 seconds).
- Minor GC duration was at most 200 ms, and was as fast as 20 ms (Test Run #1, with heap size at 64 MB and young generation size at approximately 21 MB).
- Full GC duration was in the sub-second range, except in the 200 KB object run of Test Run #2 (1.10 seconds).
- Of the 3 determinants, heap size had the most effect on both minor GC and full GC duration.

Key Take-away:

The take-away from these test results is that, from the perspective of aggregate collection time (which impacts application throughput), the most important contributing factor is the frequency of collections, not the duration of collections. (The frequency of full GC is generally the critical factor because a full GC is usually more expensive than a minor GC.) For example, if minor GC, with a mean duration of 200 ms, occurs at an interval of, say, 1 minute, minor GC accounts for less than 0.4% of processing time. Similarly, if full GC, with a mean duration of 1 second, occurs at an interval of, say, 5 minutes, full GC accounts for less than 0.4% of processing time.
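
The arithmetic behind these percentages is simply mean collection duration divided by the interval between collections. The sketch below reproduces it; the class and method names are hypothetical, and it assumes the interval is measured from the start of one collection to the start of the next.

```java
import java.util.Locale;

// Worked version of the Key Take-away arithmetic; names are hypothetical.
public class GcOverheadSketch {

    /** Percentage of elapsed time spent in collections of one kind. */
    static double overheadPercent(double meanDurationSeconds, double intervalSeconds) {
        return 100.0 * meanDurationSeconds / intervalSeconds;
    }

    /** Throughput: percentage of total time not spent in garbage collection. */
    static double throughputPercent(double minorOverheadPercent, double fullOverheadPercent) {
        return 100.0 - minorOverheadPercent - fullOverheadPercent;
    }

    public static void main(String[] args) {
        double minor = overheadPercent(0.2, 60.0);    // 200 ms minor GC every minute
        double full = overheadPercent(1.0, 300.0);    // 1 s full GC every 5 minutes
        System.out.println(String.format(Locale.ROOT,
                "minor GC overhead: %.2f%%, full GC overhead: %.2f%%, throughput: %.2f%%",
                minor, full, throughputPercent(minor, full)));
    }
}
```

Both overheads come out at roughly 0.33%, consistent with the "less than 0.4%" figures above, and halving either interval doubles the corresponding overhead, which is why frequency dominates.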

5. NSJ Garbage Collection Options

The following table provides the NSJ runtime options relevant to garbage collection.

| Option | Default | Description |
|---|---|---|
| -Xmsn | 0 MB | Initial size of heap |
| -Xmxn | 64 KB | Maximum size of heap |
| -XX:MinHeapFreeRatio=min and -XX:MaxHeapFreeRatio=max | 40 (min), 70 (max) | Target range for the proportion of free space to total heap size. These are applied per generation. For example, if min is 30, and the percent of free space in a generation falls below 30%, the size of the generation is expanded so as to have 30% of the space free. Similarly, if max is 70, and the percent of free space in a generation exceeds 70%, the size of the generation is shrunk so as to have 70% of the space free. |
| -XX:NewSize=n | 2 KB | Initial size of new generation |
| -XX:NewRatio=n | 2 | Ratio between young and old generation. For example, if n=2, the ratio is 1:2 and the combined size of Eden and survivor spaces is one-third the total size of young and old generation. |
| -XX:SurvivorRatio=n | 8 | Ratio between each survivor space and Eden. For example, if n is 8, each survivor space is one-tenth of the young generation. |
| -XX:MaxTenuringThreshold=n | 15 | Number of times an object can survive a minor collection before being promoted to old generation. If n is 0, all objects that survive a minor collection are immediately promoted to the old generation. |
| -XX:+ScavengeBeforeFullGC | Enabled (+) | Perform a young generation GC before a full GC |
| -XX:-DisableExplicitGC | Disabled (-) | If option is enabled, calls to System.gc() are disabled |
| -XX:+UseSerialGC | Enabled (+) | Use serial GC for both young and old generation. Note that -XX:+UseParallelGC, -XX:+UseParallelOldGC, and -XX:+UseConcMarkSweepGC are not supported. |

The following runtime options are useful for collecting GC statistics. All of the options are disabled by default.

| Option | Description |
|---|---|
| -Xverbosegc | Outputs very detailed information at every GC. This JVM option is specific to HP supplied JVMs. |
| -XX:+PrintGC | Outputs basic information at every GC |
| -XX:+PrintGCDetails | Outputs detailed information at every GC |
| -XX:+PrintGCTimeStamps | Use with PrintGC or PrintGCDetails to output the timestamp at every GC |
| -XX:+PrintTenuringDistribution | Prints tenuring age distribution |
| -XX:+HeapDumpOnOutOfMemoryError | Dump heap to a file when java.lang.OutOfMemoryError is thrown |
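
To make the -XX:NewRatio and -XX:SurvivorRatio descriptions concrete, the sketch below derives approximate generation sizes from a heap size and the two ratios. It follows only the arithmetic implied by the option descriptions above; the class and method names are hypothetical, and an actual JVM may round or align these sizes, so the results should be read as estimates.

```java
// Approximate generation sizing from the ratio options; names are hypothetical
// and real JVM sizes may differ because of rounding and alignment.
public class GenerationSizingSketch {

    /** Young generation = heap / (NewRatio + 1), since young:old = 1:NewRatio. */
    static long youngSize(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Old generation is whatever remains of the heap. */
    static long oldSize(long heapBytes, int newRatio) {
        return heapBytes - youngSize(heapBytes, newRatio);
    }

    /** Each survivor space = young / (SurvivorRatio + 2), since Eden:survivor = SurvivorRatio:1. */
    static long survivorSize(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is the young generation minus the two survivor spaces. */
    static long edenSize(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorSize(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long heap = 64L * 1024 * 1024;      // 64 MB heap, as in Test Run #1
        int newRatio = 2;                   // default -XX:NewRatio
        int survivorRatio = 8;              // default -XX:SurvivorRatio
        long young = youngSize(heap, newRatio);
        System.out.println("young    = " + young + " bytes");
        System.out.println("old      = " + oldSize(heap, newRatio) + " bytes");
        System.out.println("eden     = " + edenSize(young, survivorRatio) + " bytes");
        System.out.println("survivor = " + survivorSize(young, survivorRatio) + " bytes each");
    }
}
```

For a 64 MB heap with the default ratios, the young generation works out to about 21 MB, which agrees with the figure quoted for Test Run #1 in section 4.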

6. Guidelines for Tuning Garbage Collection

This section provides some high level guidelines for tuning GC performance.

If you suspect that GC frequency or GC duration offers an opportunity to improve application performance, the first task is to collect GC performance metrics. Use the -Xverbosegc runtime option to get detailed GC metrics. The -Xverbosegc option can be used to output the GC data to a file, which can then be analyzed by using HPjmeter, a free tool downloadable from http://www.hp.com/go/hpjmeter.

Usually, GC performance optimization can be most easily achieved by appropriately sizing the heap, and the young and old generations. Here are some general guidelines related to sizing:

- Avoid heap resizing at runtime by using the same value for minimum heap size (-Xms) and maximum heap size (-Xmx).
- Avoid setting the heap size to be larger than the needs of your application.
- Experiment with young generation size (-XX:NewRatio or -XX:NewSize) to minimize object promotion.
- Understand the aging distribution of your objects using the -XX:+PrintTenuringDistribution option. For example, if 80% of your objects are garbage collected with a minor GC, and the remaining 20% are long-lived and are eventually promoted, experiment with setting -XX:MaxTenuringThreshold to a lower value. This will prevent the long-lived objects from being repeatedly copied between survivor spaces before being promoted to the old generation. If you set -XX:MaxTenuringThreshold to zero, any objects that survive a minor GC will be promoted straight away to the old generation. Setting -XX:MaxTenuringThreshold to 0 should be accompanied by setting -XX:SurvivorRatio to a high value to maximize the Eden space in the young generation.
- If lowering collection time is critical, experiment with setting smaller values for the young and old generation size. However, this will likely increase the GC frequency, thus impacting application throughput.

Here are some programming related guidelines that can affect GC performance:

- Avoid calling System.gc() from your application, as this triggers a full GC. A simple way to disable System.gc() calls is by setting the -XX:+DisableExplicitGC runtime option.
- Avoid implementing the finalize() method, because the garbage collector is prevented from immediately freeing the memory associated with garbage objects that have implemented the finalize() method.
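
One common way to follow the finalize() guideline is to release resources explicitly in a finally block instead of relying on the garbage collector. The sketch below illustrates the pattern only; the class NativeBuffer and its methods are hypothetical stand-ins for whatever resource an application would otherwise clean up in finalize().

```java
// Explicit cleanup instead of finalize(); NativeBuffer is a hypothetical resource.
public class ExplicitCleanupSketch {

    static final class NativeBuffer {
        private boolean released = false;

        void use() {
            if (released) {
                throw new IllegalStateException("buffer already released");
            }
            // ... work with the underlying resource would go here ...
        }

        /** Releases the resource deterministically; no finalize() is defined. */
        void close() {
            released = true;
        }

        boolean isReleased() {
            return released;
        }
    }

    static NativeBuffer process() {
        NativeBuffer buffer = new NativeBuffer();
        try {
            buffer.use();
        } finally {
            buffer.close();   // runs even if use() throws
        }
        return buffer;
    }

    public static void main(String[] args) {
        System.out.println("released: " + process().isReleased());
    }
}
```

Because NativeBuffer has no finalize() method, the garbage collector can reclaim its memory as soon as it becomes unreachable.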

7. References

1. For general background on Java garbage collection, refer to http://java.sun.com/javase/technologies/hotspot/gc/memorymanagement_whitepaper.pdf. Please note that many of the garbage collectors mentioned in that paper are not applicable on NSJ.
2. Suggestions for tuning by adjusting heap and garbage collection parameters can be found at http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html
3. Details on using the HPjmeter tool can be found at http://www.hp.com/go/hpjmeter
4. Additional considerations for sizing the heap on NonStop systems can be found in the NSJ 6.0 Programmer's Reference manual located at http://docs.hp.com/en/546595-001/546595-001.pdf

Copyright 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Java is a US trademark of Sun Microsystems, Inc.

4AA0-6150ENW, February 2010