Precise and Efficient Garbage Collection in VMKit with MMTk
LLVM Developer's Meeting
Nicolas Geoffray
nicolas.geoffray@lip6.fr
Background
- VMKit: Java and .NET on top of LLVM
  - Uses LLVM's JIT for executing code
  - Uses Boehm for GC
- Performance bottlenecks
  - No dynamic optimization
  - Conservative GC
[Figure: CPU-intensive Benchmarks (JGF). Execution time normalized to VMKit for VMKit, Jikes, and Sun on series, lufact, heapsort, crypt, fft, sor, and matmult. Platform: MacOSX 10.5, 8 x X86_32 2.66GHz, 12 GB.]
[Figure: VM-intensive Benchmarks (Dacapo). Execution time normalized to VMKit for VMKit, Jikes, and Sun on antlr, fop, luindex, bloat, jython, and pmd. Platform: MacOSX 10.5, 8 x X86_32 2.66GHz, 12 GB.]
Execution Overheads
[Figure: one pie chart per Dacapo benchmark (antlr, fop, luindex, bloat, jython, pmd) breaking execution time down into Application, Allocations, Collections, System.arraycopy, Interface calls, and Others.]
Goal: replace Boehm with MMTk
- MMTk is JikesRVM's GC
  - Framework for writing GCs
  - Multiple GC implementations (Copying, Mark and trace, Immix)
- Copying collectors require precise stack scanning
  - Locate pointers on the stack
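Why a copying collector needs precise roots can be shown with a minimal Java sketch. It is not MMTk code: the class `CopyRootsSketch`, its `Obj` type, and the `forward` field are hypothetical, and the "to-space" is simply a fresh Java object. The point is that the collector rewrites every root slot, so it must know exactly which slots hold pointers.

```java
// Hypothetical model of root updating in a copying collector.
final class CopyRootsSketch {
    static final class Obj {
        final int payload;
        Obj forward; // forwarding pointer, set once the object has been copied

        Obj(int payload) {
            this.payload = payload;
        }
    }

    // Copies every object reachable from a root slot and updates the slot in place.
    // A conservative scanner could not do this safely: a slot that only looks
    // like a pointer must never be overwritten.
    static void collect(Obj[] roots) {
        for (int i = 0; i < roots.length; i++) {
            Obj old = roots[i];
            if (old == null) {
                continue; // empty root slot
            }
            if (old.forward == null) {
                old.forward = new Obj(old.payload); // first visit: copy the object
            }
            roots[i] = old.forward; // the root now points to the copy
        }
    }
}
```

Two root slots that referred to the same object refer to the same copy afterwards, which is the job of the forwarding pointer.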
But... it's in Java?
- Yes, but nothing to be afraid of:
  - Use of Magic tricks
  - No use of runtime features (exceptions, inheritance)
  - No use of standard library
- Use VMKit's AOT compiler
  - Transform MMTk into a .bc file
Outline
- Introduction
- Precise garbage collection
- Compiling MMTk with VMJC
- Putting it all together
- What's left
Precise Garbage Collection
- Write code that locates pointers in the stack
  - llvm.gcroot in JIT-generated code
  - llvm.gcroot in VMKit's runtime written in C++
- Use LLVM's GC framework to generate stack maps
  - Caml stack maps for llvm-g++ generated code
  - JIT stack maps for JIT-generated code
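To make the role of a stack map concrete, here is a minimal Java sketch. It does not reproduce the layout LLVM or VMKit actually emit: it assumes a stack map is a table keyed by return address that lists which frame slots hold live object pointers. The names `StackMapSketch`, `register`, and `rootsAt` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: one stack map entry per call site (safe point).
final class StackMapSketch {
    // return address -> root offsets, in words relative to the frame pointer
    private final Map<Long, int[]> entries = new HashMap<>();

    void register(long returnAddress, int... rootOffsets) {
        entries.put(returnAddress, rootOffsets.clone());
    }

    // Returns the absolute stack slots that hold object pointers for one frame.
    List<Long> rootsAt(long returnAddress, long framePointer, int wordSize) {
        List<Long> slots = new ArrayList<>();
        int[] offsets = entries.get(returnAddress);
        if (offsets == null) {
            return slots; // no stack map: the frame cannot be scanned precisely
        }
        for (int offset : offsets) {
            slots.add(framePointer + (long) offset * wordSize);
        }
        return slots;
    }
}
```

A scanner would call `rootsAt` once per frame while walking the stack, passing the return address found in that frame.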
Precise Garbage Collection
[Diagram: VMKit.cpp is compiled by llvm-g++ into VMKit.exe, which carries a Caml stack map; App.java becomes the JITted App, which carries a JIT stack map. Both stack maps are used for precise stack scanning.]
Stack Scanning
- Problem: interweaving of different kinds of functions
  - Application's managed (Java or C#) functions: trusted
  - VMKit's C++ functions: trusted
  - Application's JNI functions: untrusted
- Solution: create a side-stack for frame addresses
  - Updated whenever execution enters a different kind of method
  - VMKit knows the kind of each frame on the thread stack
Types of methods
- Trusted
  - Has a stack map, so can manipulate objects (llvm.gcroot)
  - Saves the frame pointer (llvm::NoFramePointerElim)
- Untrusted
  - Has no stack map, so should not manipulate objects
  - May not save the frame pointer
Stack Scanning Example (1)
[Animated diagram: the thread stack grows through the calls below; the side-stack entry pushed on entry to a frame is shown next to it.]
- VMKit.main (C++)
- App.main (Java): push(enter Java)
- App.function (Java)
- VMKit.runtime (C++): push(enter native)
- App.function2 (Java): push(enter Java)
[Scanning then follows the frame pointers (fp) from frame to frame; four fp links connect the five frames.]
Stack Scanning Example (2)
[Animated diagram: the thread stack grows through the calls below; the side-stack entry pushed on entry to a frame is shown next to it.]
- VMKit.main (C++)
- App.main (Java): push(enter Java)
- App.function (JNI): push(enter JNI)
- App.function2 (JNI)
- VMKit.jniRuntime (Java): push(enter native)
[Scanning uses a saved fp to get past the JNI frames App.function and App.function2, and ordinary frame pointers (fp) for the remaining frames.]
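The two examples can be replayed with a small Java simulation. This is only a sketch under simplifying assumptions, not VMKit's implementation: frames are plain strings rather than frame addresses, returns are not modeled, and the very first call also pushes a side-stack entry. `SideStackSketch`, `Kind`, `call`, and `scannableFrames` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation of a side-stack that records changes of frame kind.
final class SideStackSketch {
    enum Kind { JAVA, NATIVE, JNI }

    // One side-stack entry: the kind entered and the index of its first frame.
    private static final class Transition {
        final Kind kind;
        final int firstFrame;

        Transition(Kind kind, int firstFrame) {
            this.kind = kind;
            this.firstFrame = firstFrame;
        }
    }

    private final List<String> frames = new ArrayList<>(); // oldest frame first
    private final Deque<Transition> sideStack = new ArrayDeque<>();

    // Simulates a call; a side-stack entry is pushed only when the kind changes.
    void call(String name, Kind kind) {
        if (sideStack.isEmpty() || sideStack.peek().kind != kind) {
            sideStack.push(new Transition(kind, frames.size()));
        }
        frames.add(name);
    }

    // Walks from the newest frame to the oldest and returns the trusted frames.
    // Untrusted JNI runs are skipped as a whole, as with the saved fp in example (2).
    List<String> scannableFrames() {
        List<String> result = new ArrayList<>();
        int end = frames.size();
        for (Transition t : sideStack) { // iterates from the most recent push
            if (t.kind != Kind.JNI) {
                for (int i = end - 1; i >= t.firstFrame; i--) {
                    result.add(frames.get(i));
                }
            }
            end = t.firstFrame;
        }
        return result;
    }
}
```

Replaying example (2) yields VMKit.jniRuntime, App.main, and VMKit.main; the two JNI frames never reach the scanner.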
Running the GC
- A precise GC scans the stacks at safe points: points during execution where the GC can know the type of each value on the stack
Single-threaded Application
- GC always triggered at safe points
  - gcmalloc instructions
  - Collector::collect()
Multi-threaded Application
- When entering a GC, must wait for all threads to join
  - Don't use signals, or threads may not stop at a safe point
  - Use a thread-local variable to poll on method entry and backward branches
  - Scan stacks of threads blocked in JNI or system calls
Application changes for GC
Original method:
    public static void runloop(int a) {
      while (a-- > 0)
        System.out.println("Hello World");
    }
With GC polls on method entry and on the backward branch (the poll is pseudocode):
    public static void runloop(int a) {
      if (getthreadid().dogc) GC();
      while (a-- > 0) {
        System.out.println("Hello World");
        if (getthreadid().dogc) GC();
      }
    }
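The polling scheme can be tried out in plain Java. The sketch below is an illustration under stated assumptions, not VMKit's mechanism: a single volatile flag stands in for the thread-local variable, there is one mutator thread, and the latches allow exactly one collection. `SafePointSketch`, `poll`, `mutatorLoop`, `collect`, and `demo` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical one-mutator, one-collection model of safe-point polling.
final class SafePointSketch {
    private volatile boolean doGC = false; // stands in for the thread-local flag
    private final CountDownLatch atSafePoint = new CountDownLatch(1);
    private final CountDownLatch gcDone = new CountDownLatch(1);
    private final AtomicInteger collections = new AtomicInteger();

    // The check inserted on method entry and on backward branches.
    private void poll() throws InterruptedException {
        if (doGC) {
            atSafePoint.countDown(); // tell the collector this thread has stopped
            gcDone.await();          // stay parked until the collection is over
        }
    }

    // Mutator: spins until one collection has happened, polling as it goes.
    void mutatorLoop() throws InterruptedException {
        poll(); // method entry
        while (collections.get() == 0) {
            poll(); // backward branch
        }
    }

    // Collector: request a GC, wait for the mutator to reach a safe point, then "collect".
    void collect() throws InterruptedException {
        doGC = true;
        atSafePoint.await();
        collections.incrementAndGet(); // stack scanning would happen here
        doGC = false;
        gcDone.countDown();
    }

    // Runs one mutator and one collection; returns the number of collections.
    static int demo() {
        SafePointSketch s = new SafePointSketch();
        Thread mutator = new Thread(() -> {
            try {
                s.mutatorLoop();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        mutator.start();
        try {
            s.collect();
            mutator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.collections.get();
    }
}
```

Because the mutator only ever stops inside `poll`, the collector never observes it between safe points, which is the property signals cannot guarantee.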
What is VMJC?
- An ahead-of-time (AOT) compiler
  - Generates .bc files from .class files
- Use of LLVM tools to generate platform-dependent files
  - shared library: llc -relocation-model=pic + gcc
  - executable: llc + ld vmkit + gcc
Goal: compile MMTk with VMJC
- Generate a .bc file that can be linked with VMKit
  - Interface MMTk -> VMKit (e.g. thread synchronization, stack scanning)
  - Interface VMKit -> MMTk (e.g. gcmalloc)
Why does MMTk not need a Java runtime?
- No use of runtime features
  - synchronization, exceptions, inheritance
- No use of standard library
  - HashMap, LinkedList, ArrayList
How does MMTk manipulate pointers?
- Definition of Magic classes and methods
  - Address, Word, Offset
  - Word Address.loadWord(Offset)
- Magic classes and methods translated by the compiler [VEE 09]
  - A mechanism similar to inline ASM in C
Example (Frampton [VEE 09])
Inline ASM in C:
    void prefetchobjects(OOP *buffer, int size) {
      for (int i = 0; i < size; i++) {
        OOP o = buffer[i];
        asm volatile("prefetchnta (%0)" :: "r" (o));
      }
    }
Magic in Java:
    @NoBoundsCheck
    void prefetchobjects(ObjectReference[] buffer) {
      for (int i = 0; i < buffer.length; i++) {
        ObjectReference current = buffer[i];
        current.prefetch();
      }
    }
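To show what a magic type looks like from the Java side, here is a minimal sketch of an address-like class. It is a hypothetical stand-in, not MMTk's real magic classes: `SimAddress` uses `int` offsets and 32-bit words, and a `ByteBuffer` plays the role of memory. A magic-aware compiler would replace the method calls with raw loads and stores instead of running these bodies.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for a magic address type, backed by simulated memory.
final class SimAddress {
    private static final ByteBuffer MEMORY =
            ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN);

    private final int value; // the "raw address" is an index into MEMORY

    private SimAddress(int value) {
        this.value = value;
    }

    static SimAddress fromInt(int value) {
        return new SimAddress(value);
    }

    // Pointer arithmetic: returns a new address, the receiver is unchanged.
    SimAddress plus(int bytes) {
        return new SimAddress(value + bytes);
    }

    // With compiler support this call would become a single load instruction.
    int loadWord(int offset) {
        return MEMORY.getInt(value + offset);
    }

    // With compiler support this call would become a single store instruction.
    void storeWord(int offset, int word) {
        MEMORY.putInt(value + offset, word);
    }
}
```

Code written against such a type stays ordinary Java, which is what lets an AOT compiler translate it without a Java runtime.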
Putting it all together
Option 1: Object File
- Create a .o file of MMTk
  - gcc mmtk.o vmkit.o -o vmkit
- But...
  - No inlining in application code
Option 2: LLVM Bitcode File
- Create a .bc file of MMTk
  - vmkit (-load mmtk.bc) -java HelloWorld
- gcmalloc calls in C++ are linked at runtime
- Inlining in Java code
- Late binding of allocations in VMKit code
- new in applications is inlined with MMTk's malloc
Option 3: Everything is Bitcode
- Create a .bc file of MMTk
- Create a .bc file of VMKit
- Link, optimize and run
What's left
- Implementing the MMTk-VMKit interface
  - Interactions between the GC and the VM
  - In VMKit code, in managed code
- Run benchmarks!
  - Finish implementation with read/write barriers
  - Benchmark with different GCs from MMTk
http://vmkit.llvm.org