Precise and Efficient Garbage Collection in VMKit with MMTk



Similar documents
A Lazy Developer Approach: Building a JVM with Third Party Software

Replication on Virtual Machines

Java Coding Practices for Improved Application Performance

Fachbereich Informatik und Elektrotechnik SunSPOT. Ubiquitous Computing. Ubiquitous Computing, Helmut Dispert

Validating Java for Safety-Critical Applications

Virtual Machine Learning: Thinking Like a Computer Architect

Language Based Virtual Machines... or why speed matters. by Lars Bak, Google Inc

CSCI E 98: Managed Environments for the Execution of Programs

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding

HOTPATH VM. An Effective JIT Compiler for Resource-constrained Devices

LaTTe: An Open-Source Java VM Just-in Compiler

General Introduction

Performance Tools for Parallel Java Environments

Jonathan Worthington Scarborough Linux User Group

Practical Performance Understanding the Performance of Your Application

How to Spice Up Java with the VTune Performance Analyzer?

How accurately do Java profilers predict runtime performance bottlenecks?

Application-only Call Graph Construction

Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

Mobile Application Development Android

Uncovering Performance Problems in Java Applications with Reference Propagation Profiling

Zing Vision. Answering your toughest production Java performance questions

IBM SDK, Java Technology Edition Version 1. IBM JVM messages IBM

Tachyon: a Meta-circular Optimizing JavaScript Virtual Machine

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

A JIT Compiler for Android s Dalvik VM. Ben Cheng, Bill Buzbee May 2010

JVM Tool Interface. Michal Pokorný

Port of the Java Virtual Machine Kaffe to DROPS by using L4Env

Write Barrier Removal by Static Analysis

Eclipse Visualization and Performance Monitoring

A staged static program analysis to improve the performance of runtime monitoring

Garbage Collection in the Java HotSpot Virtual Machine

CSC 551: Web Programming. Spring 2004

Crash Course in Java

picojava TM : A Hardware Implementation of the Java Virtual Machine

Contents. Java - An Introduction. Java Milestones. Java and its Evolution

CSC230 Getting Starting in C. Tyler Bletsch

Characterizing Java Virtual Machine for More Efficient Processor Power Management. Abstract

WebSphere Architect (Performance and Monitoring) 2011 IBM Corporation

Rootbeer: Seamlessly using GPUs from Java

Cloud Computing. Up until now

PC Based Escape Analysis in the Java Virtual Machine

Virtual Servers. Virtual machines. Virtualization. Design of IBM s VM. Virtual machine systems can give everyone the OS (and hardware) that they want.

Chapter 2 System Structures

EMSCRIPTEN - COMPILING LLVM BITCODE TO JAVASCRIPT (?!)

The Design of the Inferno Virtual Machine. Introduction

System Structures. Services Interface Structure

INTRODUCTION TO OBJECTIVE-C CSCI 4448/5448: OBJECT-ORIENTED ANALYSIS & DESIGN LECTURE 12 09/29/2011

Free Java textbook available online. Introduction to the Java programming language. Compilation. A simple java program

Online Optimizations Driven by Hardware Performance Monitoring

Multi-core Programming System Overview

Free Java textbook available online. Introduction to the Java programming language. Compilation. A simple java program

Thesis Proposal: Improving the Performance of Synchronization in Concurrent Haskell

Performance Measurement of Dynamically Compiled Java Executions

Experimental Evaluation of Distributed Middleware with a Virtualized Java Environment

Compilers and Tools for Software Stack Optimisation

CSE 452: Programming Languages. Acknowledgements. Contents. Java and its Evolution

Java Garbage Collection Basics

Constant-Time Root Scanning for Deterministic Garbage Collection

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design

The Java Virtual Machine and Mobile Devices. John Buford, Ph.D. Oct 2003 Presented to Gordon College CS 311

To Java SE 8, and Beyond (Plan B)

Using jvmstat and visualgc to Solve Memory Management Problems

umps software development

Habanero Extreme Scale Software Research Project

Chapter 3 Operating-System Structures

An Overview of Java. overview-1

Can You Trust Your JVM Diagnostic Tools?

XBOX Performances XNA. Part of the slide set taken from: Understanding XNA Framework Performance Shawn Hargreaves GameFest 2007

Object-Relative Addressing: Compressed Pointers in 64-bit Java Virtual Machines

Lecture 32: The Java Virtual Machine. The Java Virtual Machine

Armed E-Bunny: A Selective Dynamic Compiler for Embedded Java Virtual Machine Targeting ARM Processors

Transcription:

Precise and Efficient Garbage Collection in VMKit with MMTk LLVM Developer s Meeting Nicolas Geoffray nicolas.geoffray@lip6.fr

2 Background VMKit: Java and.net on top of LLVM Uses LLVM s JIT for executing code Uses Boehm for GC Performance bottlenecks No dynamic optimization Conservative GC

Time normalized to VMKit 5.00 3.75 2.50 1.25 CPU-intensive Benchmarks (JGF) MacOSX 10.5 8 x X86_32 2.66GHz 12Go 0 series lufact heapsort crypt fft sor matmult VMKit Jikes Sun 3

Time normalized to VMKit 1.50 1.13 0.75 0.38 VM-intensive Benchmarks (Dacapo) MacOSX 10.5 8 x X86_32 2.66GHz 12Go 0 antlr fop luindex bloat jython pmd VMKit Jikes Sun 4

5 Execution Overheads antlr fop luindex 7% 7% 10% 11% 65% 9% 2% 8% 6% 75% 5% 2% 8% 12% 73% bloat jython pmd 18% 5% 7% 17% 53% 24% 4% 12% 11% 49% 4% 20% 1% 20% 17% 38% Application Allocations Collections System.arraycopy Interface calls Others

6 Execution Overheads antlr fop luindex 7% 7% 10% 11% 65% 2% 9% 8% 6% 21% 14% 20% 75% 5% 2% 8% 12% 73% bloat jython pmd 7% 5% 35% 18% 53% 23% 24% 4% 49% 37% 17% 12% 11% 4% 20% 1% 20% 17% 38% Application Allocations Collections System.arraycopy Interface calls Others

7 Goal: replace Boehm with MMTk MMTk is JikesRVM s GC Framework for writing GCs Multiple GC Implementations (Copying, Mark and trace, Immix) Copying collectors require precise stack scanning Locate pointers on the stack

8 But... it s in Java? Yes, but nothing to be afraid of: Use of Magic tricks No use of runtime features (exceptions, inheritance) No use of standard library Use VMKit s AOT compiler Transform MMTk into a.bc file

9 Outline Introduction Precise garbage collection Compiling MMTk with VMJC Putting it all together What s left

Introduction Precise garbage collection Compiling MMTk with VMJC Putting it all together What s left 10

11 Precise Garbage Collection Write code that locates pointers in the stack llvm.gcroot in JIT-generated code llvm.gcroot in VMKit s runtime written in C++ Use LLVM s GC framework to generate stack maps Caml stack maps for llvm-g++ generated code JIT stack maps for JIT-generated code

12 Precise Garbage Collection App.java VMKit.cpp llvm-g++ VMKit.exe Caml Stackmap JITted App Stackmaps for precise stack scanning JIT Stackmap

13 Stack Scanning Problem: interweaving of different kinds of functions Application s managed (Java or C#) functions: trusted VMKit s C++ functions: trusted Application s JNI functions: untrusted Solution: create a side-stack for frame addresses Updated upon entry of a kind of method VMKit knows the kind of each frame on the thread stack

14 Type of methods Trusted Has a stack map, so can manipulate objects (llvm.gcroot) Saves frame pointer (llvm::noframepointerelim) Untrusted Has no stack map, so should not manipulate objects May not save the frame pointer

15 Stack Scanning Example (1) VMKit.main (C++)

16 Stack Scanning Example (1) push(enter Java) VMKit.main (C++) App.main (Java)

17 Stack Scanning Example (1) push(enter Java) VMKit.main (C++) App.main (Java) App.function (Java)

18 Stack Scanning Example (1) push(enter Java) push(enter native) VMKit.main (C++) App.main (Java) App.function (Java) VMKit.runtime (C++)

19 Stack Scanning Example (1) push(enter Java) push(enter native) push(enter Java) VMKit.main (C++) App.main (Java) App.function (Java) VMKit.runtime (C++) App.function2 (Java)

20 Stack Scanning Example (1) push(enter Java) VMKit.main (C++) App.main (Java) App.function (Java) push(enter native) push(enter Java) fp VMKit.runtime (C++) App.function2 (Java)

21 Stack Scanning Example (1) push(enter Java) VMKit.main (C++) App.main (Java) App.function (Java) push(enter native) push(enter Java) fp fp VMKit.runtime (C++) App.function2 (Java)

22 Stack Scanning Example (1) push(enter Java) push(enter native) push(enter Java) fp fp fp VMKit.main (C++) App.main (Java) App.function (Java) VMKit.runtime (C++) App.function2 (Java)

23 Stack Scanning Example (1) push(enter Java) push(enter native) push(enter Java) fp fp fp fp VMKit.main (C++) App.main (Java) App.function (Java) VMKit.runtime (C++) App.function2 (Java)

24 Stack Scanning Example (2) VMKit.main (C++)

25 Stack Scanning Example (2) push(enter Java) VMKit.main (C++) App.main (Java)

26 Stack Scanning Example (2) push(enter Java) push(enter JNI) VMKit.main (C++) App.main (Java) App.function (JNI)

27 Stack Scanning Example (2) push(enter Java) push(enter JNI) VMKit.main (C++) App.main (Java) App.function (JNI) App.function2 (JNI)

28 Stack Scanning Example (2) push(enter Java) push(enter JNI) push(enter native) VMKit.main (C++) App.main (Java) App.function (JNI) App.function2 (JNI) VMKit.jniRuntime (Java)

29 Stack Scanning Example (2) push(enter Java) push(enter JNI) VMKit.main (C++) App.main (Java) App.function (JNI) saved fp App.function2 (JNI) push(enter native) VMKit.jniRuntime (Java)

30 Stack Scanning Example (2) push(enter Java) push(enter JNI) push(enter native) fp saved fp VMKit.main (C++) App.main (Java) App.function (JNI) App.function2 (JNI) VMKit.jniRuntime (Java)

31 Stack Scanning Example (2) push(enter Java) push(enter JNI) push(enter native) fp fp saved fp VMKit.main (C++) App.main (Java) App.function (JNI) App.function2 (JNI) VMKit.jniRuntime (Java)

32 Running the GC A precise GC scans the stacks at safe points: point during execution where the GC can know the type of each value on the stack

33 Single-threaded Application GC always triggered at safe points gcmalloc instrunctions Collector::collect()

34 Multi-threaded Application When entering a GC, must wait for all threads to join Don t use signals! or no safe point Use a thread-local variable to poll on method entry and backward branches Scan stacks of threads blocked in JNI or system calls

35 Application changes for GC public static void runloop(int a) { } while (a--) System.out.println( Hello World );

36 Application changes for GC public static void runloop(int a) { if (getthreadid().dogc) GC() while (a--) { System.out.println( Hello World ); if (getthreadid().dogc) GC() } }

Introduction Precise garbage collection Compiling MMTk with VMJC Putting it all together What s left 37

38 What is VMJC? An Ahead of Time compiler (AOT) Generates.bc files from.class files Use of llvm tools to generate platform-dependant files shared library: llc -relocation-model=pic + gcc executable: llc + ld vmkit + gcc

39 Goal: compile MMTk with VMJC Generate a.bc file that can be linked with VMKit Interface MMTK VMKit (e.g. threads synchronization, stack scanning) Interface VMKit MMTk (e.g. gcmalloc)

40 Why MMTk does not need a Java runtime? No use of runtime features synchronizations, exceptions, inheritance No use of standard library HashMap, LinkedList, ArrayList

41 How MMTk is manipulating pointers? Definition of Magic classes and methods Address, Word, Offset Word Address.loadWord(Offset) Magic classes and methods translated by the compiler [VEE 09] Similar mechanism than Inline ASM for C

42 Example (Frampton [VEE 09]) Inline ASM in C Magic in Java void prefetchobjects( OOP *buffer, int size) { for(int i=0;i < size;i++){ OOP o = buffer[i]; asm volatile( "prefetchnta (%0)" :: "r" (o)); } } @NoBoundsCheck void prefetchobjects( ObjectReference[] buffer) { for(int i=0;i<buffer.length;i++) { ObjectReference current = buffer[i]; current.prefetch(); } }

Introduction Precise garbage collection Compiling MMTk with VMJC Putting it all together What s left 43

44 Option 1: Object File Create a.o file of MMTk gcc mmtk.o vmkit.o -o vmkit But... No inlining in application code

45 Option 2: LLVM Bitcode File Create a.bc file of MMTk vmkit (-load mmtk.bc) -java HelloWorld gcmalloc in C++ are linked at runtime Inlining in Java code Late binding of allocations in VMKit code new in applications are inlined with MMTk s malloc

46 Option 3: Everything is Bitcode Create a.bc file of MMTk Create a.bc file of VMKit Link, optimize and run

Introduction Precise garbage collection Compiling MMTk with VMJC Putting it all together What s left 47

48 What s left Implementing the MMTK VMKit interface Interactions between the GC and the VM In VMKit code, in managed code Run benchmarks! Finish implementation with read/write barriers Benchmark with different GCs from MMTk

http://vmkit.llvm.org 49