Structural Typing on the Java Virtual Machine with invokedynamic




WRIGHT STATE UNIVERSITY Structural Typing on the Java Virtual Machine with invokedynamic by Brian Diekelman A thesis submitted in partial fulfillment for the degree of Bachelor of Science in the Department of Engineering And Computer Science December 2012

Abstract

This thesis describes the implementation of a structurally typed programming language and compiler for the Java Virtual Machine that uses the invokedynamic bytecode instruction and support library. The invokedynamic instruction itself is explained in detail along with descriptions of the call site bootstrapping and linking process. Details are provided on how to construct polymorphic inline caches inside of call sites targeting structural types using trees of method handles. The invokedynamic-based implementation is benchmarked against other structural typing techniques and outperforms Core Reflection API-based structural typing implementations by a factor of two.

Acknowledgements

I would like to thank Dr. T.K. Prasad, my thesis advisor. Additionally, I would like to thank the members of my thesis committee, Dr. Michael Raymer and Dr. Prabhaker Mateti.

Contents

Abstract
Acknowledgements
List of Figures
List of Listings
List of Tables
Abbreviations

1 Introduction
2 JVM Overview
    2.1 Class File Structure
        2.1.1 Field and Method Descriptors
    2.2 Data Endpoints
        2.2.1 Operand Stack
        2.2.2 Local Variables
        2.2.3 Fields
        2.2.4 Method Invocation
    2.3 Bytecode Verification
    2.4 Generalization
3 Invokedynamic
    3.1 Method Handles
    3.2 Method Handle Combinators
        3.2.1 Filters
        3.2.2 Spreaders and Collectors
        3.2.3 Guards
    3.3 Invocation Linking
        3.3.1 invokevirtual Invocation
        3.3.2 invokedynamic Invocation
        3.3.3 Call Site Relinking
        3.3.4 Optimization
4 Structural Typing
    4.1 Nominal Typing
        4.1.1 Case Study: java.io.Closeable
    4.2 Structural Typing
        4.2.1 Structural Interfaces
    4.3 Implementation
        4.3.1 Compile-Time Slot Lookup
        4.3.2 Runtime Slot Lookup
    4.4 Inline Caching
        4.4.1 Monomorphic
        4.4.2 Polymorphic
        4.4.3 Megamorphic
5 Kale Language
    5.1 Overview
        5.1.1 Types
        5.1.2 Operators
    5.2 Compilation
        5.2.1 Invocations
    5.3 Performance
6 Conclusion
Bibliography

List of Figures

2.1 Data Endpoints
2.2 Invocation Bytecode
3.1 Method Handle Composition
3.2 invokevirtual Linking
3.3 invokedynamic Linking
4.1 Single Inheritance Memory Layout
4.2 Structural Inheritance Layout
4.3 Method Handle PIC

Listings

2.1 Java before compilation
2.2 Mnemonic pseudocode describing the binary structure of class files
2.3 Operand stack arithmetic
2.4 Java local variables
2.5 Load and store
2.6 Java class fields
2.7 Load the value of a field onto the stack
2.8 Set the value of a field
2.9 Java class methods
2.10 Java interface declaration
2.11 Java interface invocation
3.1 Java field and method lookup targets
3.2 Reflective lookup using MethodHandles.Lookup
3.3 Classes to execute getfield instructions against
3.4 Nested property accessor bytecode
3.5 virtual method invocation bytecode
3.6 dynamic invocation bytecode
3.7 A bootstrap method
4.1 Nominal types
4.2 Structural types
4.3 Structural interface
4.4 Nominal type inheritance
4.5 Pseudocode: Java with structural types
4.6 An interface call site
5.1 A Java code example
5.2 A Kale code example
5.3 A Kale interface
5.4 A Kale main function
5.5 Operators in Kale
5.6 Operator use in Kale
5.7 Invocations in Kale
5.8 invokedynamic interface call site
5.9 Interface call site runtime bootstrap method
5.10 Generalized interface invoker method
5.11 Static inline cache

List of Tables

2.1 Type Descriptors
2.2 Method Descriptors
2.3 Field Instruction Descriptors
3.1 Method Handle Descriptors
3.2 getfield Descriptors
5.1 Invocation Benchmarks

Abbreviations

    JVM     Java Virtual Machine
    JDK     Java Development Kit
    JSR     Java Specification Request
    MLVM    Multi Language Virtual Machine
    API     Application Programming Interface
    LIFO    Last In First Out

Chapter 1 Introduction

The Java Virtual Machine (JVM) has become ubiquitous in enterprise applications for its platform portability, performance, and stability. Companies such as Google [1], Facebook [2], and Twitter [3] all deploy the JVM at scale in their application stacks. The benefits of the JVM have enticed developers to build implementations of Python, Ruby, and over 100 domain-specific languages that run on the JVM [4]. Languages like Jython and JRuby that target the JVM get access to generational garbage collection, native threading, and a highly optimized Just-In-Time (JIT) compiler.

Regardless of the current level of adoption of other languages, the JVM was designed primarily to run Java. The Java concepts of classes, exceptions, strong typing, and the distinction between primitive and reference types are all represented in code targeting the JVM. Therefore, language implementers are forced to express the concepts of their language in terms of JVM structures. This becomes problematic when the language being implemented doesn't conform to JVM function invocation and typing semantics. In order for languages using non-Java semantics to run on the JVM, significant workarounds must be employed. These workarounds typically increase implementation complexity, cause code bloat, and decrease runtime performance.

However, the Java 7 JVM includes a new feature, invokedynamic, that makes the JVM a more flexible and adaptable language host. invokedynamic is a new bytecode instruction and runtime support library that together allow a lower level of access to class loading and linking operations within the JVM. Language features like structural typing, which defines type equivalence based on types defining the same set of methods, can use this new link access to attain a level of performance that was not previously possible.

Details of the current linking and invocation semantics of the JVM are described in Chapter 2. invokedynamic itself is described in detail in Chapter 3, and its integration into a structural type system is described in Chapter 4. Finally, a compiler for a language that supports structural typing for the JVM through the use of invokedynamic is described in Chapter 5.

Chapter 2 JVM Overview

The Java Virtual Machine (JVM) is a stack-based, hardware-agnostic virtual machine that is defined by the Java Virtual Machine Specification [5]. JVM executable code is defined in terms of portable bytecode [5, 4.10] stored in the class file format.

2.1 Class File Structure

The class file format was originally designed to be a binary storage format for compiled Java code. Because of this, there is a one-to-one correspondence between the structure of a Java source file and the corresponding elements in a class file. Each class file defines one class, which must extend exactly one other class (the superclass) and can implement zero or more interfaces. Each class can contain zero or more fields and zero or more methods. Given an input Java source file such as Listing 2.1, a Java compiler would generate a class file whose bytecode for the method add is similar to Listing 2.2.

    package test;

    class SomeClass {
        int value;

        int add(int b) {
            return value + b;
        }
    }

Listing 2.1: Java before compilation

     1  .class test/SomeClass
     2  .super java/lang/Object
     3  .field value I
     4  .method add(I)I
     5      aload 0
     6      getfield test/SomeClass/value I
     7      iload 1
     8      iadd
     9      ireturn
    10  .end method
    11  .end class

Listing 2.2: Mnemonic pseudocode describing the binary structure of class files

Note that even though no extends declaration was used when declaring the class SomeClass, every class must extend another class, so the compiler automatically generates bytecode that extends the base class of the Java object system, java.lang.Object.

All classes, fields, and methods within the JVM are strongly typed with symbolic type descriptors [5, 4.3]. There are two classifications of types within the JVM: primitive and reference types. Primitive types are unboxed values such as int and boolean [5, 2.3]. Reference types refer to dynamically created instances of a class. Table 2.1 shows the format of the type descriptors that will be used throughout this document.

    Type        Descriptor
    int         I
    boolean     Z
    void        V
    classref    Lclassname;

Table 2.1: Type descriptors for common JVM types [5, 4.3.3].
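The mapping in Table 2.1 is mechanical enough to sketch in a few lines of Java. The helper below is only an illustration: the class name DescriptorSketch and the method descriptorOf are hypothetical and appear neither in the thesis nor in the JDK. It assumes nothing beyond the descriptor grammar of [5, 4.3] and the documented behavior of Class.getName().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: derives a JVM type descriptor from a Class object.
final class DescriptorSketch {
    private static final Map<Class<?>, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put(int.class, "I");
        PRIMITIVES.put(boolean.class, "Z");
        PRIMITIVES.put(void.class, "V");
        PRIMITIVES.put(byte.class, "B");
        PRIMITIVES.put(char.class, "C");
        PRIMITIVES.put(short.class, "S");
        PRIMITIVES.put(long.class, "J");
        PRIMITIVES.put(float.class, "F");
        PRIMITIVES.put(double.class, "D");
    }

    static String descriptorOf(Class<?> type) {
        if (type.isPrimitive()) {
            return PRIMITIVES.get(type);
        }
        if (type.isArray()) {
            // Class.getName() already yields descriptor form for arrays,
            // e.g. "[Ljava.lang.String;", apart from the package separator.
            return type.getName().replace('.', '/');
        }
        // Reference types: L<binary name with slashes>;
        return "L" + type.getName().replace('.', '/') + ";";
    }
}
```

Calling descriptorOf(String.class) yields the Lclassname; form from the last row of Table 2.1, with java/lang/String as the class name.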

2.1.1 Field and Method Descriptors

A field descriptor defines the name and type of a class field. As shown in line 3 of Listing 2.2, the value field of the class is defined with the type descriptor I, which Table 2.1 defines as int (a 32-bit two's-complement integer). A method descriptor is composed of the type descriptors of its parameters and return type [5, 4.3.3]. As shown in line 4 of Listing 2.2, the method add(I)I takes an int parameter and returns an int. Example mappings between Java method declarations and the resulting method descriptors are provided in Table 2.2.

    Java Declaration              Method Descriptor
    void (int a, int b)           (II)V
    boolean (int a, String b)     (ILjava/lang/String;)Z

Table 2.2: Method Descriptors.

2.2 Data Endpoints

In place of directly addressable memory, all data transfers within the JVM occur between so-called data endpoints [6]. Data endpoints are strongly typed memory stores that include an operand stack, a local variable array, class fields, and class methods. The flow of data between data endpoints is represented in Figure 2.1.

[Figure 2.1: Data flowing through endpoints. The operand stack exchanges values with the local variable array (local[0] ... local[n]) through load and store instructions, with class fields through getfield and putfield, and with class methods through the invoke instructions.]

2.2.1 Operand Stack

The operand stack is a fixed-size, Last In First Out (LIFO) data structure [5, 2.6.2] used to perform computation and store intermediary values between other data endpoints. Each method has access to its own private operand stack during execution, the size of which is determined by the max_stack value of the method's Code attribute, which is fixed at compile time. Values are pushed onto the stack either explicitly through load instructions or as a result of executing another instruction, such as a return value being pushed onto the stack after a method invocation. Once values are on the stack, arithmetic can be performed using instructions like iadd and isub (add and subtract, respectively). Listing 2.3 shows example bytecode that loads values onto the stack and adds them.

    ldc 2    // load constant 2 onto the stack
    ldc 3    // load constant 3 onto the stack
    iadd     // add the top two values on the stack

Listing 2.3: Operand stack arithmetic

2.2.2 Local Variables

During execution, a method has access to a fixed number of local variable slots [5, 2.6.1]. Listing 2.4 shows an example Java method annotated with how the locals array is populated.

    static int add(int b) {    // locals[0] = b
        int i = 7;             // locals[1] = i
        int j = 6;             // locals[2] = j
        return b + i + j;
    }

Listing 2.4: Java local variables

Local variables are transferred to and from the operand stack with the load and store opcodes. Listing 2.4 shows how the locals array could be allocated given local variable declarations. Listing 2.5 details how the locals array of Listing 2.4 would interact with the operand stack (see note 1).

    ldc 7       // load the constant 7 onto the stack
    istore 1    // store 7 to locals[1]
    ldc 6       // load the constant 6 onto the stack
    istore 2    // store 6 to locals[2]
    iload 0     // load b from locals[0]
    iload 1     // load i from locals[1]
    iload 2     // load j from locals[2]
    iadd        // add i + j
    iadd        // add b to result of i + j
    ireturn     // return result

Listing 2.5: Load and store

Each load and store instruction in Listing 2.5 is prefixed by an i to indicate the type of value being transferred. Integer values require an i prefix, while references require an a prefix. The size of the locals array is statically defined within bytecode by the max_locals value of the method's Code attribute. Attempting to access an index outside of this limit at runtime causes an exception to be thrown.

Note 1: Compiler optimizations would remove the redundant loads and stores in Listing 2.5, but it is instructive to retain them.
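The interplay between the locals array and the operand stack in Listing 2.5 can be traced by hand with a small simulation. The sketch below is not how a JVM is implemented; it is a hypothetical class (StackMachineSketch) that models the operand stack with an ArrayDeque and the locals with an int array, and replays the instructions of Listing 2.5 one by one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical hand-simulation of Listing 2.5; each statement mirrors one instruction.
final class StackMachineSketch {
    static int add(int b) {
        int[] locals = new int[3];                 // three local variable slots
        Deque<Integer> stack = new ArrayDeque<>(); // the operand stack (LIFO)
        locals[0] = b;                             // the argument arrives in locals[0]

        stack.push(7);                             // ldc 7
        locals[1] = stack.pop();                   // istore 1
        stack.push(6);                             // ldc 6
        locals[2] = stack.pop();                   // istore 2
        stack.push(locals[0]);                     // iload 0
        stack.push(locals[1]);                     // iload 1
        stack.push(locals[2]);                     // iload 2
        stack.push(stack.pop() + stack.pop());     // iadd: i + j
        stack.push(stack.pop() + stack.pop());     // iadd: b + (i + j)
        return stack.pop();                        // ireturn
    }
}
```

With b = 1 the simulated method returns 14, the same value the Java method of Listing 2.4 computes.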

2.2.3 Fields

Field values of class instances are loaded onto the stack with the getfield instruction and written with the putfield instruction. Listing 2.6 defines a class A with an integer field value, and Listing 2.7 shows the bytecode necessary to read from the value field.

    package test;

    class A {
        int value;
    }

Listing 2.6: Java class fields

    aload [objectref:A]
    getfield test/A.value I

Listing 2.7: Load the value of a field onto the stack

After the bytecode in Listing 2.7 executes, the value of the value field of the A instance will be on the top of the stack. The bytecode in Listing 2.8 consumes the two values loaded on the stack and sets the value field of the A instance to 7.

    aload [objectref:A]
    ldc 7
    putfield test/A.value I

Listing 2.8: Set the value of a field

Access to static fields is performed with the getstatic and putstatic instructions, which behave identically to getfield and putfield except that they do not require an object reference to be loaded onto the stack.

2.2.4 Method Invocation

Method invocation is performed by pushing arguments onto the operand stack, then using one of the invocation instructions. Invocation instructions pop all arguments off the operand stack, including the object reference if present, and push the return value if one exists for the target method. Each invocation instruction is referred to as a call site, and the object reference the method is being called on is referred to as the receiver. Listing 2.9 shows different forms of declared methods in Java source, and Figure 2.2 shows Java invocations of those methods and the corresponding invocation bytecode.

    package test;

    class A {
        static int a() { ... }
        int b() { ... }
        int c(int i) { ... }
    }

Listing 2.9: Java class methods

Java source:

    1  a();
    2  obj.b();
    3
    4  obj.c(9);

Bytecode:

    1  invokestatic test/A.a()I
    2  aload [obj]
    3  invokevirtual test/A.b()I
    4  aload [obj]
    5  ldc 9
    6  invokevirtual test/A.c(I)I

Figure 2.2: Mapping Java source to bytecode

The simplest form of invocation is a static method invocation using the invokestatic instruction, which requires just arguments and a symbolic link to the target method, as shown on line 1 of Figure 2.2. Virtual methods (methods declared without the static keyword) are called with the invokevirtual instruction, which requires the object reference of the receiver to be pushed on the stack along with the arguments. If the static type of the receiver is an interface, the invokeinterface instruction must be used, as shown in Listings 2.10 and 2.11.

    package example;

    interface Testable {
        void test(String s);
    }

    class SomeClass implements Testable {
        public void test(String s) { ... }
    }

Listing 2.10: Java interface declaration

    invokeinterface example/Testable.test(Ljava/lang/String;)V

Listing 2.11: Java interface invocation

If invokeinterface is used with an object reference that does not implement the specified interface, an IncompatibleClassChangeError is thrown by the JVM at runtime. More details about the invocation bytecode instructions can be found in Bill Venners's 1997 JavaWorld article [7].

2.3 Bytecode Verification

All data endpoint access instructions embed not only the symbolic link to their target, but also the target's type descriptor. Additionally, instructions generally require that a specific combination of operands with specific types be loaded onto the stack in order for them to execute without error. After the class is loaded into memory and parsed, and before it is made available for execution to other classes within the JVM, the class is run through a verification process to statically prove that each instruction has been provided its required operands in the correct order. Any deviation from the required contract causes the JVM to reject the class with a VerifyError.
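The three invocation forms of Section 2.2.4 can be placed side by side in ordinary Java source. The class below is a hypothetical sketch (InvocationForms and its members are illustrative names, and the interface method returns a String here so that the result can be observed). The comments name the instruction a standard Java compiler is expected to emit for each call, which can be checked with the javap -c tool.

```java
// Hypothetical example: one call per invocation instruction discussed above.
final class InvocationForms {
    interface Testable {
        String test(String s);
    }

    static final class SomeClass implements Testable {
        public String test(String s) { return "tested " + s; }
        int twice(int i) { return 2 * i; }
        static int one() { return 1; }
    }

    static String run() {
        SomeClass obj = new SomeClass();
        Testable viaInterface = obj;          // same object, interface-typed reference

        int a = SomeClass.one();              // invokestatic: no receiver on the stack
        int b = obj.twice(9);                 // invokevirtual: receiver, then argument
        String c = viaInterface.test("x");    // invokeinterface: static type is an interface
        return a + " " + b + " " + c;
    }
}
```

The choice between invokevirtual and invokeinterface depends on the static type of the reference at the call site, not on the runtime class of the object, which is why obj and viaInterface produce different instructions for the same receiver.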

2.4 Generalization

All data endpoint access discussed in this chapter goes through the same process: verify permissions, verify target type, verify operand types, and look up the receiver. The field access instructions can be modeled as functions, as shown in Table 2.3, where T is the field type and R is the receiver type.

    Instruction    Descriptor
    getfield       (R)T
    putfield       (R,T)V
    getstatic      ()T
    putstatic      (T)V

Table 2.3: Field instructions modeled as functions.

With all data endpoint JVM instructions generalized to function invocations, instructions can be composed at runtime, as Chapter 3 will discuss in detail.
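Table 2.3 can be checked against the JDK 7 java.lang.invoke API that Chapter 3 introduces. The sketch below uses a hypothetical receiver class R with an int field standing in for T; the class and field names are illustrative only. It asks MethodHandles.Lookup for the handle corresponding to each field instruction and returns the method type of each handle.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class FieldHandleTypes {
    // Hypothetical receiver type R with fields of type T = int.
    static final class R {
        int instanceField;
        static int staticField;
    }

    // Returns the handle types in the row order of Table 2.3.
    static MethodType[] fieldInstructionTypes() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle getfield = lookup.findGetter(R.class, "instanceField", int.class);
            MethodHandle putfield = lookup.findSetter(R.class, "instanceField", int.class);
            MethodHandle getstatic = lookup.findStaticGetter(R.class, "staticField", int.class);
            MethodHandle putstatic = lookup.findStaticSetter(R.class, "staticField", int.class);
            return new MethodType[] {
                getfield.type(),   // (R)int
                putfield.type(),   // (R,int)void
                getstatic.type(),  // ()int
                putstatic.type()   // (int)void
            };
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each returned MethodType matches the corresponding descriptor in Table 2.3 with T replaced by int.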

Chapter 3 Invokedynamic

Though non-Java languages have been running on the JVM since its initial public release, no real effort was made by the engineers within Sun (now Oracle) to make the JVM a hospitable host to languages with semantics that differed from Java. As more production languages (such as JRuby) used JVM-based runtimes, it became apparent to Sun that more work needed to be done at the JVM level to support these languages. The goal was to make the JVM more flexible and adaptable while still retaining the strong type verification required to enforce Java's type system.

In order to accomplish this, fundamental changes to the JVM were needed. These modifications to core JVM functionality, a so-called renaissance of the JVM, were created as a branch of the OpenJDK project named the Da Vinci Machine Project in mid-2006 and formalized under JSR-292 [8]. The finalized implementation was included in the release of JDK 7. The modifications included low-cost function pointers in the form of method handles, new method linking options with the invokedynamic instruction, and changes to the runtime optimizer.

3.1 Method Handles

Before JSR-292, the smallest composable unit within the JVM was the class. The only available facility to pass invokable references to methods was the Core Reflection API, which was not designed for performance-intensive applications. The Core Reflection API was designed as a meta layer to provide introspection into classes and their declared fields and methods at runtime. The runtime support for Reflection existed outside of the reach of the JVM optimizer, which meant that reflective invocations could not benefit from optimization heuristics such as inlining [9]. Reflective invocations performed access control checks at each invocation, which resulted in significant overhead at runtime [10].

Instead of attempting to redefine the behavior of the Core Reflection API and break backwards compatibility, the JSR-292 Expert Group elected to create a new abstraction. Instead of separate classes to model fields and methods like the existing Core Reflection API, a single abstraction was created to cover both the field access and method invocation bytecode instructions: the method handle.

Method handles are typed function pointers, each described by a method descriptor, that encapsulate field access and method invocation operations. They are represented in the Java standard library in JDK 7 by the MethodHandle class in the java.lang.invoke package. Unlike the Core Reflection Field and Method classes, a method handle does not exist to provide metadata or perform runtime introspection. A method handle is meant purely as an efficient means of invoking a functional representation of a bytecode instruction. Listing 3.1 defines a class A with a single field and three methods, which are resolved using a MethodHandles.Lookup in Listing 3.2.

    package test;

    class A {
        int value;
        static int a() { ... }
        int b() { ... }
        int c(boolean x) { ... }
    }

Listing 3.1: Java field and method lookup targets
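To make the contrast between the two APIs concrete, the following sketch reaches the same method, String.length(), once through the Core Reflection API and once through a method handle. The class and method names are hypothetical, and the example says nothing about relative performance; it only shows the shape of each call. Note that access is checked once, when the handle is looked up, whereas Method.invoke is subject to checks on each call.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class ReflectionVersusHandle {
    // Core Reflection: arguments are boxed into Object values and the result is cast back.
    static int lengthViaReflection(String s) {
        try {
            Method m = String.class.getMethod("length");
            return (Integer) m.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Method handle: the call site is statically typed as (String)int.
    static int lengthViaMethodHandle(String s) {
        try {
            MethodHandle mh = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Both methods return the same value for the same input; the difference lies in how the call is typed and checked.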

CHAPTER 3. INVOKEDYNAMIC 14 1 MethodHandles. Lookup lookup = MethodHandles. lookup ( ) ; 2 MethodHandle mh1 = lookup. f i n d G e t t e r (A. class, value, 3 methodtype ( int. class ) ) ; 4 MethodHandle mh2 = lookup. f i n d S t a t i c (A. class, a, 5 methodtype ( int. class ) ) ; 6 MethodHandle mh3 = lookup. f i n d V i r t u a l (A. class, b, 7 methodtype ( int. class ) ) ; 8 MethodHandle mh4 = lookup. f i n d V i r t u a l (A. class, c, 9 methodtype ( int. class, boolean. class ) ) ; Listing 3.2: Reflective lookup using MethodHandles.Lookup The descriptors of the method handles acquired in Listing 3.2 are listed in Table 3.1. Method Handle mh1 mh2 mh3 mh4 Descriptor (Ltest/A;)I ()I (Ltest/A;)I (Ltest/A;Z)I Table 3.1: The descriptors of the method handles acquired in Listing 3.2. Each MethodHandle can then be called using its invoke(object...) method, which performs the underlying field access or method invocation. 3.2 Method Handle Combinators Now that the previously disjoint operations are unified with a common functional abstraction, they can be combined into composite operations. The potential of method handles is to generate and reconfigure code at runtime without generating and loading raw bytecode into a class loader. Instead, method handles can be composed with one another a function pointer that calls another function pointer and integrated with existing class methods to allow flexible code

CHAPTER 3. INVOKEDYNAMIC 15 structures constructed solely from method handles that the native code optimizer within the JVM is aware of and able to optimize. Every bytecode instruction involving data endpoints is representable as a method handle. These baseline method handles are provided by the MethodHandles class included in the java.lang.invoke package in JDK 7. The MethodHandles class provides method handles that filter arguments and return values, spread and collect vararg arguments, and conditionally branch between method handles. 3.2.1 Filters The filterarguments() and filterreturnvalue() in the MethodHandles class enable method handles to be adapted to call site descriptors or linked together. For example, given the classes A, B, and C in Listing 3.3, evaluating the expression a.b.c.value would require the bytecode listed in Listing 3.4. 1 class A { 2 B b ; 3 } 4 class B { 5 C c ; 6 } 7 class C { 8 int value ; 9 } Listing 3.3: Classes to execute getfield instructions against 1 aload [ a ] 2 getfield A.b 3 getfield B.c 4 getfield C.value

Listing 3.4: Nested property accessor bytecode

This behavior can now be encapsulated into a single method handle by composing multiple method handles and filtering their return values. Table 3.2 lists equations that describe each method handle performing getfield operations for a.b.c.value.

    Bytecode Operation    Equation
    getfield A.b          $md_0 = (A) \rightarrow B$
    getfield B.c          $md_1 = (B) \rightarrow C$
    getfield C.value      $md_2 = (C) \rightarrow I$

Table 3.2: Equation representation of the method handle descriptors.

The return value of each method handle is then filtered through the next method handle in the chain:

    $md_0(A) \rightarrow md_1(B)$
    $md_1(B) \rightarrow md_2(C)$
    $md_2(C) \rightarrow I$

After substitution, the method handle tree simplifies down to a method handle that takes an argument of type A and returns the int value of the value field in class C: $md_0(A) \rightarrow I$. A visual representation of this filtering process is depicted in Figure 3.1.

[Figure 3.1 placeholder: nodes (A)B, (B)C, and (C)I, with intermediate nodes (A)B and (B)I, reducing to (A)I.]

Figure 3.1: A tree of method handles simplified through substitution.

Filtering allows one complex operation to be wrapped into one method handle that can then be invoked, passed as a single object reference to other code, or composed with other method handles for even more complex behavior.
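The substitution above can be sketched with the java.lang.invoke API. This is a minimal illustration, not the thesis implementation: the class name FilterChainDemo and the helper readValue are hypothetical, and the classes of Listing 3.3 are reproduced as nested classes so the example is self-contained.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

public class FilterChainDemo {
    // The classes of Listing 3.3, nested here only to keep the sketch in one file.
    static class C { int value; }
    static class B { C c; }
    static class A { B b; }

    // Builds one (A)int handle equivalent to the expression a.b.c.value.
    static MethodHandle buildChain() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle getB = lookup.findGetter(A.class, "b", B.class);            // (A)B
        MethodHandle getC = lookup.findGetter(B.class, "c", C.class);            // (B)C
        MethodHandle getValue = lookup.findGetter(C.class, "value", int.class);  // (C)int
        MethodHandle aToC = MethodHandles.filterReturnValue(getB, getC);         // (A)C
        return MethodHandles.filterReturnValue(aToC, getValue);                  // (A)int
    }

    // Hypothetical convenience wrapper that hides the checked Throwable of invoke.
    static int readValue(A a) {
        try {
            return (int) buildChain().invoke(a);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        A a = new A();
        a.b = new B();
        a.b.c = new C();
        a.b.c.value = 42;
        System.out.println(readValue(a));
    }
}
```

The composed handle can be stored and passed around like any other MethodHandle, which is the property the filtering discussion relies on.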

3.2.2 Spreaders and Collectors

A method handle whose final parameter is an array, for example one with the descriptor (Object[])Object, can be adapted with the asVarargsCollector(Object[].class) method of the MethodHandle class so that it accepts individual trailing arguments such as A, B, and C. The adapter collects those arguments into the Object[] argument of the underlying method handle. The inverse operation can be performed with the asSpreader(Class<?>, int) method of the MethodHandle class: given a method handle with the descriptor (A,B,C)D, it produces an adapter that accepts an array and spreads the given number of its elements as individual parameters in the original (A,B,C)D form.

3.2.3 Guards

Guards enable the composition of a traditional if-then-else statement using method handles. A guard is constructed by the guardWithTest method of the MethodHandles class, which requires three method handles as arguments. First, a guard predicate which returns a boolean. Second, a target method handle to invoke if the guard predicate is true. Third, a fallback method handle to invoke if the guard predicate is false. Using another guard as the fallback method handle enables the construction of a tree of conditional branches.

3.3 Invocation Linking

The inflexibility of the JVM prior to JDK 7 stems from the rigidity of the invocation instructions. The modifications implemented in the Da Vinci Machine Project that make this linkage process more flexible come in the form of a new invocation instruction, invokedynamic.
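Before moving on to linking, the guard combinator of Section 3.2.3 can be illustrated with a minimal sketch. The class GuardDemo and its three static methods are hypothetical names chosen for this example; only MethodHandles.guardWithTest and the lookup calls belong to the JDK.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GuardDemo {
    // Guard predicate: (Object)boolean
    static boolean isString(Object o) { return o instanceof String; }

    // Target, invoked when the predicate is true: (Object)String
    static String describeString(Object o) { return "string:" + o; }

    // Fallback, invoked when the predicate is false: (Object)String
    static String describeOther(Object o) { return "other:" + o; }

    // Composes the three handles into a single if-then-else handle.
    static MethodHandle buildGuard() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType describeType = MethodType.methodType(String.class, Object.class);
        MethodHandle test = lookup.findStatic(GuardDemo.class, "isString",
                MethodType.methodType(boolean.class, Object.class));
        MethodHandle target = lookup.findStatic(GuardDemo.class, "describeString", describeType);
        MethodHandle fallback = lookup.findStatic(GuardDemo.class, "describeOther", describeType);
        return MethodHandles.guardWithTest(test, target, fallback);
    }

    // Hypothetical wrapper that hides the checked Throwable of invoke.
    static String describe(Object o) {
        try {
            return (String) buildGuard().invoke(o);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("x"));
        System.out.println(describe(7));
    }
}
```

Passing another guard as the fallback argument yields the tree of conditional branches mentioned in Section 3.2.3.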

3.3.1 invokevirtual Invocation

As detailed in Chapter 2, the JVM uses an internal symbolic linking process to link an invocation instruction, a call site, to the target class method. During this process the type descriptor embedded in the call site is compared to the type descriptor of the target method, and an exception is thrown by the JVM if the type descriptors are not compatible. A depiction of the linking process for the invocation instruction in Listing 3.5 is shown in Figure 3.2.

    invokevirtual path/to/OtherClass.otherMethod (I)V

Listing 3.5: Virtual method invocation bytecode

[Figure 3.2 placeholder: the invokevirtual call site inside someMethod(OtherClass o) is handed to the JVM internal linker, which resolves OtherClass.otherMethod.]

Figure 3.2: JVM runtime internally resolves symbolic link.

3.3.2 invokedynamic Invocation

Instead of a hardwired symbolic link to the target method, invokedynamic call sites embed a symbolic link to a static bootstrap method which returns a CallSite object. The CallSite object contains a MethodHandle which points to the invocation target method. Simply put, instead of specifying a target method, an invokedynamic call site specifies whom to ask for a target method. Along with the bootstrap method symbolic link, invokedynamic call sites also embed the target method name and type descriptor, which act as a contract that must be fulfilled by the method handle in the call site returned by the bootstrap method. An example

invokedynamic instruction and corresponding linkage visualization are depicted in Listing 3.6 and Figure 3.3, respectively.

[Figure 3.3 placeholder: the invokedynamic call site inside someMethod(OtherClass o) names path/to/LanguageRuntime, "bootstrap", and the signature (Lookup, String, MethodType)CallSite. The JVM internal linker calls LanguageRuntime.bootstrap, which resolves a handle, and the invokedynamic placeholder is replaced with the call site returned by the bootstrap.]

Figure 3.3: JVM runtime defers to user code bootstrap to resolve symbolic link.

    invokedynamic
        otherMethod (OtherClass, I)V            // signature being invoked
        path/to/LanguageRuntime.bootstrap       // CallSite provider
        (Lookup, String, MethodType)CallSite    // bootstrap signature

Listing 3.6: Dynamic invocation bytecode

The bootstrap method uses the reflective MethodHandles.Lookup class to obtain method handles, and can optionally perform any combination of the method handle adaptations discussed in Section 3.2.1. The bootstrap method is defined in JVM bytecode just like any other method, as opposed to the internal JVM process that links the other invocation instructions. Listing 3.7 defines an example bootstrap method.

    class LanguageRuntime {
        static CallSite bootstrap(Lookup lookup,
                                  String methodName,
                                  MethodType type) {
            MethodHandle mh = // resolve a MethodHandle ...
            return new MutableCallSite(mh);
        }
    }

Listing 3.7: A bootstrap method

3.3.3 Call Site Relinking

CallSite is an abstract base class extended by ConstantCallSite and MutableCallSite. A ConstantCallSite cannot be re-targeted to another method handle, while a MutableCallSite can be updated with its setTarget(MethodHandle) method. A ConstantCallSite generally has better performance at the price of flexibility, but a MutableCallSite is useful for building adaptive call sites that are able to rebuild themselves in response to the receivers invoked on them.

3.3.4 Optimization

The primary feature that sets method handles and the invokedynamic infrastructure apart from other JVM features like the Core Reflection API is that method handles are designed to take advantage of many of the optimizations applied to normal bytecode [11]. Specifically, method handles can be inlined and constant folded as the method handle tree symbols are evaluated and optimized at runtime.
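The relinking behavior of Section 3.3.3 can be sketched without emitting an invokedynamic instruction, by driving a MutableCallSite through the handle returned by its dynamicInvoker() method. The class RelinkDemo and its methods are hypothetical names for this illustration; a real call site would be linked by the JVM through a bootstrap method such as the one in Listing 3.7.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class RelinkDemo {
    static String first() { return "first"; }
    static String second() { return "second"; }

    // Invokes the same call site before and after it is re-targeted.
    static String[] callBeforeAndAfterRelink() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);

            // The call site initially targets first().
            MutableCallSite site =
                    new MutableCallSite(lookup.findStatic(RelinkDemo.class, "first", type));

            // dynamicInvoker() always dispatches to the call site's current target.
            MethodHandle invoker = site.dynamicInvoker();
            String before = (String) invoker.invoke();

            // Relink the call site; the invoker handle itself is unchanged.
            site.setTarget(lookup.findStatic(RelinkDemo.class, "second", type));
            String after = (String) invoker.invoke();

            return new String[] { before, after };
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = callBeforeAndAfterRelink();
        System.out.println(results[0] + " then " + results[1]);
    }
}
```

A ConstantCallSite offers no setTarget operation that succeeds, which is the flexibility trade-off described above.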

Chapter 4

Structural Typing

A type system that requires equivalent types to have the same declared name, regardless of whether they define identical fields and methods, is referred to as a nominal (or nominative) type system. A type system that ignores the declared name of types and only requires equivalent types to share common field or method declarations is referred to as a structural type system.

4.1 Nominal Typing

As described in Chapter 2, the JVM uses nominal types throughout. Type names are declared in Java source, embedded into bytecode instructions, and enforced at runtime by the JVM. As shown in Listing 4.1, even though classes A and B are structurally identical, they are not considered equivalent within Java or by the JVM at runtime, and therefore cannot be used interchangeably.

    class A {
        void test() { ... }
    }
    class B {
        void test() { ... }
    }
    void m1(A a) {
        a.test();
        m2(a); // illegal
    }
    void m2(B b) {
        b.test();
        m1(b); // illegal
    }

Listing 4.1: Nominal types

4.1.1 Case Study: java.io.Closeable

Prior to the release of J2SE 1.5 in 2004, the Java standard library developers noticed they had several classes in the java.io package that all had close() methods. They wanted to be able to write generalized code that could operate on all of these classes, but the classes all had different hierarchies (they extended different base classes), so they could not be addressed as one common type. To solve this problem, the java.io.Closeable interface was introduced. This interface defines only a close() method that throws an IOException, and is meant to enable objects that represent resources such as files and network connections to be closed without the caller knowing anything about the class implementation other than the fact that it implements close().

However, this presented a problem when dealing with the large collections of code already written. Existing classes that had close() methods but did not explicitly implement the Closeable interface (because it didn't exist when the code was written) could not be passed as an argument to code expecting an instance of Closeable. Due to Java's nominal type system, existing code would have to be modified to add the implements Closeable declaration to every relevant class definition and then be recompiled and released as a new version.

One solution to the Closeable problem would be to create a way for classes that do not explicitly implement the Closeable interface to still be treated as equivalent to the Closeable interface

as long as they implement that single close() method. That concept is a form of structural typing.

4.2 Structural Typing

Listing 4.2 demonstrates both the benefits and potential pitfalls of structural typing. Similar to Listing 4.1, two structurally equivalent types are defined. The difference is that, since Cowboy and Rectangle are structurally identical and they are defined within a structural type system, they can be used interchangeably on lines 9 and 13.

     1  class Cowboy {
     2      void draw() { ... }
     3  }
     4  class Rectangle {
     5      void draw() { ... }
     6  }
     7  void drawGun(Cowboy cowboy) {
     8      cowboy.draw();
     9      paint(cowboy); // legal
    10  }
    11  void paint(Rectangle rect) {
    12      rect.draw();
    13      drawGun(rect); // legal
    14  }

Listing 4.2: Structural types

This approach solves the java.io.Closeable problem. Code could be written within this system that operates on both the Cowboy and Rectangle types uniformly, even though they have no explicitly declared relation to one another. New interfaces can be written that existing classes implicitly implement without modification.

However, this interchangeable usage should immediately be cause for concern. The draw() method in Listing 4.2, while lexically identical in

each class, has a different meaning within the domain of each class. In the context of a Cowboy, to draw() means to draw a gun. In the context of a Rectangle, draw() means to render itself. This is an example of the ambiguity of the English language leaking into what should be a more formal definition within a programming language. This accidental equivalence, an edge case within structural type systems, can be mitigated through a hybrid use of structural typing with structural interfaces.

4.2.1 Structural Interfaces

There is a design compromise between structural typing and nominal typing that solves the java.io.Closeable problem while minimizing accidental equivalence. Described with Java-like syntax in Listing 4.3, the hybrid approach uses nominal typing to determine equivalence of two classes and structural typing to determine whether a type conforms to an interface.

    class Cowboy {
        void draw() { ... }
        int getHeight() { ... }
    }
    class Rectangle {
        void draw() { ... }
        int getHeight() { ... }
    }
    interface Height {
        int getHeight();
    }

Listing 4.3: Structural interface

In Listing 4.3, Cowboy and Rectangle both conform to the Height interface by implementing the getHeight() method and can be provided to any receiver that is expecting an instance of Height. However, even though Cowboy and Rectangle have identical structures they are

not equivalent to each other and cannot be used interchangeably, thereby limiting accidental equivalence errors.

4.3 Implementation

The implementation challenges of structural interfaces stem from the fact that each concrete type (a type that corresponds to a specific memory layout) has the potential to implicitly implement a large number of interfaces. An interface like java.io.Closeable in a structural type system could be implemented by hundreds of concrete types, so when a function accepts an instance of java.io.Closeable as a parameter and calls its close() method, the compiler has no idea which of the hundreds of implementing types will be passed to the function at runtime.

If the fields and methods of a concrete type are generalized as slots (a slot being an offset within the memory layout of the concrete type), the problem can be described as whether the compiler can calculate the offset of a slot based only on information available at the call site at compile time, or whether the offset of the slot cannot be determined until runtime.

4.3.1 Compile-Time Slot Lookup

Each name of a class in Java corresponds to one and only one memory layout of the fields and methods of that class. Listing 4.4 defines several Java classes, and Figure 4.1 shows an approximation of the corresponding memory layout.

    class Base {
        void sayHello() { ... }
    }
    class Class1 extends Base {
        void m1() { ... }
        void sayHello() { ... }
    }
    class Class2 extends Base {
        void m2() { ... }
        void sayHello() { ... }
        void m3() { ... }
    }
    class Class3 extends Class2 {
        void m2() { ... }
        void sayHello() { ... }
        void m4() { ... }
    }

Listing 4.4: Nominal type inheritance

    Offset    Base          Class1        Class2        Class3
    [0]       sayHello()    sayHello()    sayHello()    sayHello()
    [1]                     m1()          m2()          m2()
    [2]                                   m3()          m4()

Figure 4.1: An abstract representation of class memory layout.

Given a pointer to an instance of Base, the compiler knows at compile time that sayHello() will always be located at offset [0]. The same relationship exists between Class2 and its subclass, Class3. Given a pointer to any class that extends Class2, m2() will always be at offset [1]. Each subclass aligns its lower offsets with its base class structure, then uses its higher offsets for its own structure. This predictability enables a compiler to calculate the offset of any field or method of a class and generate a simple indirect load or jump instruction to access it.

4.3.2 Runtime Slot Lookup

Listing 4.5 shows pseudocode of two interfaces and two classes. Class1 and Class2 both implement the Stream interface by implementing close(). Class2 also implements the Gettable

interface by implementing get().

     1  interface Gettable {
     2      void get();
     3  }
     4  interface Stream {
     5      void close();
     6  }
     7  class Class1 {
     8      void close() { ... }
     9  }
    10  class Class2 {
    11      void close() { ... }
    12      void get() { ... }
    13  }
    14  void call(Stream stream) {
    15      stream.close();
    16  }

Listing 4.5: Pseudocode: Java with structural types

    Offset    Gettable    Stream     Class1     Class2
    [0]       get()       close()    close()    close()
    [1]                                         get()

Figure 4.2: An abstract representation of class memory layout.

When calling into concrete types, the offset of a given method was known at compile time. With structural typing this is no longer the case, as is evident from the memory layout of Class2 in Figure 4.2. Since Class2 implements both the Gettable and Stream interfaces, it must allocate a slot in its memory layout for both the get() and close() methods. However, both of those methods are defined at offset [0] in their respective interfaces. If a compiler places the close() method at offset [0] of Class2, any call site receiving an instance of Class2 as a Gettable cannot predict at compile time the offset at which the get() method will be located. The

inverse is also true if the get() method is placed at offset [0]. This problem is aggravated when more than two interfaces are considered across multiple concrete types.

The offset that must be resolved on the receiver from the call site on line 15 cannot be determined through static analysis. The call site could be provided with an instance of either Class1 or Class2 at runtime. Regardless of which type is provided, the call site must be able to efficiently look up the offset of the close() method on the provided type and invoke it. This ad-hoc type introspection requirement has the potential to introduce significant overhead per invocation.

Prior to invokedynamic, the rigid nature of the invocation bytecode instructions meant the options to look up and call methods on types at runtime were limited. Implementing a feature like structural typing on the JVM required either reflection or a code generation technique that generated inline type checks at the call site at compile time [12]. Neither option was attractive, since reflection introduced significant overhead and the code generation technique cannot account for types that are unknown at compile time.

With invokedynamic, method lookup overhead can be reduced by having the call site itself cache references to methods by receiver type as they are invoked. This approach has the potential to completely eliminate the method lookup as long as the receiver type is in the call site cache.

4.4 Inline Caching

An inline cache is a dispatch table embedded in a call site that builds an associative cache of receiver type metadata in order to optimize future invocations. Inline caching was first developed for use in Smalltalk implementations in the 1980s [13] in order to improve dynamic

invocation performance. Listing 4.6 defines an interface call site that could benefit from inline caching.

    interface Gettable {
        void get();
    }
    class A {
        void get() { ... }
    }
    class B {
        void get() { ... }
    }
    class C {
        void get() { ... }
    }
    void call(Gettable g) {
        g.get();
    }

Listing 4.6: An interface call site

On the first invocation of call(Gettable) in Listing 4.6, the inline cache of the call site g.get() will be empty. If, for example, an instance of class A is passed as the parameter on the first invocation, a full lookup has to be performed to examine the methods implemented in class A and find get(). Once it is resolved, however, the offset of get() within class A is added to the inline cache as an association between the receiver type A and the method handle of get() within type A.

Depending on how many receivers a call site is associated with, it can be referred to as monomorphic, polymorphic, or megamorphic.

4.4.1 Monomorphic

According to research by Oracle's HotSpot engineering team, 90% of call sites only target one concrete method over their entire lifetime [14]. These call sites, called monomorphic, retain a reference to the previously invoked receiver type. Upon subsequent invocation, the previous receiver type is compared to the new receiver type and, if they match, the cached receiver metadata can be used to jump directly to the target method.

The design decisions inherent to monomorphic call sites all revolve around the caching heuristics. Primarily, what happens on a cache miss? If a call site replaces its cached receiver type each time the new receiver doesn't match, the overhead of cache maintenance could dominate execution time when a call site is called with two types in an alternating invocation pattern. Conversely, choosing not to replace the cached receiver type could miss an optimization opportunity. One approach to avoid these issues, developed as an extension to the monomorphic inline cache by Sun Microsystems for the Self project, is to simply grow the cache to accommodate new receiver types [15].

4.4.2 Polymorphic

A polymorphic inline cache is an inline cache that stores multiple receiver types. Pseudocode for a polymorphic inline cache for the call site in Listing 4.6 is demonstrated in Algorithm 1.

    switch receiver.type do
        case A: A.get();
        case B: B.get();
        case C: C.get();
        otherwise: lookup(receiver);
    endsw

Algorithm 1: Inline cache method dispatch

For the roughly 10% of call sites that are invoked against more than one receiver type, a polymorphic inline cache is allocated with initial space to store several receiver types. If a new receiver type is used at the call site, it falls through to the slower lookup() function, which performs a linear search on the receiver type for the target method, then adds it to the cache. This behavior can be implemented with a tree of method handles as depicted in Figure 4.3.

[Figure 4.3 placeholder: a chain of three guards testing the receiver against A.class, B.class, and C.class in turn. The true branch of each guard invokes get on the matching type, the false branch falls through to the next guard, and the final false branch calls lookup(receiver).]

Figure 4.3: A polymorphic inline cache implemented with nested method handles.
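The guard tree of Figure 4.3 can be sketched with MethodHandles.guardWithTest. This is a fixed, hand-built tree for illustration only, under a few assumptions: the class name GuardTreeDemo and the helpers lookup and dispatch are hypothetical, get() returns a String (rather than void, as in Listing 4.6) so that the selected branch is observable, and the slow path merely reports a miss instead of searching the receiver.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GuardTreeDemo {
    // Receiver types of Listing 4.6; get() returns a String here only for observability.
    static class A { public String get() { return "A"; } }
    static class B { public String get() { return "B"; } }
    static class C { public String get() { return "C"; } }

    // Hypothetical stand-in for lookup(receiver) in Algorithm 1.
    static String lookup(Object receiver) {
        return "miss:" + receiver.getClass().getSimpleName();
    }

    static MethodHandle buildTree() throws ReflectiveOperationException {
        MethodHandles.Lookup l = MethodHandles.lookup();
        MethodType generic = MethodType.methodType(String.class, Object.class);
        // Class.isInstance(Object) has the handle type (Class, Object)boolean.
        MethodHandle isInstance = l.findVirtual(Class.class, "isInstance",
                MethodType.methodType(boolean.class, Object.class));

        // Start at the leaf (the slow path) and wrap guards around it,
        // so that the test for A ends up at the root of the tree.
        MethodHandle tree = l.findStatic(GuardTreeDemo.class, "lookup", generic);
        for (Class<?> type : new Class<?>[] { C.class, B.class, A.class }) {
            MethodHandle test = isInstance.bindTo(type);              // (Object)boolean
            MethodHandle target = l
                    .findVirtual(type, "get", MethodType.methodType(String.class))
                    .asType(generic);                                 // (Object)String
            tree = MethodHandles.guardWithTest(test, target, tree);
        }
        return tree;
    }

    // Hypothetical wrapper that hides the checked Throwable of invoke.
    static String dispatch(Object receiver) {
        try {
            return (String) buildTree().invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new B()));
        System.out.println(dispatch("not cached"));
    }
}
```

In an adaptive cache the tree would be built incrementally on cache misses rather than up front.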

Each guard in the tree is constructed with the guardWithTest() factory method in the MethodHandles class, with its test bound to the isInstance(Object) method of the java.lang.Class object of the receiver type. If the guard succeeds, it immediately invokes a method handle that references the concrete method. If the guard fails, the fallback of the guard is the guard of the next receiver type check.

There are several optimization variations that influence how this tree can be constructed. New types can be added to the root of the tree or inserted before the final lookup(receiver) call. Invocation counts could be tracked, bubbling up the most invoked receiver type to the root of the tree so that it is checked first in order to speed up dispatch time.

However, the tree cannot grow without bound. The inline cache must be designed to handle the 1% of cases where an unmanageable number of types flow through one call site.

4.4.3 Megamorphic

A call site that handles an inordinate number of receiver types is referred to as megamorphic. In this state, efficient polymorphic dispatch is not possible, since too much time would be consumed walking the cache entries and comparing them to the new receiver. Megamorphic call sites often revert to an unoptimized vtable-based dispatch mechanism [16].

Chapter 5

Kale Language

5.1 Overview

Kale is a strongly typed programming language with structural interfaces that has been designed to prototype structural typing on the JVM using invokedynamic. Many design elements of Kale are inspired by the Google Go [17] programming language.

5.1.1 Types

Type declarations in Kale are analogous to classes in Java. Types can be declared with zero or more fields and methods, all of which are annotated with type names. Listings 5.1 and 5.2 show how Java syntax maps to Kale syntax.

    package test;
    class Person {
        int age;
        String name;
        String getName() {
            return this.name;
        }
    }

Listing 5.1: A Java code example

    package test
    type Person {
        age int;
        name string;
        getName() string {
            return this.name;
        }
    }

Listing 5.2: A Kale code example

Kale uses trailing type annotations instead of the leading type annotations of most C-derived languages. Google Go's development team found this allows more complex declarations to be read left to right [18] instead of in the spiral pattern of C [19]. Regardless of the small syntactical differences, both Listing 5.1 and 5.2 compile to the same bytecode.

Structurally typed interfaces are declared with the interface keyword and can only contain method declarations, with no method bodies. Listing 5.3 defines an interface A that is implemented by type T. Note that if the return type of a method is omitted, void is assumed.

    package test
    interface A {
        a();
    }
    type T {
        a() { }
    }

Listing 5.3: A Kale interface

Kale, unlike Java, allows functions to be declared outside the scope of a class. Listing 5.4 shows a legal Kale program with only a main function declared.

    package test
    main() {

    }

Listing 5.4: A Kale main function

5.1.2 Operators

Kale implements a simplified version of operator overloading. Listing 5.5 demonstrates a Vector type that has a + operator.

    package test
    type Vector {
        x int;
        y int;
        z int;
        + operator(v1 Vector, v2 Vector) Vector {
            v = Vector();
            v.x = v1.x + v2.x;
            v.y = v1.y + v2.y;
            v.z = v1.z + v2.z;
            return v;
        }
    }

Listing 5.5: Operators in Kale

Operators are made possible by Kale's relaxed tokenization rules. The only special lexeme types are control characters like parentheses, braces, and brackets, or punctuation like semicolons, commas, and full stops. All other lexemes, even arithmetic operators, are considered symbols. Operators are detected in the parser by an occurrence of three consecutive symbols, in which case the first and third symbols are considered the operands and the second symbol is considered the operator. Usage of the operator defined in Listing 5.5 is shown in Listing 5.6.

    v1 = Vector();
    v1.x = 2;
    v2 = Vector();
    v2.x = 3;

    v3 = v1 + v2;
    // assert v3.x == 5

Listing 5.6: Operator use in Kale

Operator invocation is constructed as a chain of function invocations, so another way of writing v1 + v2 would be v1.+(v2), and v1 + v2 + v3 would be v1.+(v2).+(v3). Parentheses can alter evaluation order, turning v1 + (v2 + v3) into v1.+(v2.+(v3)). This ignores the usual order of operations for basic arithmetic, but makes reasoning about operator use in complex types like a Vector or Point more straightforward.

5.2 Compilation

The Kale compiler accepts Kale source and generates JVM bytecode as output. The compiler is implemented in Java and uses a custom lexer and recursive descent parser to build an abstract

syntax tree (AST). JVM bytecode is emitted directly from the AST through calls to the ASM bytecode generation library [20], and can be loaded directly into a ClassLoader and executed or packaged into a jar file.

Since Kale programs differ slightly from Java in structure and content, some conversions must be applied in order to map Kale programs to class files. First, since Kale allows a broader range of characters in symbols compared to Java, symbols used for class and method names must be converted to an equivalent unqualified name as defined by the JVM specification [5, §4.2.2]. Additionally, functions defined outside the scope of a class within a package must be packaged inside of a class. For this reason, a functions class is generated within the declared package that contains all package-level functions.

5.2.1 Invocations

Since Kale uses a mix of concrete types and structural interfaces, invocations generate a mix of invokevirtual and invokedynamic instructions. Listing 5.7 defines an interface, Gettable, and a type that implements that interface, Stream. In the main() function a new instance of Stream is created and passed as a parameter to callGet(Gettable). The invocation g.get() generates an invokedynamic call site, since it is being invoked against an interface. All other invocations, including the call to internal() within the Stream class, use an invokevirtual instruction.

    package some.test
    interface Gettable {
        get();
    }
    type Stream {
        get() {
            internal();
        }
        internal() {
            // no op
        }
    }
    callGet(g Gettable) {
        g.get();
    }
    main() {
        s = Stream();
        callGet(s);
    }

Listing 5.7: Invocations in Kale

It is tempting to simply replace every invocation with an invokedynamic instruction to simplify code generation and allow for maximum flexibility. However, the MethodHandle lookup and CallSite construction still have a non-zero cost compared to the JVM internal linking, which requires no reification or simulation.

Listing 5.8 shows the invokedynamic instruction that would be generated from g.get() in Listing 5.7. This invocation specifies the Kale runtime library static method that will provide a call site, and takes advantage of the ability to embed parameters directly into the call site. By specifying the name of the interface as a String parameter, the bootstrap method can verify whether a given receiver conforms to that interface at runtime.

    invokedynamic get (Ljava/lang/Object;)V [
        // handle kind 0x6 : invokestatic
        kale/runtime/Bootstrap.bootstrap(
            (Ljava/lang/invoke/MethodHandles$Lookup;
             Ljava/lang/String;
             Ljava/lang/invoke/MethodType;
             Ljava/lang/String;
            )Ljava/lang/invoke/CallSite;
        )
        // arguments:
        "some.test.Gettable"
    ]

Listing 5.8: invokedynamic interface call site

Listing 5.9 shows the corresponding bootstrap method declaration that will be called by the JVM runtime in order to provide a call site for the interface call site.

    public static CallSite bootstrap(MethodHandles.Lookup caller,
                                     String name,
                                     MethodType type,
                                     String targetType)
            throws ClassNotFoundException
    {
        // load the interface named by the extra call site argument
        Class<?> targetClass = Class.forName(
                targetType,
                true,
                caller.lookupClass().getClassLoader());

        InterfaceCallSite ics = new InterfaceCallSite(targetClass,
                                                      name,
                                                      type);

        // bind the call site to the generic invoker and collect caller arguments
        ics.rootHandle = MethodHandles
                .insertArguments(callHandle, 0, ics)
                .asVarargsCollector(Object[].class)
                .asType(type);

        ics.setTarget(ics.rootHandle);

        return ics;
    }

Listing 5.9: Interface call site runtime bootstrap method

The role of the bootstrap method is to populate the initial state of the call site with enough information to upgrade itself in response to new receiver types as they are encountered. The

bootstrap method creates an instance of InterfaceCallSite, which retains metadata about the interface class and method being invoked, and binds it to a generic method handle, callHandle, which refers to the call method described in Listing 5.10. Using insertArguments sets the InterfaceCallSite as the first argument to call, then asVarargsCollector collects any arguments provided by the caller into an array of Objects that can be re-spread using asSpreader once the method is resolved. The InterfaceCallSite will act as a polymorphic inline cache for subsequent invocations.

    public static Object call(InterfaceCallSite ics,
                              Object o,
                              Object[] args)
            throws Throwable
    {
        // upgrade call site cache if necessary,
        // then look up the target method on o and invoke
    }

Listing 5.10: Generalized interface invoker method

A MethodHandle pointing to call is set as the root of a MethodHandle tree similar to the tree depicted in Figure 4.3. Each subsequent invocation of call indicates that all cached receiver tests (the guard tests higher in the MethodHandle tree) have failed, which is a cache miss, so a new guard testing for the receiver type is set as the new root of the tree with its fallback set to the current root.

Each InterfaceCallSite retains a cacheDepth integer which tracks how many cached types the call site has in its MethodHandle tree. Once this exceeds a preset threshold (currently five), the call site is marked as megamorphic by setting InterfaceCallSite.cacheDepth to -1. After that, the cache is discarded and all subsequent invocations perform a reflective lookup against the receiver.
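The cache growth and megamorphic fallback described above can be sketched as follows. This is not the Kale runtime: CachingCallSite, miss, and missesAfter are hypothetical names, the receivers are assumed to declare a public String get() method, argument collection and interface conformance checks are omitted, and the slow path uses a MethodHandles.Lookup search instead of the Core Reflection API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class AdaptiveCacheDemo {
    static class A { public String get() { return "A"; } }
    static class B { public String get() { return "B"; } }

    // Hypothetical stand-in for InterfaceCallSite with call site type (Object)Object.
    static final class CachingCallSite extends MutableCallSite {
        static final int MAX_DEPTH = 5;   // threshold named in the text
        final MethodHandles.Lookup lookup;
        final String name;
        final MethodHandle isInstance;    // (Class, Object)boolean
        final MethodHandle missHandle;    // (Object)Object, bound to this call site
        int cacheDepth = 0;               // -1 marks the call site as megamorphic
        int misses = 0;                   // bookkeeping for the demonstration only

        CachingCallSite(MethodHandles.Lookup lookup, String name, MethodType type)
                throws ReflectiveOperationException {
            super(type);
            this.lookup = lookup;
            this.name = name;
            this.isInstance = lookup.findVirtual(Class.class, "isInstance",
                    MethodType.methodType(boolean.class, Object.class));
            this.missHandle = lookup.findVirtual(CachingCallSite.class, "miss",
                    MethodType.methodType(Object.class, Object.class))
                    .bindTo(this).asType(type);
            setTarget(missHandle);        // an empty cache: every call is a miss
        }

        // Slow path: resolve the method on the receiver, then grow or discard the cache.
        Object miss(Object receiver) throws Throwable {
            misses++;
            MethodHandle target = lookup
                    .findVirtual(receiver.getClass(), name, MethodType.methodType(String.class))
                    .asType(type());
            if (cacheDepth >= 0 && cacheDepth < MAX_DEPTH) {
                // New guard becomes the root; the previous root becomes its fallback.
                MethodHandle test = isInstance.bindTo(receiver.getClass());
                setTarget(MethodHandles.guardWithTest(test, target, getTarget()));
                cacheDepth++;
            } else {
                cacheDepth = -1;          // megamorphic: drop the guard tree
                setTarget(missHandle);
            }
            return target.invoke(receiver);
        }
    }

    // Invokes get() on each receiver through one call site and reports the miss count.
    static int missesAfter(Object... receivers) {
        try {
            CachingCallSite site = new CachingCallSite(MethodHandles.lookup(), "get",
                    MethodType.methodType(Object.class, Object.class));
            MethodHandle invoker = site.dynamicInvoker();
            for (Object receiver : receivers) {
                Object ignored = invoker.invoke(receiver);
            }
            return site.misses;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(missesAfter(new A(), new B(), new A(), new B()));
    }
}
```

With two receiver types alternating, only the first encounter of each type reaches the slow path; later invocations are served by the guards.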

5.3 Performance

Structural invocation through a MethodHandle cache was benchmarked against several other invocation techniques to test its performance. Three types A, B, and C are defined within both Kale and Java with a single no-argument method m that returns a string constant. Within Java, all three types declare that they implement interface I, which also defines method m.

The baseline performance metric is an invokeinterface instruction, which is heavily optimized and can take advantage of the limited number of implemented types inherent to explicit interface implementation as discussed in Chapter 4.

Next, two Core Reflection API-based implementations, invokereflection and invokecachedreflection, were implemented for comparison with the findings of Dubochet and Odersky in their Scala-oriented structural typing implementation [12]. invokereflection does a full method lookup from the receiver Class object on each invocation, and invokecachedreflection implements a simple cache of the resolved Method objects.

Additionally, a generative technique, invokestaticinline, which was also discussed as an option by Dubochet and Odersky, was implemented. It defines an inline cache that is statically defined at compile time with a fixed set of types. The invokestaticinline approach is completely impractical in the real world, but it is important because it is written in Java, as opposed to the native implementation of invokeinterface. If the MethodHandle runtime optimizer were sufficiently smart, it would be able to generate equivalent native code (see Listing 5.11).

Finally, invokemethodhandle is the MethodHandle-based adaptive polymorphic inline cache.
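For orientation, the cached reflection technique (invokecachedreflection) can be sketched roughly as below. The benchmark source is not reproduced in the text, so this is an assumption about its general shape: the class CachedReflectionDemo, its fields, and the choice of a HashMap keyed by receiver class are all hypothetical.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

public class CachedReflectionDemo {
    public static class A { public String m() { return "a"; } }
    public static class B { public String m() { return "b"; } }

    // One cache per call site; keyed by receiver class because a call site
    // always invokes the same method name.
    private final Map<Class<?>, Method> cache = new HashMap<>();
    private final String name;
    int lookups = 0;   // counts full reflective lookups, for demonstration only

    CachedReflectionDemo(String name) {
        this.name = name;
    }

    Object invoke(Object receiver) {
        try {
            Method method = cache.get(receiver.getClass());
            if (method == null) {
                // Cache miss: full lookup on the receiver's Class object.
                method = receiver.getClass().getMethod(name);
                cache.put(receiver.getClass(), method);
                lookups++;
            }
            return method.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        CachedReflectionDemo site = new CachedReflectionDemo("m");
        Object[] receivers = { new A(), new B(), new A(), new B() };
        for (Object receiver : receivers) {
            System.out.println(site.invoke(receiver));
        }
        System.out.println("lookups: " + site.lookups);
    }
}
```

Dropping the map and calling getMethod on every invocation gives the uncached variant (invokereflection).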

```java
if (o instanceof A) {
    ((A) o).m();
} else if (o instanceof B) {
    ((B) o).m();
} else if (o instanceof C) {
    ((C) o).m();
}
```

Listing 5.11: Static inline cache

For each approach, one instance each of A, B, and C was created as an invocation target. Then 100,000 iterations of invocations were performed, alternating A, B, and C as parameters to a function that accepted a parameter of type I, where an invocation was performed against method m. The benchmark was performed on a 3.20 GHz 64-bit AMD Athlon II X3 450 in Windows 7 running JVM 1.7.0_07 with arguments -server -XX:+AggressiveOpts. The execution times, averaged over three runs of a cold JVM, are displayed in Table 5.1.

| Technique              | Execution Time (ms) |
|------------------------|---------------------|
| invokeinterface        | 4                   |
| invokereflection       | 296                 |
| invokecachedreflection | 45                  |
| invokemethodhandle     | 22                  |
| invokestaticinline     | 5                   |

Table 5.1: Execution time of interface invocations.

Over multiple test executions, invokemethodhandle consistently outperformed the reflective inline cache by a factor of two. invokemethodhandle was still not as fast as invokeinterface, but invokeinterface is a highly optimized native implementation that has been developed over several years. invokestaticinline is the most telling of the results, since it represents a theoretical best-case performance. The runtime optimizer for method handles is new in JDK 7, so it is possible that it will gain performance as it matures relative to the other JVM runtime code generators.
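A timing loop in the spirit of the procedure described above might look like the following. It is only a rough sketch and not the harness used to produce Table 5.1: the class AlternatingBenchmark, the time method and its sink parameter are assumptions, the Function-based dispatch requires Java 8 or later (the measurements in this chapter were taken on JDK 7), and a single-shot timer of this kind is sensitive to JIT warm-up, so its numbers should not be compared directly with the table.

```java
import java.util.function.Function;

final class AlternatingBenchmark {
    interface I { String m(); }
    static final class A implements I { public String m() { return "A"; } }
    static final class B implements I { public String m() { return "B"; } }
    static final class C implements I { public String m() { return "C"; } }

    static final int ITERATIONS = 100_000; // iteration count from the text

    // Baseline: the compiler emits an invokeinterface instruction here.
    static String viaInterface(I receiver) {
        return receiver.m();
    }

    // Static inline cache over a fixed set of types, as in Listing 5.11.
    static String viaStaticInline(Object o) {
        if (o instanceof A) {
            return ((A) o).m();
        } else if (o instanceof B) {
            return ((B) o).m();
        } else if (o instanceof C) {
            return ((C) o).m();
        }
        throw new IllegalArgumentException("unexpected receiver: " + o.getClass());
    }

    // Returns elapsed milliseconds. The sink records a value derived from every
    // call so the loop body cannot be eliminated as dead code.
    static long time(Function<Object, String> invoker, int[] sink) {
        Object[] targets = { new A(), new B(), new C() };
        int total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            total += invoker.apply(targets[i % 3]).length(); // alternate A, B, C
        }
        long elapsed = System.nanoTime() - start;
        sink[0] = total;
        return elapsed / 1_000_000;
    }
}
```

A call such as AlternatingBenchmark.time(o -> AlternatingBenchmark.viaInterface((AlternatingBenchmark.I) o), new int[1]) times the baseline; passing AlternatingBenchmark::viaStaticInline times the static inline cache.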

Chapter 6 Conclusion

A structurally typed programming language was developed to target the new JVM invokedynamic bytecode instruction and runtime support libraries. After the compiler was implemented, the resulting compiled programs using invokedynamic were benchmarked against existing solutions, including the invokeinterface instruction and inline caches constructed using the Core Reflection API. The invokedynamic solution outperformed the Core Reflection API-based solutions by a factor of two, but could not match the performance of the native invokeinterface instruction.

The invokedynamic standard library API is powerful and the promise of abstraction without penalty of indirection is compelling. The potential applications of invokedynamic and method handles in general are widespread, including the possibility of pointers on the JVM and specializing data structure execution paths as runtime conditions change. As the method handle runtime optimizer is improved in future versions of the JVM, invokedynamic could become a viable general purpose replacement for the other invocation instructions.

Bibliography

[1] Google. App Engine Java Overview, 2012. URL https://developers.google.com/appengine/docs/java/overview.

[2] High Scalability. Facebook's New Real-Time Messaging System: HBase To Store 135+ Billion Messages A Month, 2010. URL http://highscalability.com/blog/2010/11/16/facebooks-new-real-time-messaging-system-hbase-to-store-135.html.

[3] Twitter. Twitter search is now 3x faster, 2011. URL http://engineering.twitter.com/2011/04/twitter-search-is-now-3x-faster_1656.html.

[4] Janice J. Heiss. The JVM Language Summit, 2012. URL http://www.oracle.com/technetwork/articles/java/jvmsummit-1715195.html.

[5] Tim Lindholm, Frank Yellin, Gilad Bracha, and Alex Buckley. The Java Virtual Machine Specification, Java SE 7 Edition. Oracle, 2012. URL http://docs.oracle.com/javase/specs/jvms/se7/jvms7.pdf.

[6] Charles Oliver Nutter. invokedynamic, 2012. URL http://www.slideshare.net/CharlesNutter/jax-2012-invoke-dynamic-keynote.

[7] Bill Venners. How the Java virtual machine handles method invocation and return, 1997. URL http://www.javaworld.com/jw-06-1997/jw-06-hood.html.

[8] Oracle. JSR-292, 2012. URL http://jcp.org/en/jsr/detail?id=292.

[9] John Rose. method handles in a nutshell, 2008. URL https://blogs.oracle.com/jrose/entry/method_handles_in_a_nutshell.

[10] John R. Rose. Bytecodes meet Combinators: invokedynamic on the JVM. Workshop on Virtual Machines and Intermediary Languages, OOPSLA 2009, 2009.

[11] John Rose. Direct method handles, 2012. URL https://wikis.oracle.com/display/HotSpotInternals/Direct+method+handles.

[12] Gilles Dubochet and Martin Odersky. Compiling structural types on the JVM: a comparison of reflective and generative techniques from Scala's perspective. In Proceedings of the 4th workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems, ICOOOLPS '09, pages 34-41, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-541-3. doi: 10.1145/1565824.1565829. URL http://doi.acm.org/10.1145/1565824.1565829.

[13] Allan M. Schiffman and L. Peter Deutsch. Efficient implementation of the Smalltalk-80 system. In Proceedings of the 11th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, POPL '84, 1984.

[14] John Rose. Hotspot Internals for OpenJDK - CallingSequences - VirtualCalls, 2011. URL https://wikis.oracle.com/display/hotspotinternals/virtualcalls.

[15] C. Chambers, D. Ungar, and U. Hölzle. Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. In Proceedings of the ECOOP '91 Conference, Lecture Notes in Computer Science, vol. 512. Springer-Verlag, Berlin, 1991.

[16] OpenJDK Compiler Team. Hotspot Internals for OpenJDK - Compiler - Overview of CompiledIC and CompiledStaticCall, 2012. URL https://wikis.oracle.com/display/HotSpotInternals/Overview+of+CompiledIC+and+CompiledStaticCall.

[17] Google Go Development Team. The Go Programming Language Specification, 2012. URL http://golang.org/ref/spec.

[18] Andrew Gerrand. Go's Declaration Syntax, 2010. URL http://blog.golang.org/2010/07/gos-declaration-syntax.html.

[19] David Anderson. The Clockwise/Spiral Rule, 1994. URL http://c-faq.com/decl/spiral.anderson.html.

[20] Remi Forax and Eric Bruneton. ASM Bytecode Library, 2000, 2012. URL http://asm.ow2.org/.