Compilation 2012 The Java Virtual Machine

Similar documents
Chapter 7D The Java Virtual Machine

University of Twente. A simulation of the Java Virtual Machine using graph grammars

language 1 (source) compiler language 2 (target) Figure 1: Compiling a program

v11 v12 v8 v6 v10 v13

CSC 8505 Handout : JVM & Jasmin

02 B The Java Virtual Machine

Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

The Java Virtual Machine (JVM) Pat Morin COMP 3002

Virtual Machines. Case Study: JVM. Virtual Machine, Intermediate Language. JVM Case Study. JVM: Java Byte-Code. JVM: Type System

Habanero Extreme Scale Software Research Project

AVR 32-bit Microcontroller. Java Technical Reference. 1. Introduction. 1.1 Intended audience. 1.2 The AVR32 Java Extension Module

Under the Hood: The Java Virtual Machine. Lecture 24 CS 2110 Fall 2011

Introduction. Application Security. Reasons For Reverse Engineering. This lecture. Java Byte Code

Java Card 3 Platform. Virtual Machine Specification, Classic Edition. Version May 2015

The Java Virtual Machine Specification. Java SE 8 Edition

The Java Virtual Machine and Mobile Devices. John Buford, Ph.D. Oct 2003 Presented to Gordon College CS 311

- Applet java appaiono di frequente nelle pagine web - Come funziona l'interprete contenuto in ogni browser di un certo livello? - Per approfondire

Chapter 4 Lecture 5 The Microarchitecture Level Integer JAVA Virtual Machine

Inside the Java Virtual Machine

Retargeting Android Applications to Java Bytecode

EXTENDING JAVA VIRTUAL MACHINE TO IMPROVE PERFORMANCE OF DYNAMICALLY-TYPED LANGUAGES. Yutaka Oiwa. A Senior Thesis

Java Virtual Machine Locks

Structural Typing on the Java Virtual. Machine with invokedynamic

1 The Java Virtual Machine

Java Virtual Machine, JVM

HOTPATH VM. An Effective JIT Compiler for Resource-constrained Devices

Java Programming. Binnur Kurt Istanbul Technical University Computer Engineering Department. Java Programming. Version 0.0.

What is Software Watermarking? Software Watermarking Through Register Allocation: Implementation, Analysis, and Attacks

Built-in Concurrency Primitives in Java Programming Language. by Yourii Martiak and Mahir Atmis

picojava TM : A Hardware Implementation of the Java Virtual Machine

An Energy Consumption Model for Java Virtual Machine

Proceedings of the Java Virtual Machine Research and Technology Symposium (JVM '01)

Platform Independent Dynamic Java Virtual Machine Analysis: the Java Grande Forum Benchmark Suite

Armed E-Bunny: A Selective Dynamic Compiler for Embedded Java Virtual Machine Targeting ARM Processors

The Java Virtual Machine Specification

Briki: a Flexible Java Compiler

Checking Access to Protected Members in the Java Virtual Machine

General Introduction

Stack Allocation. Run-Time Data Structures. Static Structures

A Java Virtual Machine Architecture for Very Small Devices

HVM TP : A Time Predictable and Portable Java Virtual Machine for Hard Real-Time Embedded Systems JTRES 2014

C Compiler Targeting the Java Virtual Machine

Security Architecture and Verification of Java Bytecode

Virtual Machine Showdown: Stack Versus Registers

LaTTe: An Open-Source Java VM Just-in Compiler

Real-time Java Processor for Monitoring and Test

Java Interview Questions and Answers

Programming the Android Platform. Logistics

Lecture 32: The Java Virtual Machine. The Java Virtual Machine

Instruction Set Architecture

Cloud Computing. Up until now

Automated Detection of Non-Termination and NullPointerExceptions for Java Bytecode

X86-64 Architecture Guide

Checking Access to Protected Members in the Java Virtual Machine

Instruction Set Architecture (ISA)

Writing new FindBugs detectors

Jonathan Worthington Scarborough Linux User Group

Java Obfuscation Salah Malik BSc Computer Science 2001/2002

Code Sharing among Virtual Machines

Lab Experience 17. Programming Language Translation

Instrumenting Java bytecode

Programming Languages

CS412/CS413. Introduction to Compilers Tim Teitelbaum. Lecture 20: Stack Frames 7 March 08

Monitors and Exceptions : How to Implement Java efficiently. Andreas Krall and Mark Probst Technische Universitaet Wien

Prototyping Faithful Execution in a Java Virtual Machine

MCI-Java: A Modified Java Virtual Machine Approach to Multiple Code Inheritance

Aalborg University. Master Thesis. Assessing Bit Flip Attacks and Countermeasures. Modelling Java Bytecode. Group:

A Prototype Tool for JavaCard Firewall Analysis

CS5460: Operating Systems

How To Safely Insert In Joie (Java)

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

Comp 411 Principles of Programming Languages Lecture 34 Semantics of OO Languages. Corky Cartwright Swarat Chaudhuri November 30, 20111

a storage location directly on the CPU, used for temporary storage of small amounts of data during processing.

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Explain the relationship between a class and an object. Which is general and which is specific?

Obfuscation: know your enemy

First Java Programs. V. Paúl Pauca. CSC 111D Fall, Department of Computer Science Wake Forest University. Introduction to Computer Science

An Introduction to Assembly Programming with the ARM 32-bit Processor Family

Optimizations for a Java Interpreter Using Instruction Set Enhancement

Replication on Virtual Machines

Nullness Analysis of Java Bytecode via Supercompilation over Abstract Values

Security testing: a key challenge for software engineering. Yves Le Traon, yves.letraon@uni.lu Professor, Univ of Luxembourg

Automated Termination Proofs for Java Programs with Cyclic Data

UIL Computer Science for Dummies by Jake Warren and works from Mr. Fleming

CSCI E 98: Managed Environments for the Execution of Programs

1/20/2016 INTRODUCTION

Stack machines The MIPS assembly language A simple source language Stack-machine implementation of the simple language Readings:

Java and Java Virtual Machine Security

Can You Trust Your JVM Diagnostic Tools?

Compilers I - Chapter 4: Generating Better Code

Compiler and Language Processing Tools

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

EC 362 Problem Set #2

Java Card 2.2 Virtual Machine Specification. Sun Microsystems, Inc. 901 San Antonio Road Palo Alto, CA USA fax

Fundamentals of Java Programming

Mike Zusman 3/7/2011. OWASP Goes Mobile SANS AppSec Summit 2011

Pemrograman Dasar. Basic Elements Of Java

7. Virtual Machine I: Stack Arithmetic 1

Chapter 1 Fundamentals of Java Programming

Transcription:

Compilation 2012 The Jan Midtgaard Michael I. Schwartzbach Aarhus University

Virtual Machines in Compilation Abstract Syntax Tree compile Virtual Machine Code interpret compile Native Binary Code 2

Virtual Machines in Compilation Abstract Syntax Tree compile Virtual Machine Code interpret compile Virtual Machine Code interpret compile Virtual Machine Code interpret compile Native Binary Code 3

Compiling Virtual Machine Code Example: gcc translates into RTL, optimizes RTL, and then compiles RTL into native code Advantages: facilitates code generators for many targets Disadvantage: a code generator must be built for each target 4

Interpreting Virtual Machine Code Examples: P-code for Pascal interpreters Postscript code for display devices Java bytecode for the Advantages: easy to generate code the code is architecture independent bytecode can be more compact Disadvantage: poor performance (naively 5-100 times slower) 5

Designing A Virtual Machine The instruction set may be more or less high-level A balance must be found between: the work of the compiler the work of the interpreter In the extreme case, there is only one instruction: compiler guy: execute "program" interpreter guy: print "result" The resulting sweet spot involves: doing as much as possible at compile time exposing the program structure to the interpreter minimizing the size of the generated code being able to verify security&safety policies on compiled code 6

Components of the JVM: stack (per thread) heap constant pool code segment program counter (per thread) (we ignore multiple threads in this presentation) 7

The Java Stack The stack consists of frames: sp operand stack lsp locals arguments this Note how a frame of the call stack contains smaller operand stack for storing temporary values The number of local slots in and the size of a frame are fixed at compile-time 8

The Java Heap The heap consists of objects: fields runtime type 9

The Java Constant Pool The constant pool consists of all constant data: numbers strings symbolic names of classes, interfaces, and fields 10

The Java Code Segment The code segment consists of bytecodes of variable sizes: pc 11

Java Bytecodes A bytecode instruction consists of: a one-byte opcode a variable number of arguments (offsets or pointers to the constant pool) It consumes and produces some stack elements Constants, locals, and stack elements are typed: addresses (a) primitive types (i,c,b,s,f,d,l) 12

Class Files Java compilers generate class files: magic number (0xCAFEBABE) minor version/major version constant pool access flags this class super class interfaces fields methods attributes (extra hints for the JVM or other applications) 13

Class Loading Classes are loaded lazily when first accessed Class name must match file name Super classes are loaded first (transitively) The bytecode is verified Static fields are allocated and given default values Static initializers are executed 14

From Methods to Bytecode A simple Java method: public int Abs(int i) { if (i < 0) return(i * -1); else return(i); }.method public Abs(I)I // int argument, int result.limit stack 2 // stack with 2 locations.limit locals 2 // space for 2 locals // --locals-- --stack--- iload_1 // [ x -3 ] [ -3 * ] ifge Label1 // [ x -3 ] [ * * ] iload_1 // [ x -3 ] [ -3 * ] iconst_m1 // [ x -3 ] [ -3-1 ] imul // [ x -3 ] [ 3 * ] ireturn // [ x -3 ] [ * * ] Label1: iload_1 ireturn.end method Comments show trace of: x.abs(-3) 15

A Sketch of a Bytecode Interpreter The core of a VM consists of a fetch-decode-execute loop: pc = code.start; while(true) { npc = pc + instruction_length(code[pc]); switch (opcode(code[pc])) { case ILOAD_1: push(locals[1]); break; case ILOAD: push(locals[code[pc+1]]); break; case ISTORE: t = pop(); locals[code[pc+1]] = t; break; case IADD: t1 = pop(); t2 = pop(); push(t1 + t2); break; case IFEQ: t = pop(); if (t==0) npc = code[pc+1]; }... } pc = npc; break; 16

Instructions The JVM has 256 instructions for: arithmetic operations branch operations constant loading operations locals operations stack operations class operations method operations See the JVM specification for the full list 17

Arithmetic Operations ineg i2c iadd isub imul idiv irem [...:i] [...:-i] [...:i] [...:i%65536] [...:i:j] [...:i+j] [...:i:j] [...:i-j] [...:i:j] [...:i*j] [...:i:j] [...:i/j] [...:i:j] [...:i%j] iinc k i [...] [...] locals[k]=locals[k]+i 18

Branch Operations goto L [...] [...] branch always ifeq L [...:i] [...] branch if i==0 ifne L [...:i] [...] branch if i!=0 ifnull L [...:a] [...] branch if a==null ifnonnull L [...:a] [...] branch if a!=null 19

Branch Operations if_icmpeq L [...:i:j] [...] branch if i==j if_icmpne L if_icmplt L if_icmpgt L if_icmpge L if_icmple L if_acmpeq L [...:a:b] [...] branch if a==b if_acpmne L 20

Constant Loading Operations iconst_0 [...] [...:0] iconst_1 [...] [...:1]... iconst_5 [...] [...:5] aconst_null ldc i [...] [...:null] [...] [...:i] More precisely, the argument of ldc is an index into the constant pool of the current class, and the constant at that index is pushed. ldc s [...] [...:String(s)] Again, the argument to ldc is actually an index into the constant pool 21

Locals Operations iload k [...] [...:locals[k]] istore k [...:i] [...] locals[k]=i aload k [...] [...:locals[k]] astore k [...:a] [...] locals[k]=a 22

Field Operations getfield f sig [...:a] [...:a.f] putfield f sig [...:a:v] [...] a.f=v getstatic f sig [...] [...:C.f] putstatic f sig [...:v] [...] C.f=v More precisely, the argument to these operations is an index in the constant pool which must contain the signature of the corresponding field. 23

Stack Operations dup [...:v] [...:v:v] pop [...:v] [...] swap [...v:w] [...:w:v] nop [...] [...] dup_x1 dup_x2 [...:v:w] [...:w:v:w] [...:u:v:w] [...:w:u:v:w] 24

Class Operations new C [...] [...:a] instanceof C checkcast C [...:a] [...:i] if (a==null) i==0 else i==(type(a) C) [...:a] [...:a] if (a!=null) &&!type(a) C) throw ClassCastException 25

Method Operations invokevirtual name sig m=lookup(type(a),name,sig) [...:a:v 1 :...:v n ] [...(:v)] push frame of size m.locals+m.stacksize locals[0]=a locals[1]=v 1... locals[n]=v n pc=m.code invokestatic invokespecial invokeinterface 26

Method Operations ireturn [...:i] [...] return i and pop stack frame areturn [...:a] [...] return a and pop stack frame return [...] [...] pop stack frame 27

A Java Method public boolean member(object item) { if (first.equals(item)) return true; else if (rest == null) return false; else return rest.member(item); } 28

Generated Bytecode.method public member(ljava/lang/object;)z.limit locals 2 // locals[0] = x // locals[1] = item.limit stack 2 // initial stack [ * * ] aload_0 // [ x * ] getfield Cons/first Ljava/lang/Object; // [ x.first *] aload_1 // [ x.first item] invokevirtual java/lang/object/equals(ljava/lang/object;)z // [bool *] ifeq else_1 // [ * * ] iconst_1 // [ 1 * ] ireturn // [ * * ] else_1: aload_0 // [ x * ] getfield Cons/rest LCons; // [ x.rest * ] aconst_null // [ x.rest null] if_acmpne else_2 // [ * * ] iconst_0 // [ 0 * ] ireturn // [ * * ] else_2: aload_0 // [ x * ] getfield Cons/rest LCons; // [ x.rest * ] aload_1 // [ x.rest item ] invokevirtual Cons/member(Ljava/lang/Object;)Z // [ bool * ] ireturn // [ * * ].end method 29

Bytecode Verification Bytecode cannot be trusted to be well-behaved Before execution, it must be verified Verification is performed: at class loading time at runtime A Java compiler must generate verifiable code 30

Verification: Syntax The first 4 bytes of a class file must contain the magic number 0xCAFEBABE The bytecodes must be syntactically correct 31

Verification: Constants and Headers Final classes are not subclassed Final methods are not overridden Every class except Object has a superclass All constants are legal Field and method references have valid signatures 32

Verification: Instructions Branch targets are within the code segment Only legal offsets are referenced Constants have appropriate types All instructions are complete Execution cannot fall of the end of the code 33

Verification: Dataflow Analysis and Type Checking At each program point, the stack always has the same size and types of objects No uninitialized locals are referenced Methods are invoked with appropriate arguments Fields are assigned appropriate values All instructions have appropriate types of arguments on the stack and in the locals 34

Verification: Gotcha.method public static main([ljava/lang/string;)v.throws java/lang/exception.limit stack 2.limit locals 1 ldc -21248564 invokevirtual java/io/inputstream/read()i return java Fake Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V) Expecting to find object/array on stack 35

Verification: Gotcha Again.method public static main([ljava/lang/string;)v.throws java/lang/exception.limit stack 2.limit locals 2 iload_1 return java Fake Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V) Accessing value from uninitialized register 1 36

Verification: Gotcha Once More ifeq A ldc 42 goto B A: ldc "fortytwo" B: java Fake Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V Mismatched stack types 37

Verification: Gonna Getcha Every Time A: iconst_5 goto A java Fake Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V Inconsistent stack height 1!= 0 38

Alternative: Proof-Carrying Code Elegant verification approach to enforce safety and security policies based on theorem proving methods E.g., allows distribution of native code while maintaining the safety guarantees of VM code No trust in the originator is needed 39

JVM Implementation A naive bytecode interpreter is slow State-of-the-art JVM implementations are not: http://kano.net/javabench 40

JVM Implementation A naive bytecode interpreter is slow State-of-the-art JVM implementations are not: However: take micro-benchmarks with a grain of salt http://kano.net/javabench 41

Just-In-Time Compilation Exemplified by SUN s HotSpot JVM Bytecode fragments are compiled at runtime targeted at the native platform based on runtime profiling customization is possible Offers more opportunities than a static compiler It needs to be fast as it happens at run-time 42

Other Java Bytecode Tools assembler (jasmin) disassembler (javap) decompiler (cavaj, mocha, jad) obfuscator (dozens of these...) analyzer (soot) 43