Jonathan Worthington Scarborough Linux User Group



Similar documents
General Introduction

Interpreters and virtual machines. Interpreters. Interpreters. Why interpreters? Tree-based interpreters. Text-based interpreters

Cloud Computing. Up until now

CSCI E 98: Managed Environments for the Execution of Programs

Parrot in a Nutshell. Dan Sugalski dan@sidhe.org. Parrot in a nutshell 1

The Design of the Inferno Virtual Machine. Introduction

1. Overview of the Java Language

Characteristics of Java (Optional) Y. Daniel Liang Supplement for Introduction to Java Programming

Garbage Collection in the Java HotSpot Virtual Machine

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

System Structures. Services Interface Structure

Chapter 13: Program Development and Programming Languages

Restraining Execution Environments

Fachbereich Informatik und Elektrotechnik SunSPOT. Ubiquitous Computing. Ubiquitous Computing, Helmut Dispert

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

Online Recruitment System 1. INTRODUCTION

Java Virtual Machine: the key for accurated memory prefetching

picojava TM : A Hardware Implementation of the Java Virtual Machine

2 Introduction to Java. Introduction to Programming 1 1

Java Garbage Collection Basics

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

INTRODUCTION TO JAVA PROGRAMMING LANGUAGE

No no-argument constructor. No default constructor found

Rakudo Perl 6 on the JVM. Jonathan Worthington

Chapter 3: Operating-System Structures. Common System Components

Example of Standard API

Programming Languages

The programming language C. sws1 1

Waratek Cloud VM for Java. Technical Architecture Overview

The Java Virtual Machine and Mobile Devices. John Buford, Ph.D. Oct 2003 Presented to Gordon College CS 311

A Practical Method to Diagnose Memory Leaks in Java Application Alan Yu

Monitoring, Tracing, Debugging (Under Construction)

Levels of Programming Languages. Gerald Penn CSC 324

1/20/2016 INTRODUCTION

Chapter 1. Dr. Chris Irwin Davis Phone: (972) Office: ECSS CS-4337 Organization of Programming Languages

Language Evaluation Criteria. Evaluation Criteria: Readability. Evaluation Criteria: Writability. ICOM 4036 Programming Languages

Multi-core Programming System Overview

An Easier Way for Cross-Platform Data Acquisition Application Development

Performance in the Infragistics WebDataGrid for Microsoft ASP.NET AJAX. Contents. Performance and User Experience... 2

Java's garbage-collected heap

Virtual Servers. Virtual machines. Virtualization. Design of IBM s VM. Virtual machine systems can give everyone the OS (and hardware) that they want.

Chapter 3 Operating-System Structures

Mobile Application Languages XML, Java, J2ME and JavaCard Lesson 04 Java

Chapter 13 Computer Programs and Programming Languages. Discovering Computers Your Interactive Guide to the Digital World

Effective Java Programming. efficient software development

Data Centers and Cloud Computing

.NET and J2EE Intro to Software Engineering

CS 3530 Operating Systems. L02 OS Intro Part 1 Dr. Ken Hoganson

Stackato PaaS Architecture: How it works and why.

Firewall Builder Architecture Overview

How To Use Java On An Ipa (Jspa) With A Microsoft Powerbook (Jempa) With An Ipad And A Microos 2.5 (Microos)

Geoff Raines Cloud Engineer

C Compiler Targeting the Java Virtual Machine

Lecture 10: Dynamic Memory Allocation 1: Into the jaws of malloc()

Instrumentation Software Profiling

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding

Chapter 3.2 C++, Java, and Scripting Languages. The major programming languages used in game development.

Topics. Introduction. Java History CS 146. Introduction to Programming and Algorithms Module 1. Module Objectives

Chapter 13: Program Development and Programming Languages

PHP vs. Java. In this paper, I am not discussing following two issues since each is currently hotly debated in various communities:

Java and Real Time Storage Applications

Mission-Critical Java. An Oracle White Paper Updated October 2008

Monitoring and Managing a JVM

02 B The Java Virtual Machine

The Hotspot Java Virtual Machine: Memory and Architecture

Practical Performance Understanding the Performance of Your Application

Suh yun Ki m (KIS T) (KIS suhyunk@.com

The Virtualization Practice

Performance Management for Cloudbased STC 2012

CSC230 Getting Starting in C. Tyler Bletsch

What s Cool in the SAP JVM (CON3243)

Data Centers and Cloud Computing. Data Centers

The C Programming Language course syllabus associate level

Chapter 16: Virtual Machines. Operating System Concepts 9 th Edition

Chapter 2: Remote Procedure Call (RPC)

Testing Automation for Distributed Applications By Isabel Drost-Fromm, Software Engineer, Elastic

Networks and Services

Python, C++ and SWIG

OPERATING SYSTEM SERVICES

How Java Software Solutions Outperform Hardware Accelerators

Memory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation

Virtual Machine Security

Using jvmstat and visualgc to Solve Memory Management Problems

Running SAP HANA One on SoftLayer Bare Metal with SUSE Linux Enterprise Server CAS19256

Programmable Networking with Open vswitch

How to create/avoid memory leak in Java and.net? Venkat Subramaniam

Validating Java for Safety-Critical Applications

language 1 (source) compiler language 2 (target) Figure 1: Compiling a program

CIS570 Modern Programming Language Implementation. Office hours: TDB 605 Levine

9/11/15. What is Programming? CSCI 209: Software Development. Discussion: What Is Good Software? Characteristics of Good Software?

Java Application Development using Eclipse. Jezz Kelway Java Technology Centre, z/os Service IBM Hursley Park Labs, United Kingdom

Chapter 12 Programming Concepts and Languages

A lap around Team Foundation Server 2015 en Visual Studio 2015

Cloud Operating Systems for Servers

Tool - 1: Health Center

A GENERIC PURPOSE, CROSS-PLATFORM, HIGH EXTENSIBLE VIRTUAL MACHINE. John Vlachoyiannis

Memory Profiling using Visual VM

Assignment # 1 (Cloud Computing Security)

SOFTWARE TESTING TRAINING COURSES CONTENTS

Building Docker Cloud Services with Virtuozzo

Transcription:

Jonathan Worthington Scarborough Linux User Group

Introduction

What does a Virtual Machine do? Hides away the details of the hardware platform and operating system. Defines a common set of instructions. Abstracts away operating system details Efficiently translates the virtual instructions to those supported by the hardware CPU. Provides support for high level language constructs (such as subroutines, OOP).

Why Virtual Machines? 1. Simplified software development and deployment. Program 1 Program 2 Compile For Each Platform Compile For Each Platform Without a VM

Why Virtual Machines? 1. Simplified software development and deployment. Program 1 Program 2 Compile to the VM VM VM Supports Each Platform With a VM

Why Virtual Machines? 2. High level languages have a lot in common. Strings, arrays, hashes, references, Subroutines, objects, namespaces, Closures and continuations Memory management Can implement these just once in the VM.

Why Virtual Machines? 3. High level language interoperability becomes easier. A consistent way to call subroutines and methods. A common representation of data types: strings, arrays, objects, etc. Code in multiple languages essentially runs as a single program.

Why Virtual Machines? 4. Can provide fine grained security and quota restrictions. This program can connect to server X, but can not access any local files. 5. Debugging and profiling more easily supported. 6. Possibility of dynamic optimizations by exploiting what is known at runtime but not be known at compile time.

A Few Well Known VMs The JVM (Java Virtual Machine).Net CLR (Common Languages Runtime) Parrot Many things you might not call VMs For example, the Perl 5, or Python, or Ruby interpreter could in many ways be considered a VM; they are just closely tied to the language.

Stack and register architectures

Stack and register machines Most virtual machines, including.net and JVM, are implemented as stack machines. push 17 push 25 add

Stack and register machines Many virtual machines, including.net and JVM, are implemented as stack machines. push 17 17 push 25 add

Stack and register machines Many virtual machines, including.net and JVM, are implemented as stack machines. push 17 push 25 17 25 17 add

Stack and register machines Many virtual machines, including.net and JVM, are implemented as stack machines. push 17 17 push 25 add 25 17 42 +

Stack and register machines Other virtual machines, such as Parrot, use registers. A register is a numbered storage location for holding working data. I0 I1 I2 I3 I4 I5 I6 I7 17 25

Stack and register machines The add instruction in Parrot adds the values stored in two registers and stores the result in a third. add I1, I3, I4 I0 I1 I2 I3 I4 I5 I6 I7 17 25

Stack and register machines The add instruction in Parrot adds the values stored in two registers and stores the result in a third. add I1, I3, I4 I0 I1 I2 I3 I4 I5 I6 I7 17 25 +

Stack and register machines The add instruction in Parrot adds the values stored in two registers and stores the result in a third. add I0, I3, I4 I0 I1 I2 I3 I4 I5 I6 I7 42 17 25 +

Register machine advantages What could be expressed in one register instruction took at least three stack instructions. When interpreting code (rather than JITing more later), there is overhead for mapping each virtual instructions to a real one at runtime, so less instructions is better.

Running virtual machine code

Running Virtual Machine Code There are a number of ways to execute code in the instruction set of the virtual machine on real hardware. Generally, the most portable solution (that works on most platforms) will be the slowest and the fastest ones will be the least portable.

The function per instruction approach Have one C function per instruction. Build a big array of pointers to those functions; array index = instruction code. Execute instructions by looking up the function appropriate in the table then calling it. Completely portable, but performance hit due to making a function call per instruction.

The switch approach A huge switch statement with one case for each instruction. After executing an instruction, the program counter is increment and we jump back to the top of the switch block again (using goto). Performance depends heavily on the code the compiler generates for switch blocks, but no per-op function call overhead is a bonus. Also completely portable.

The computed goto approach GCC allows goto to jump to a memory address computed at runtime rather than a named label like most other compilers! Write C code for each instruction in a single function, prefix it with a label and build a table of label addresses. After executing each instruction, look up the address of the C code for the next instruction using the table and goto that address.

The computed goto approach Computed goto performs better than the previous two approaches, worse than JIT. However, it only works on a small number of compilers, so not very portable. Code that uses computed goto interacts nastily with the C compiler s optimizer basically the optimizer can t do much with it. Tends to mean that the computed goto core takes a lot of time and memory to compile.

What is a JIT compiler? Just In Time means that a chunk of bytecode is compiled when it is needed. Compilation involves translating Parrot bytecode into machine code understood by the hardware CPU. High performance can execute some Parrot instructions with one CPU instruction. Not at all portable custom implementation needed for each type of CPU.

How does JIT work? For each CPU, write a set of macros that describe how to generate native code for the VM instructions. Do not need to write these for every instruction; can fall back on calling the C function function that implements it. A Configure script determines the CPU type and selects the appropriate JIT compiler to build if one is available.

How does JIT work? A chunk of memory is allocated and marked executable if the OS requires this. For each instruction in the chunk of bytecode that is to be translated: If a JIT macro was written for the instruction, use that to emit native code. Otherwise, insert native code to call the C function implementing that method, as an interpreter would.

Memory Management

Memory Management During their execution, programs allocate memory for storing working data in. Often this memory is only used for a short amount of time. There is only a finite amount of memory available to use, so programs need to free up memory that is no longer being used. Traditionally programs did this themselves, e.g. through malloc() and free() in C.

What is GC (Garbage Collection) and why? Garbage collection systems automate the freeing of memory when it is no longer in use. The programmer is no longer responsible for freeing memory meaning: No memory leaks. No chance of accidentally freeing things that are still in use. Faster development.

An Easy Solution: Reference Counting Just one approach to garbage collection, used in Perl 5 and many other interpreters. Every object has a reference count a value that keeps track of the number of variables and other objects that refer to that object. When the reference count reaches zero, there is no way the object could be accessed, so it is no longer in use, therefore it can be freed.

Reference Counting Not Really Easy Very easy to forget to increment or decrement the reference count as needed. VM code littered with reference count manipulation. Circular data structures never get freed as their reference count never reaches zero. A B

Reachability Based GC Initially consider all objects dead (that is, unreachable). A C B E D F

Reachability Based GC Mark any objects that are referenced by registers or on the stack as live. P0 P1 P2 P3 F E A B E C D F

Reachability Based GC Transitively mark objects referenced by live objects as alive. P0 P1 P2 P3 F E A B E C D F

Reachability Based GC Objects that were not marked alive can thus have the memory associated with them freed. A C B

Developing Virtual Machines

Regression Testing No two teams developing a VM are the same, but they all use regression testing. Each time a feature is added to the VM or a bug is found, write some test code that tests the feature or produces the bug. Tests can all be run automatically and their output checked. Breakage or bugs that re-surface will be spotted quickly.

Regression Testing Can also write tests before features are implemented to ensure they work as expected when implemented. Test Driven Development (TDD) Also useful for ensuring that multiple implementations of the same VM produce the same results. Good for Harmony project, implementing open source JVM.

Build Tools Build tools are often used to avoid writing a lot of boring and repetitive code by hand. For example, Parrot implements many different run-cores (function per instruction, switch, computed goto). The code for each instruction is only written once and the function headers, switch block or goto code is generated automatically.

Platform Awareness It s important to try and write code that will run on platforms with Different compilers Different byte order and word order Different system APIs A character encoding other than ASCII Some platforms are really weird.

Getting Involved The first step is to download the source, compile and play with a VM http://www.parrotcode.org/ http://incubator.apache.org/harmony/ Read the docs, play around, find bugs! Report em, or write a patch both are helpful. Who knows where it may lead

The End

Any questions?