EMSCRIPTEN - COMPILING LLVM BITCODE TO JAVASCRIPT (?!)

ALON ZAKAI (MOZILLA) @kripken

JavaScript..? At the LLVM developer's conference..?

Everything compiles into LLVM bitcode
The web is everywhere, and runs JavaScript
Compiling LLVM bitcode to JavaScript lets us run ~everything, everywhere

THIS WORKS TODAY!
Game engines like Unreal Engine 3
Programming languages like Lua
Libraries too: like Bullet

Of course, usually native builds are best
But imagine, for example, that you wrote a new feature in clang and want to let people give it a quick test
Build once to JS, and just give people a URL
(and that's not theoretical)

OK, HOW DOES THIS WORK?

LLVM VS. JAVASCRIPT
Random (unrelated) code samples from each:

    %r = load i32* %p
    %s = shl i32 %r, 16
    %t = call i32 @calc(i32 %r, i32 %s)
    br label %next

    var x = new MyClass('name', 5).chain(function(arg) {
      if (check(arg)) domore({ x: arg, y: [1,2,3] });
      else throw 'stop';
    });

What could be more different? ;)

NUMERIC TYPES
LLVM: i8, i16, i32, float, double
JS: double
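The gap between the two columns can be bridged with coercions. The sketch below is illustrative only (the function names are made up, and this is not Emscripten output); it shows how LLVM-style integer and float semantics can be recovered from JS doubles.

```javascript
// JS numbers are doubles; bitwise operators coerce operands to 32-bit integers.
function addI32(a, b) { return (a + b) | 0; }      // wraps like LLVM `add i32`
function mulI32(a, b) { return Math.imul(a, b); }  // 32-bit multiply without precision loss
function truncI8(x) { return (x << 24) >> 24; }    // keep the low 8 bits, sign-extended
function toF32(x) { return Math.fround(x); }       // round a double to float precision

console.log(addI32(2147483647, 1)); // -2147483648 (wraps around)
console.log(truncI8(255));          // -1
```

Plain `a * b` would not be safe for `mulI32`, since the double product of two large 32-bit values can exceed 2^53 and lose its low bits.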

PERFORMANCE MODEL
LLVM: types and ops map ~1:1 to CPU
JS: virtual machine (VM), just-in-time (JIT) compilers w/ type profiling, garbage collection, etc.

CONTROL FLOW
LLVM: Functions, basic blocks & branches
JS: Functions, ifs and loops (no goto!)

VARIABLES
LLVM: Local vars have function scope
JS: Local vars have function scope
Ironic, actually: many wish JS had block scope, like most languages...

OK, HOW DO WE GET AROUND THESE ISSUES?

// LLVM IR

    define i32 @func(i32* %p) {
      %r = load i32* %p
      %s = shl i32 %r, 16
      %t = call i32 @calc(i32 %r, i32 %s)
      ret i32 %t
    }

⇒ Emscripten ⇒

// JS

    function func(p) {
      var r = HEAP[p];
      return calc(r, r << 16);
    }

Almost direct mapping in many cases
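To try the generated function, it needs a heap and a `calc`. The sketch below supplies both as assumptions: `HEAP` is taken to be a single `Int32Array` indexed in 32-bit cells, and `calc` is a hypothetical stand-in for the external `@calc` in the IR.

```javascript
// Assumption: memory is one Int32Array and `p` indexes 32-bit cells directly.
const HEAP = new Int32Array(16);

// Hypothetical stand-in for the external @calc from the IR.
function calc(r, s) { return (r + s) | 0; }

// Same shape as the generated code on the slide.
function func(p) {
  var r = HEAP[p];          // %r = load i32* %p
  return calc(r, r << 16);  // %s = shl, %t = call, ret
}

HEAP[3] = 5;
console.log(func(3)); // 327685, i.e. 5 + (5 << 16)
```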

Another example:

    float array[5000]; // C++
    int main() {
      for (int i = 0; i < 5000; ++i) {
        array[i] += 1.0f;
      }
    }

⇒ Emscripten ⇒

    var g = new Float32Array(5000); // JS
    function main() {
      var a = 0, b = 0;
      do {
        a = b << 2;
        g[a >> 2] = +g[a >> 2] + 1.0;
        b = b + 1 | 0;
      } while ((b | 0) < 5000);
    }

(this "style" of code is a subset of JS called asm.js)
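For context, asm.js code normally lives inside a module function marked with a `"use asm"` directive and reads memory through typed-array views over one buffer. The following is a hand-written sketch of that packaging, not actual Emscripten output; the names `ArrayModule` and `HEAPF32` and the 64 KiB buffer size are assumptions. An engine that does not recognize the directive simply runs it as ordinary JS.

```javascript
// Hand-written sketch of an asm.js-style module (names are illustrative).
function ArrayModule(stdlib, foreign, buffer) {
  "use asm";
  var HEAPF32 = new stdlib.Float32Array(buffer);

  function main() {
    var a = 0, b = 0;
    do {
      a = b << 2;                                // byte offset of element b
      HEAPF32[a >> 2] = +HEAPF32[a >> 2] + 1.0;  // load as double, add, store
      b = (b + 1) | 0;                           // keep b a 32-bit integer
    } while ((b | 0) < 5000);
  }

  return { main: main };
}

// The "C array" occupies the first 5000 floats of the buffer.
const buffer = new ArrayBuffer(0x10000);
const mod = ArrayModule(globalThis, {}, buffer);
mod.main();
console.log(new Float32Array(buffer)[4999]); // 1
```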

JS AS A COMPILATION TARGET
JS began as a slow interpreted language
Competition ⇒ type-specializing JITs
Those are very good at statically typed code
LLVM compiled through Emscripten is exactly that, so it can be fast

SPEED: MORE DETAIL
(x+1)|0 ⇒ 32-bit integer + in modern JS VMs
Loads in LLVM IR become reads from a typed array in JS, which become reads in machine code
Emscripten's memory model is identical to LLVM's (flat, C-like, aliasing, etc.), so it can use all LLVM opts
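The flat, aliasing memory model can be sketched with typed-array views that share one buffer. The view names and buffer size below are chosen for illustration and are assumptions, not something the slides specify.

```javascript
// One flat buffer, two views: the same bytes read at different widths.
const buffer = new ArrayBuffer(64);
const HEAP8 = new Int8Array(buffer);
const HEAP32 = new Int32Array(buffer);

// Store four identical bytes at byte address 8, then load them as one i32.
// (Identical bytes keep the result independent of host endianness.)
for (let i = 0; i < 4; i++) HEAP8[8 + i] = 0x11;
const word = HEAP32[8 >> 2];     // byte address -> i32 index

console.log(word.toString(16));  // 11111111
```

Because both views alias the same storage, a C-style pointer cast needs no copying: only the view and the shift applied to the address change.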

BENCHMARKS (VMs and Emscripten from Oct 28th 2013, run on 64-bit Linux)

Open source (MIT/LLVM)
Began in 2010
Most of the codebase is not the core compiler, but libraries + toolchain + test suite

LLVM IR → Emscripten Compiler → JS → Emscripten Optimizer → JS
Compiler and optimizer written mostly in JS
Wait, that's not an LLVM backend..?

3 JS COMPILERS, 3 DESIGNS
Mandreel: Typical LLVM backend, uses tblgen, selection DAG (like x86, ARM backends)
Duetto: Processes LLVM IR in llvm::Module (like C++ backend)
Emscripten: Processes LLVM IR in assembly

EMSCRIPTEN'S CHOICE
JS is such an odd target ⇒ wanted an architecture with maximal flexibility in codegen
Helped prototype & test many approaches

DOWNSIDES TOO
Emscripten currently must do its own legalization (are we doing it wrong? probably...)
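As an illustration of what legalization involves for a JS target: a JS double cannot hold every 64-bit integer exactly, so an i64 operation has to be expressed with narrower pieces. The sketch below adds two 64-bit values held as 32-bit halves. The function name and the calling convention are assumptions for illustration, not Emscripten's actual scheme.

```javascript
// Add two 64-bit integers, each passed as (low, high) 32-bit halves.
// Returns [low, high] with `low` unsigned and `high` signed.
function add64(lo1, hi1, lo2, hi2) {
  const lo = ((lo1 >>> 0) + (lo2 >>> 0)) >>> 0;  // low word, modulo 2^32
  const carry = lo < (lo1 >>> 0) ? 1 : 0;        // low word wrapped => carry out
  const hi = (hi1 + hi2 + carry) | 0;            // high word, wraps like i32
  return [lo, hi];
}

const sum = add64(0xFFFFFFFF, 0, 1, 0);
console.log(sum[0], sum[1]); // 0 1
```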

OPTIMIZING JS
Emscripten has 3 optimizations we found are very important for JS
Whatever the best architecture is, it should be able to implement those
Let's go over them now

1. RELOOP

    block0:
      ; code0
      br i1 %cond, label %block0, label %block1
    block1:
      ; code1
      br label %block0

Without relooping (emulated gotos):

    var label = 0;
    while (1) switch (label) {
      case 0:
        // code0
        label = cond ? 0 : 1; break;
      case 1:
        // code1
        label = 0; break;
    }

1. RELOOP

    block0:
      ; code0
      br i1 %cond, label %block0, label %block1
    block1:
      ; code1
      br label %block0

With relooping:

    while (1) {
      do {
        // code0
      } while (cond);
      // code1
    }

1. RELOOP
Relooping allows the JS VM to optimize better, as it can understand control flow
Emscripten Relooper code is generic, written in C++, and used by other projects (e.g., Duetto)
This one seems like it could work in any architecture, in an LLVM backend or not
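The two shapes can be checked against each other by running them. The control-flow graph on the slides never exits, so the sketch below adds a step limit so that it terminates. The limit, the concrete `cond` (`n % 3 !== 0`), and the trace array are all assumptions made for the demonstration.

```javascript
// Emulated gotos: a `label` variable and a switch inside a loop.
function runSwitch(limit) {
  const trace = [];
  let n = 0, label = 0;
  while (trace.length < limit) switch (label) {
    case 0: trace.push(0); n++;                 // code0
      label = (n % 3 !== 0) ? 0 : 1; break;     // cond ? block0 : block1
    case 1: trace.push(1);                      // code1
      label = 0; break;
  }
  return trace;
}

// Relooped: the same blocks expressed with structured loops.
function runRelooped(limit) {
  const trace = [];
  let n = 0;
  outer: while (1) {
    do {
      if (trace.length >= limit) break outer;   // step limit (demo only)
      trace.push(0); n++;                       // code0
    } while (n % 3 !== 0);                      // cond
    if (trace.length >= limit) break;           // step limit (demo only)
    trace.push(1);                              // code1
  }
  return trace;
}

console.log(runSwitch(10).join(""));   // 0001000100
console.log(runRelooped(10).join("")); // 0001000100
```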

2. EXPRESSIONIZE

    var a = g(x);
    var b = a + y;
    var c = HEAP[b];
    var d = HEAP[20];
    var e = x + y + z;
    var f = h(d, e);
    FUNCTION_TABLE[c](f);

⇒

    FUNCTION_TABLE[HEAP[g(x) + y]](h(HEAP[20], x + y + z));

2. EXPRESSIONIZE
Improves JIT time and execution speed: fewer variables ⇒ less stuff for JS engines to worry about
Reduces code size
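The transformation is only valid if the folded expression evaluates its side effects in the same order as the original statements. The sketch below checks that for the slide's example. The bodies of `g` and `h`, the table entries, and the heap contents are made up for illustration; only the two function shapes come from the slide.

```javascript
// Hypothetical environment: names follow the slide, values are made up.
const HEAP = new Int32Array(32);
const calls = [];                                  // records call order
function g(x) { calls.push("g"); return x * 2; }
function h(d, e) { calls.push("h"); return d + e; }
const FUNCTION_TABLE = [(f) => f + 100, (f) => f * 2];

function before(x, y, z) {
  var a = g(x);
  var b = a + y;
  var c = HEAP[b];
  var d = HEAP[20];
  var e = x + y + z;
  var f = h(d, e);
  return FUNCTION_TABLE[c](f);
}

function after(x, y, z) {
  return FUNCTION_TABLE[HEAP[g(x) + y]](h(HEAP[20], x + y + z));
}

HEAP[5] = 1; HEAP[20] = 7;
console.log(before(2, 1, 3), after(2, 1, 3)); // 26 26
```

In both versions `g` runs before `h`, because JS evaluates the callee expression before the arguments.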

3. REGISTERIZE

    var a = g(x) | 0; // integers
    var b = a + y | 0;
    var c = HEAP[b] | 0;
    var d = +HEAP[20]; // double

⇒

    var a = g(x) | 0;
    a = a + y | 0;
    a = HEAP[a] | 0;
    var d = +HEAP[20];

3. REGISTERIZE
Looks like regalloc, but the goal is different: minimize # of total variables (in each type), not spills
JS VMs will do regalloc; only they know the actual # of registers
Benefits code size & speed, like expressionize
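The idea of reusing a variable name once its previous value is dead can be sketched as a toy pass over straight-line code. Everything here is an assumption for illustration: the instruction format, the `registerize` function, and the generated names `i0`/`d0` are invented, and Emscripten's real pass works on a JS AST and handles control flow.

```javascript
// Toy registerize pass over straight-line code (hypothetical IR).
// Each instruction defines `dest` with a `type` and reads the variables in `uses`.
function registerize(instrs) {
  // Index of the last instruction that reads each variable
  // (or its defining instruction, if it is never read).
  const lastUse = new Map();
  const typeOf = new Map();
  instrs.forEach((ins, i) => {
    if (!lastUse.has(ins.dest)) lastUse.set(ins.dest, i);
    typeOf.set(ins.dest, ins.type);
    for (const u of ins.uses) lastUse.set(u, i);
  });

  const assigned = new Map();               // original variable -> shared name
  const free = { int: [], double: [] };     // reusable names, per type
  const count = { int: 0, double: 0 };
  instrs.forEach((ins, i) => {
    // Sources that die here release their name before dest is chosen,
    // which is what lets `b = a + y` become `a = a + y`.
    for (const u of new Set(ins.uses)) {
      if (assigned.has(u) && lastUse.get(u) === i) free[typeOf.get(u)].push(assigned.get(u));
    }
    const name = free[ins.type].pop() ?? (ins.type[0] + count[ins.type]++);
    assigned.set(ins.dest, name);
    if (lastUse.get(ins.dest) === i) free[ins.type].push(name);  // never read again
  });
  return assigned;
}

// The four statements from the slide; x and y are parameters and keep their names.
const names = registerize([
  { dest: "a", type: "int", uses: ["x"] },
  { dest: "b", type: "int", uses: ["a", "y"] },
  { dest: "c", type: "int", uses: ["b"] },
  { dest: "d", type: "double", uses: [] },
]);
console.log([...names]); // a, b and c share i0; d gets d0
```

Names are pooled per type, so the double `d` never shares a name with the integers, matching the "in each type" constraint.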

OPTS SUMMARY
Expressionize & registerize require precise modelling of JS semantics (and order of operations is in some cases surprising!)
Is there a nice way to do these opts in an LLVM backend, or do we need a JS AST?
Questions: Should Emscripten change how it interfaces with LLVM? What would LLVM like upstreamed?

CONCLUSION
LLVM bitcode can be compiled to JavaScript and run in all browsers, at high speed, in a standards-compliant way
For more info, see emscripten.org
Feedback & contributions always welcome
Thank you for listening!