Compilation 2012 Domain-Specific Languages and Syntax Extensions

Jan Midtgaard, Michael I. Schwartzbach
Aarhus University

GPL Problem Solving
The General Purpose Language (GPL) approach:
  analyze the problem domain
  express the conceptual model as an OO/FP/... design
  program a framework/library
  express concrete application as framework/library client
Pros:
  predictable and familiar result
  (relatively) low cost of implementation
Cons:
  difficult to fully exploit domain-specific knowledge
  only available to general programmers

DSL Problem Solving
The DSL approach:
  analyze the problem domain
  express the conceptual model as a language design
  implement a compiler or interpreter
Pros:
  possible to exploit all domain-specific knowledge
  also available to domain experts
Cons:
  (relatively) high cost of implementation
  risk of Babylonian confusion
  lack of tool support (IDE, ...)
  hard to combine DSLs or DSL and GPL this way

Variations of DSLs
A stand-alone DSL:
  a novel language with unique syntax and features
  example: LaTeX
An embedded DSL:
  an existing GPL extended with DSL features
  example: JSP
An external DSL:
  a stand-alone DSL invoked from a GPL
  example: SQL invoked from Java (JDBC)

From DSL to GPL
A stand-alone DSL may evolve into a GPL:
  Fortran: Formula Translation
  Algol: Algorithmic Language
  Cobol: Common Business Oriented Language
  Lisp: List Processing Language
  Simula: Simulation Language
  ML: Meta Language
A (successful) DSL design should plan for growth

Using Domain-Specific Knowledge
Domain-specific syntax:
  domain-specific syntax clarifies the behavior
  directly denote high-level concepts
Domain-specific analysis:
  consider global properties of the application
Domain-specific optimization:
  exploit domain-specific analysis results
GPL frameworks cannot provide these benefits

The Ocamlyacc/Menhir Languages
A stand-alone (or external) DSL:
  no general-purpose computing is required
Domain concepts:
  context-free grammars
  tokens / terminals
  non-terminals and productions
Implemented using:
  a lexer + parser (hand-written or ocamllex/ocamlyacc)
  a symbol checker + analysis
  a parse table builder + emitter (menhir contains different table/code/Coq backends)

DSL Syntax for Grammars
    start  : start PLUS term    { ... }
           | start MINUS term   { ... }
           | term               { ... };
    term   : term STAR factor   { ... }
           | term SLASH factor  { ... }
           | factor             { ... };
    factor : ID                 { ... }
           | LPAR start RPAR    { ... };
The BNF syntax closely matches the domain at hand

GPL Alternatives
Parsing can be done in a number of ways:
  hand-written (lexer and) parser (more next week)
  hand-written parser table
  parser combinators
Drawbacks of these alternatives:
  harder to write correctly
  fixed implementation strategy
In contrast, (OCaml)yacc and menhir decouple the language description from the workings of the language parser
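To make the hand-written alternative concrete, here is a minimal sketch of a recursive-descent evaluator for the expression grammar shown earlier, written in C++. It is an illustration under assumptions, not part of the lecture: `Parser` and `evaluate` are hypothetical names, integer literals stand in for the `ID` token, and the left-recursive rules are rewritten as loops, which is exactly the kind of implementation detail the grammar DSL keeps out of sight.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical hand-written parser; integer literals replace the ID token.
class Parser {
public:
    explicit Parser(const std::string& text) : src(text), pos(0) {}

    long parse() {
        long value = start();
        skip_spaces();
        if (pos != src.size()) throw std::runtime_error("trailing input");
        return value;
    }

private:
    std::string src;
    std::size_t pos;

    void skip_spaces() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    }
    bool digit_ahead() const {
        return pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos]));
    }
    bool accept(char c) {
        skip_spaces();
        if (pos < src.size() && src[pos] == c) { ++pos; return true; }
        return false;
    }

    // start : start PLUS term | start MINUS term | term   (left recursion as a loop)
    long start() {
        long value = term();
        for (;;) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }
    // term : term STAR factor | term SLASH factor | factor
    long term() {
        long value = factor();
        for (;;) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                long divisor = factor();
                if (divisor == 0) throw std::runtime_error("division by zero");
                value /= divisor;
            } else {
                return value;
            }
        }
    }
    // factor : ID | LPAR start RPAR   (ID approximated by an integer literal)
    long factor() {
        if (accept('(')) {
            long value = start();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        skip_spaces();
        if (!digit_ahead()) throw std::runtime_error("expected number or '('");
        long value = 0;
        while (digit_ahead()) value = value * 10 + (src[pos++] - '0');
        return value;
    }
};

long evaluate(const std::string& text) { return Parser(text).parse(); }
```

Calling `evaluate("8-3-2")` exercises the left-associative loop in `start`. Changing the parsing strategy, for example to a table-driven one, would mean rewriting this class, whereas a yacc-style description would stay the same.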

DSL Analysis for Grammars
Symbol checking:
  checks non-terminal and terminal names
  checks indexes ($1) for validity (bounds + data)
Menhir also type checks the productions (by type checking the action code)
Analyses the grammar for useless productions (reachability) and removes them
Checks the grammar for LALR/LR(1) conformance
These are checked by phases in the ocamlyacc/menhir compiler
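The reachability check for useless productions can be sketched as a worklist traversal over the grammar. The C++ below is only an illustration of the idea, not how ocamlyacc or menhir implement it; `Grammar`, `Production`, `reachable`, and `example_grammar` are hypothetical names, and the `orphan` rule is added solely to have something unreachable.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical grammar representation: each non-terminal maps to its productions,
// and each production is a sequence of symbol names.
using Production = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Production>>;

// Returns the non-terminals reachable from the start symbol.
// Symbols that are not keys of the grammar are treated as terminals.
std::set<std::string> reachable(const Grammar& grammar, const std::string& start) {
    std::set<std::string> seen;
    std::vector<std::string> worklist;
    if (grammar.count(start)) {
        seen.insert(start);
        worklist.push_back(start);
    }
    while (!worklist.empty()) {
        std::string current = worklist.back();
        worklist.pop_back();
        for (const Production& production : grammar.at(current)) {
            for (const std::string& symbol : production) {
                // Newly discovered non-terminals are queued exactly once.
                if (grammar.count(symbol) && seen.insert(symbol).second) {
                    worklist.push_back(symbol);
                }
            }
        }
    }
    return seen;
}

// The expression grammar from the earlier slide plus one unreachable rule.
Grammar example_grammar() {
    Grammar g;
    g["start"] = {Production{"start", "PLUS", "term"},
                  Production{"start", "MINUS", "term"},
                  Production{"term"}};
    g["term"] = {Production{"term", "STAR", "factor"},
                 Production{"term", "SLASH", "factor"},
                 Production{"factor"}};
    g["factor"] = {Production{"ID"}, Production{"LPAR", "start", "RPAR"}};
    g["orphan"] = {Production{"ID", "orphan"}};  // never referenced from start
    return g;
}
```

Every non-terminal missing from `reachable(example_grammar(), "start")` owns only useless productions and could be reported or removed.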

GPL Analysis Alternative
Lots of yellow Post-it notes.
These cannot (all) be checked by a GPL compiler, e.g., OCaml or Java.

The JWIG Language
An embedded DSL (in Java):
  lots of general-purpose computing is required
Domain concepts:
  XML templates
  Web services
  sessions
Implemented using:
  a syntax extension
  a static analysis
  a framework

DSL Syntax for JWIG
    public class test extends Service {
      String userid;
      public class Login extends Session {
        XML wrap = [[<html>
                       <body bgcolor="yellow">
                         <[contents]>
                       </body>
                     </html>]];
        public void main() {
          XML login = [[<form>
                          Userid: <input type="text" name="userid">
                          <input type="submit"/>
                        </form>]];
          show wrap<[contents = login];
          userid = receive userid;
          show wrap<[contents = "Welcome "+userid];
        }
      }
    }

GPL Syntax Alternative
    XML login = XML.make("<form>\nUserid: <input type=\"text\" name=\"userid\">\n<input type=\"submit\"/>\n</form>");
    show(wrap.plug("contents",login));
    userid = receive("userid");
The DSL syntax maps directly to method calls in an underlying Java framework
Avoiding escapes makes the syntax more legible
But this is just a thin layer of syntactic sugar
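The legibility point about escapes can be seen in any language that offers both escaped and unescaped string forms. As a loose analogy (not JWIG, and not part of the lecture), the C++11 sketch below builds the same form markup once with escapes and once as a raw string literal; `escaped_form`, `raw_form`, and `same_markup` are hypothetical names.

```cpp
#include <string>

// The form markup written with escape sequences, as in the Java alternative.
const std::string escaped_form =
    "<form>\nUserid: <input type=\"text\" name=\"userid\">\n<input type=\"submit\"/>\n</form>";

// The same markup as a C++11 raw string literal: no escapes, line breaks kept as written.
const std::string raw_form = R"(<form>
Userid: <input type="text" name="userid">
<input type="submit"/>
</form>)";

// Both spellings denote the same characters.
bool same_markup() { return escaped_form == raw_form; }
```

A raw string only removes the escapes. Unlike the JWIG template syntax, it gives the compiler no knowledge that the text is XML, so none of the DSL analyses apply to it.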

DSL Analysis for JWIG
A static analysis that guarantees at compile time:
  only well-formed and valid XML is ever generated
  only existing form fields are ever received
  only existing gaps are ever plugged
This is a DSL analysis that is performed on the resulting compiled class files

JWIG Implementation Model
[Figure: JWIG syntax -> jwigc -> Java syntax -> javac -> .class files -> jwiga -> analysis results; the JWIG framework is also part of the diagram]

Syntax Extensions
Programmers may want to extend the syntax of their programming language:
  introduce domain-specific syntax
  abbreviate common idioms
  define language extensions
  ensure consistency
Such extensions are introduced through macros

Macros
Macros are as old as programming
They are used as an orthogonal abstraction mechanism
Two different flavors:
  lexical macros
  syntactic macros
Dictionary entry:
  Main Entry: 2 macro
  Pronunciation: 'ma-(")kro
  Function: noun
  Inflected Form(s): plural macros
  Etymology: short for macroinstruction
  Date: 1959
  a single computer instruction that stands for a sequence of operations

Lexical Macros
Operate on sequences of tokens
Are handled by a preprocessor
Are independent of the host language syntax
Examples:
  CPP
  TeX

CPP - The C Preprocessor
Integrated into C compilers
Also works as a stand-alone expander
Intercepts directives such as:
    #define
    #undef
    #ifdef
    #if
    #include

Lexical Macro Example
CPP macro to square a number:
    #define square(X) X * X
    square(z + 1)  =>  z + 1 * z + 1

Lexical Macro Example
CPP macro to square a number:
    #define square(X) X * X
    square(z + 1)  =>  z + (1 * z) + 1
Adding parentheses as a hack:
    #define square(X) (X) * (X)
    square(z + 1)  =>  (z + 1)*(z + 1)
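The precedence problem can be checked directly in C++, which shares the C preprocessor. The sketch below uses illustrative macro names (`SQUARE_NAIVE`, `SQUARE_ARGS`, `SQUARE_FULL`) that do not appear in the lecture. The third variant also wraps the whole body in parentheses, because the hack with parenthesized arguments still misbehaves when the call sits next to an operator of equal or higher precedence.

```cpp
#include <cassert>

// Three variants of the squaring macro (names are illustrative).
#define SQUARE_NAIVE(X) X * X          // the original definition
#define SQUARE_ARGS(X) (X) * (X)       // the hack: parenthesized arguments
#define SQUARE_FULL(X) ((X) * (X))     // additionally parenthesizes the result

// Expands to z + 1 * z + 1, i.e. z + (1 * z) + 1.
int naive_square_of_z_plus_1(int z) { return SQUARE_NAIVE(z + 1); }

// Expands to (z + 1) * (z + 1).
int args_square_of_z_plus_1(int z) { return SQUARE_ARGS(z + 1); }

// Expands to 100 / (n) * (n), which groups as (100 / n) * n.
int hundred_over_args_square(int n) { return 100 / SQUARE_ARGS(n); }

// Expands to 100 / ((n) * (n)).
int hundred_over_full_square(int n) { return 100 / SQUARE_FULL(n); }
```

Even the fully parenthesized variant evaluates its argument twice, so an argument with side effects such as `i++` remains a problem that parentheses cannot fix.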

Parsing Problem
    #define swap(X,Y) { int t=X; X=Y; Y=t; }
    if (a > b) swap(a,b); else b=0;
    *** test.c:3: parse error before 'else'
The expansion is a block followed by ';', so the if statement ends before the 'else'

Parsing Problem Hack
    #define swap(X,Y) { int t=X; X=Y; Y=t; }
    if (a > b) swap(a,b); else b=0;
    *** test.c:3: parse error before 'else'
Wrapping the body in do ... while (0) makes the expansion a single statement:
    #define swap(X,Y) do { int t=X; X=Y; Y=t; } while (0)
    if (a > b) swap(a,b); else b=0;
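A compilable version of the do/while(0) variant, as a minimal sketch. `SWAP_INT` and `second_after` are hypothetical names, and the arguments are additionally parenthesized, which the lecture's version does not do.

```cpp
#include <cassert>

// The do { ... } while (0) wrapper turns the body into one statement that still
// needs the trailing ';' written at the call site.
#define SWAP_INT(X, Y) do { int t = (X); (X) = (Y); (Y) = t; } while (0)

// Mirrors the if/else from the example: swap when a > b, otherwise reset b.
// Returns the resulting value of b.
int second_after(int a, int b) {
    if (a > b)
        SWAP_INT(a, b);   // one statement, so the following 'else' still parses
    else
        b = 0;
    return b;
}
```

With the plain `{ ... }` body the same function would be rejected, because the `;` after the expanded block terminates the `if` before `else` is reached.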

Expansion Time
    #define A 87
    #define B A
    #undef A
    #define A 42
    B  =>  ???
Eager expansion (definition time):
    B  =>  87
Lazy expansion (invocation time):
    B  =>  A  =>  42
CPP is lazy
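The lazy behavior can be confirmed with a compile-time check. This is a minimal C++11 sketch; `value_of_B` is a hypothetical name, and the macros are undefined again at the end only to keep the single-letter names from leaking into other code.

```cpp
#define A 87
#define B A        // B's replacement list is the token A, not the number 87
#undef A
#define A 42

// B is expanded here, at the point of use: B => A => 42.
constexpr int value_of_B = B;

#undef A
#undef B

// Fails to compile if the preprocessor had expanded B eagerly to 87.
static_assert(value_of_B == 42, "CPP expands macros lazily, at invocation time");
```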

Expansion Order
    #define id(X) X
    #define one(X) id(X)
    #define two a,b
    one(two)  =>  ???
Inner ("call-by-value"):
    one(two)  =>  one(a,b)  =>  *** arity error 'one'
Outer ("call-by-name"):
    one(two)  =>  id(two)  =>  two  =>  a,b

Expansion Order in CPP
CPP uses a pragmatic "argument prescan":
    one(two)  =>  id(a,b)  =>  *** arity error 'id'
Useful for composing macros:
    #define succ(X) ((X)+1)
    #define call7(X) X(7)
    call7(succ)  =>  succ(7)  =>  ((7)+1)
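The macro-composition case compiles as ordinary C++, so it can be checked with a constant. In this sketch `eight` is a hypothetical name, and the failing `one(two)` case is shown only as a comment because it does not compile.

```cpp
#define succ(X) ((X) + 1)
#define call7(X) X(7)

// The argument succ is a function-like macro name without parentheses, so the
// prescan leaves it alone. After substitution the result succ(7) is rescanned
// and expands to ((7) + 1).
constexpr int eight = call7(succ);

static_assert(eight == 8, "call7(succ) => succ(7) => ((7) + 1)");

// By contrast, with
//   #define id(X) X
//   #define one(X) id(X)
//   #define two a,b
// the call one(two) is rejected: two is prescanned to a,b and id then receives
// two arguments.
```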

Recursive Expansion
    #define x 1+x
    x  =>  ???
Definition time:
    *** recursive definition
Invocation time:
    x  =>  1+x  =>  1+1+x  =>  1+1+1+x  =>  ...

Recursive Expansion in CPP
CPP uses a pragmatic "intercept-and-ignore":
    int x = 2;
    #define x 1+x
    x  =>  1+x
Maintain a stack of macro invocations
Ignore invocations of macros already on the stack
At runtime the value of x is 3
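The intercept-and-ignore rule can be observed in a small function. This is a sketch with a hypothetical function name; the macro is defined after the variable declaration, because otherwise the declaration itself would be rewritten.

```cpp
#include <cassert>

// Demonstrates that a self-referential macro is expanded once and the inner
// occurrence of its own name is left untouched.
int self_reference_demo() {
    int x = 2;
#define x 1 + x         // self-referential definition
    int result = x;     // expands to: 1 + x, where the inner x is the variable
#undef x
    return result;
}
```

The function returns 3, which matches the runtime value stated above.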

TeX Macros
    \def \vector #1[#2..#3] { $({#1}_{#2},\ldots,{#1}_{#3})$ }
    \vector \phi[0..n-1]  =>  $({\phi}_{0},\ldots,{\phi}_{n-1})$
Flexible invocation syntax
Parsing ambiguities (chooses shortest invocation)
Expansion is lazy and outer
Recursion is permitted (conditions allowed)

Syntactic Macros
Operate on sequences of ASTs
Are handled by the parser
Are integrated with the host language syntax
Examples:
  C++ templates
  Jakarta Tool Suite

C++ Templates
Are integrated into C++ compilers
Are intended as a genericity mechanism
But are often used as a macro language
Macros accept ASTs for:
  identifiers
  constants
  types
The result is always an AST for a declaration

Syntactic Macro Example
    template <class T>
    T GetMax(T x, T y) { return (x>y?x:y); }

    int i,j;
    max = GetMax <int> (i,j);
Template bodies are parsed at definition time (unlike CPP macros)
Templates are syntactically expanded
Heavy use of templates yields bloated code (unlike Java generics that are not macros)
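A self-contained version of the template example, as a sketch. The wrapper functions `max_int` and `max_double` are hypothetical additions that show explicit and deduced instantiation. Each distinct type argument produces its own expanded copy of `GetMax`, which is the source of the code bloat mentioned above.

```cpp
#include <string>

// The template from the example: one definition, expanded once per type argument.
template <class T>
T GetMax(T x, T y) { return (x > y ? x : y); }

// Explicit instantiation request, as in the example.
int max_int(int i, int j) { return GetMax<int>(i, j); }

// The type argument can also be deduced from the call.
double max_double(double a, double b) { return GetMax(a, b); }
```

The same definition also works for any type with an `operator>`, for example `GetMax<std::string>("abc", "abd")`.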

Metaprogramming
C++ templates:
  perform compile-time constant folding of arguments
  allow multiple template definitions and pattern matching
This combination enables metaprogramming:
  Turing-complete computations during compilation
Template libraries exist for:
  booleans
  control structures
  functions
  variables
  data structures
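As an illustration of "multiple template definitions and pattern matching" used as a control structure, here is a minimal sketch of a compile-time conditional. The names `If` and `AtLeast32` are hypothetical and not taken from any particular template library; the standard library offers the equivalent `std::conditional`.

```cpp
#include <type_traits>

// Primary template: used when the condition is true.
template <bool Cond, class Then, class Else>
struct If { typedef Then type; };

// Partial specialization: pattern-matches on a false condition.
template <class Then, class Else>
struct If<false, Then, Else> { typedef Else type; };

// A compile-time "if": choose int when it has at least 4 bytes, otherwise long.
typedef If<(sizeof(int) >= 4), int, long>::type AtLeast32;

static_assert(std::is_same<If<true, int, char>::type, int>::value, "true selects Then");
static_assert(std::is_same<If<false, int, char>::type, char>::value, "false selects Else");
```

The compiler selects the most specialized matching definition, so the choice between the two branches is made entirely during compilation.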

Metaprogramming Example
    template <int X, int Y>
    struct pow { static const int n = X*pow<X,Y-1>::n; };

    template <int X>
    struct pow<X,0> { static const int n = 1; };

    const int z = pow<5,3>::n;
The value 125 is assigned to z at compile time
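That the value really is computed during compilation can be verified with `static_assert`. The sketch below renames the template to `Pow` (an assumption, made to avoid clashing with the `pow` function from `<cmath>`) and, for comparison, adds a C++11 `constexpr` function. The `constexpr` version is a different technique from the template recursion shown above and is not part of the lecture.

```cpp
// Template recursion as in the example, renamed to avoid the <cmath> pow function.
template <int X, int Y>
struct Pow { static const int n = X * Pow<X, Y - 1>::n; };

// The specialization for Y == 0 terminates the recursion.
template <int X>
struct Pow<X, 0> { static const int n = 1; };

// Alternative technique (C++11): an ordinary-looking function usable at compile time.
constexpr int cpow(int x, int y) { return y == 0 ? 1 : x * cpow(x, y - 1); }

// Both checks are evaluated by the compiler, not at run time.
static_assert(Pow<5, 3>::n == 125, "template recursion yields 125");
static_assert(cpow(5, 3) == 125, "constexpr function yields 125");

// An array bound must be a constant expression, so this only compiles because
// Pow<2, 3>::n is known during compilation.
int table[Pow<2, 3>::n];
```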

Metaprogramming for Specialization
    template <int I>
    inline float dot(float *a, float *b) { return dot<I-1>(a,b) + a[I]*b[I]; }

    template <>
    inline float dot<0>(float *a, float *b) { return a[0]*b[0]; }

    float x[3], y[3];
    float z = dot<2>(x,y);
    =>  float z = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];
The overhead of control structures is removed

Jakarta Tool Suite
JTS extends Java with simple syntactic macros
Macros accept ASTs for:
  AST_QualifiedName
  AST_Exp
  AST_Stm
  AST_FieldDecl
  AST_Class
  AST_TypeName
The result is an AST specified as:
    exp{ ... }exp
    stm{ ... }stm
    mth{ ... }mth
    cls{ ... }cls

Hygienic Macros
    macro swap(AST_QualifiedName x, AST_QualifiedName y)
    local temp
    stm{ int temp = x; x = y; y = temp; }stm

    int temp = 42;
    int tump = 87;
    #swap(temp,tump);
Potential name clash problem:
    int temp = temp; temp = tump; tump = temp;
But local names are renamed uniquely:
    int temp143 = temp; temp = tump; tump = temp143;
Hygienic macros are available in Scheme and in various macro extensions of Java, such as JSE, ...
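CPP offers no hygiene, but the renaming idea can be approximated by generating the temporary's name from the line number. The following is a sketch of that workaround under stated assumptions, not a feature of JTS or of the preprocessor: all macro and function names are hypothetical, and the scheme still breaks if user code happens to use a name of the form `swap_tmp_<line>`.

```cpp
#include <utility>

// Two-level concatenation so that __LINE__ is expanded before pasting.
#define CONCAT_INNER(A, B) A##B
#define CONCAT(A, B) CONCAT_INNER(A, B)

// The temporary's name is passed in explicitly ...
#define SWAP_INT_WITH(X, Y, TMP) do { int TMP = (X); (X) = (Y); (Y) = TMP; } while (0)
// ... and generated per call site, e.g. swap_tmp_17, mimicking temp143 above.
#define SWAP_INT(X, Y) SWAP_INT_WITH(X, Y, CONCAT(swap_tmp_, __LINE__))

// Swaps variables that are themselves called temp and tump, as in the example.
std::pair<int, int> swapped_pair() {
    int temp = 42;
    int tump = 87;
    SWAP_INT(temp, tump);   // a temporary named temp would capture the argument
    return std::make_pair(temp, tump);
}
```

A truly hygienic macro system performs this renaming automatically and guarantees uniqueness, whereas here it depends on a naming convention.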