Textual Modeling Languages
Slides 4-31 and 38-40 of this lecture are reused from the Model Engineering course at TU Vienna with the kind permission of Prof. Gerti Kappel (head of the Business Informatics Group).
24 May 2012, Real-Time Systems Lab, Prof. Dr. Andy Schürr, Dr. Gergely Varró
Workflow Overview (slide 2)
Text comprehension + transformation
- Text
- Lexing -> token stream (tokens = lexical units; comments and whitespace removed)
- Parsing -> concrete syntax tree (syntactical units, e.g., for loop, if statement)
- Abstraction -> abstract syntax tree (AST)
- Transformation -> model
Editor specification
- How to provide useful feedback to the user in a GUI
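The lexing stage of this workflow can be sketched in a few lines of Python. This is a minimal illustration and not part of the lecture: the token class names (`COMMENT`, `WS`, `BRACKETS`, `SYMBOL`, `WORD`) and the `lex` function are assumptions, and the keyword set is taken from the Entity DSL used later in the slides.

```python
import re

# Token classes (illustrative names). Order matters: comments are tried first.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),               # multi-line comment, dropped
    ("WS", r"\s+"),                          # whitespace, dropped
    ("BRACKETS", r"\[\]"),                   # multiplicity marker
    ("SYMBOL", r"[{}:]"),                    # scope borders and separator
    ("WORD", r"[A-Za-z_][A-Za-z_0-9]*"),     # keyword or identifier
]
KEYWORDS = {"type", "entity", "extends", "property"}
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def lex(text):
    """Turn raw text into a list of (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, value = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("COMMENT", "WS"):
            continue  # comments and whitespace never reach the parser
        if kind == "WORD":
            kind = "KEYWORD" if value in KEYWORDS else "ID"
        tokens.append((kind, value))
    return tokens
```

The parser then works on this token stream only, which is why comments and layout do not show up in the concrete syntax tree.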
Table of Contents (slide 3)
- EBNF
- Xtext
- ANTLR + M2M transformation
- Online demo
Language Specification Basics (slide 4)
Extended Backus-Naur Form (EBNF)
- Originally introduced by Niklaus Wirth to specify the syntax of Pascal
- In general, it can be used to specify any context-free grammar
- ISO standard
Fundamental assumption
- A text consists of a sequence of terminal symbols (visible characters)
- EBNF specifies all valid terminal symbol sequences using a compact and finite representation: a grammar
Grammar
- Terminal symbols (lexical units)
- Non-terminal symbols (syntactical units)
- Starting non-terminal symbol (root of the syntax tree)
- Production rules, each consisting of a left-hand side non-terminal and a right-hand side (valid symbol sequences)
EBNF (slide 5)
Production rules consist of:
- Terminal
- NonTerminal
- Choice
- Optional
- Repetition
- Grouping
- Comment
Example: Entity DSL (slide 6)
Language constructs
- Arbitrary character sequences
- Keywords
- Separation characters
- Scope borders
- References
Example model

    type String
    type Boolean

    entity Conference {
      property name : String
      property attendees : Person[]
      property speakers : Speaker[]
    }

    entity Person {
      property name : String
    }

    entity Speaker extends Person {
    }
Example: Entity DSL II. (slide 7)
EBNF grammar

    Model      := Type*;
    Type       := SimpleType | Entity;
    SimpleType := 'type' Identifier;
    Entity     := 'entity' Identifier ('extends' Identifier)? '{' Property* '}';
    Property   := 'property' Identifier ':' Identifier ('[]')?;
    Identifier := ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;
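To make the grammar concrete, here is a hand-written recursive-descent parser sketch in Python with one method per production rule. It is an illustration only, under simplifying assumptions: the class and method names are made up, keywords are not reserved, characters outside the token classes are silently skipped, and the result is a nested tuple structure rather than a real syntax tree.

```python
import re

ID_PATTERN = r"[A-Za-z_][A-Za-z_0-9]*"


def tokenize(text):
    # Simplification: characters outside these token classes are skipped.
    return re.findall(ID_PATTERN + r"|\[\]|[{}:]", text)


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        tok = self.peek()
        if tok is None or not re.fullmatch(ID_PATTERN, tok):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        self.pos += 1
        return tok

    def model(self):  # Model := Type*;
        types = []
        while self.peek() is not None:
            types.append(self.type_rule())
        return types

    def type_rule(self):  # Type := SimpleType | Entity;
        if self.peek() == "type":  # SimpleType := 'type' Identifier;
            self.expect("type")
            return ("SimpleType", self.identifier())
        return self.entity()

    def entity(self):  # Entity := 'entity' Identifier ('extends' Identifier)? '{' Property* '}';
        self.expect("entity")
        name = self.identifier()
        parent = None
        if self.peek() == "extends":
            self.expect("extends")
            parent = self.identifier()
        self.expect("{")
        props = []
        while self.peek() == "property":
            props.append(self.property_rule())
        self.expect("}")
        return ("Entity", name, parent, props)

    def property_rule(self):  # Property := 'property' Identifier ':' Identifier ('[]')?;
        self.expect("property")
        name = self.identifier()
        self.expect(":")
        type_name = self.identifier()
        many = self.peek() == "[]"
        if many:
            self.pos += 1
        return ("Property", name, type_name, many)
```

Note that references such as the super type or the property type are still plain names at this point; the grammar alone cannot express that they must point to declared types.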
Example: Entity DSL III. (slide 8)
EBNF-to-Ecore mapping

    Model      := Type*;
    Type       := SimpleType | Entity;
    SimpleType := 'type' Identifier;
    Entity     := 'entity' Identifier ('extends' Identifier)? '{' Property* '}';
    Property   := 'property' Identifier ':' Identifier ('[]')?;
    Identifier := ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;
EBNF vs. Ecore (slide 9)
EBNF
+ Specifies concrete syntax
+ Linear order of elements
- No reusability
- Only containment relationships
Ecore
+ Reusability by inheritance
+ Non-containment references
+ Predefined data types and user-defined enumerations
~ Specifies only abstract syntax
Conclusion
- A meaningful EBNF cannot be generated from a metamodel and vice versa!
Challenge
- How to overcome the gap between these two worlds?
Solution Overview (slide 10)
Generic syntax
- Like XML
- The metamodel is sufficient, i.e., no concrete syntax definition is needed
- For instance: HUTN (OMG standard) - www.omg.org/spec/hutn
Metamodel first!
- Step 1: Specify the metamodel
- Step 2: Specify the textual syntax in addition
- For instance: TCS (Eclipse plug-in) - www.eclipse.org/gmt/tcs
Grammar first!
- Step 1: The syntax is specified by a grammar (concrete syntax and abstract syntax)
- Step 2: The metamodel is derived from the grammar of step 1
- For instance: Xtext (Eclipse plug-in) - www.eclipse.org/xtext
- Xtext by now also supports importing existing metamodels, i.e., a separate metamodel and grammar
Xtext Introduction (slide 12)
- Xtext was originally part of openArchitectureWare (oAW); it is now a separate project
- Used to develop textual domain-specific languages
- Grammar definition similar to EBNF, but with extra features for metamodel generation
- Creates metamodel, parser, and editor from the grammar definition
- The editor supports syntax checking, highlighting, and code completion
- Context-sensitive constraints on the grammar are described in an OCL-like language
Xtext Workflow Overview (slide 13)
[Figure: oAW tool chain. Tools: Xtext (textual DSLs), Xpand (M2C), Xtend (M2M). Artifacts: Grammar, Constraints, Metamodel, Model. Components: Parser, Editor.]
Xtext Grammar I. (slide 14)
- The Xtext grammar is similar to EBNF
- Extended by:
  - Object-oriented concepts
  - Information necessary to derive the metamodel
  - Editor
Example

    A : (type = B);

[Figure: class A with a reference "type" to class B, multiplicity 0..1]
Xtext Grammar II. (slide 15)
Terminal rules
- Similar to EBNF rules
- Return value is String by default
EBNF expressions
- Cardinalities
  - Zero to one: ?
  - Zero to many: *
  - One to many: +
- Character range: '0'..'9'
- Wildcard: 'f'.'o'
- Until token: '/*' -> '*/'
- Negated token: '#' (!'#')* '#'
Predefined rules
- ID, String, Int, URI
Xtext Grammar III. (slide 16)

    terminal ID :
        ('^')? ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;

    terminal INT returns ecore::EInt :
        ('0'..'9')+;

    terminal ML_COMMENT :
        '/*' -> '*/';
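For readers more familiar with regular expressions, the terminal rules above correspond roughly to the following Python patterns. This is an analogy and not how Xtext implements terminals; the names `HASH_QUOTED` and `int_value` are made up for the illustration.

```python
import re

# terminal ID: optional '^' escape, then a letter or '_', then letters, '_' or digits
ID = re.compile(r"\^?[A-Za-z_][A-Za-z_0-9]*")

# terminal INT: one or more digits ("one to many" cardinality '+')
INT = re.compile(r"[0-9]+")

# terminal ML_COMMENT: the until token '/*' -> '*/' behaves like a non-greedy match
ML_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

# negated token example: '#' (!'#')* '#'
HASH_QUOTED = re.compile(r"#[^#]*#")


def int_value(text):
    """Mimic 'returns ecore::EInt': convert the matched text to an integer."""
    if not INT.fullmatch(text):
        raise ValueError(f"not an INT token: {text!r}")
    return int(text)
```

The until token stops at the first occurrence of the closing sequence, which is why the comment pattern uses `.*?` rather than `.*`.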
Xtext Grammar IV. (slide 17)
Type rules
- For each type rule, a class is generated in the metamodel
- The class name corresponds to the rule name
- Type rules contain:
  - Terminals -> keywords
  - Assignments -> attributes or containment references
  - Cross-references -> non-containment references
Assignment operators
- Single-valued feature: =
- Multi-valued feature: +=
- Boolean feature: ?=
Enum rules
- Map strings to enumeration literals
Xtext Grammar V. (slide 18)
Assignment

    State :
        'state' name = ID
        (transitions += Transition)*
        'end';

Cross-references

    Transition :
        event = [Event] '=>' state = [State];

Enum rules

    enum ChangeKind :
        ADD | MOVE | REMOVE;

    enum ChangeKind :
        ADD = 'add' | ADD = '+' |
        MOVE = 'move' | MOVE = '->' |
        REMOVE = 'remove' | REMOVE = '-';
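The second enum rule maps several concrete strings to the same enumeration literal. A rough Python equivalent, given only as an illustration (the `LITERALS` table and `parse_change_kind` are assumed names, not generated Xtext code), looks like this:

```python
from enum import Enum


class ChangeKind(Enum):
    ADD = "add"
    MOVE = "move"
    REMOVE = "remove"


# Each enumeration literal can be written in two ways in the concrete syntax.
LITERALS = {
    "add": ChangeKind.ADD, "+": ChangeKind.ADD,
    "move": ChangeKind.MOVE, "->": ChangeKind.MOVE,
    "remove": ChangeKind.REMOVE, "-": ChangeKind.REMOVE,
}


def parse_change_kind(text):
    """Map a concrete-syntax string to its abstract-syntax literal."""
    try:
        return LITERALS[text]
    except KeyError:
        raise ValueError(f"not a ChangeKind literal: {text!r}") from None
```

The abstract syntax only knows three literals; the alternative spellings exist purely in the concrete syntax.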
Xtext Tooling I. (slide 19)
Xtext blueprint projects
- Grammar definition
- Code generator
- IDE functionality

Xtext Tooling II. (slide 20)
Xtext grammar project
- Workflow file for generating the DSL editor
- Properties: file extension, grammar definition

Xtext Tooling III. (slide 21)
Xtext grammar definition
- Grammar name
- Default terminals (ID, STRING, ...)
- Metamodel URI

Xtext Tooling IV. (slide 22)
Xtext grammar definition for state machines

Xtext Tooling V. (slide 23)
Generated Ecore-based metamodel

Xtext Tooling VI. (slide 24)
Generated DSL editor
- Outline view
- Error marker
- Code completion (Ctrl+Space)
- Highlighting of keywords
- Error description
Entity DSL Example (slide 25)
EBNF grammar

    Model      := Type*;
    Type       := SimpleType | Entity;
    SimpleType := 'type' Identifier;
    Entity     := 'entity' Identifier ('extends' Identifier)? '{' Property* '}';
    Property   := 'property' Identifier ':' Identifier ('[]')?;
    Identifier := ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;

Example model

    type String
    type Boolean

    entity Conference {
      property name : String
      property attendees : Person[]
      property speakers : Speaker[]
    }

    entity Person {
      property name : String
    }

    entity Speaker extends Person {
    }
EBNF to Xtext Transition (slide 26)
EBNF grammar

    Model      := Type*;
    Type       := SimpleType | Entity;
    SimpleType := 'type' Identifier;
    Entity     := 'entity' Identifier ('extends' Identifier)? '{' Property* '}';
    Property   := 'property' Identifier ':' Identifier ('[]')?;
    Identifier := ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;

Xtext grammar

    grammar MyDsl with org.eclipse.xtext.common.Terminals
    generate mydsl "http://mydsl"

    Model      : elements += Type*;
    Type       : SimpleType | Entity;
    SimpleType : 'type' name = ID;
    Entity     : 'entity' name = ID ('extends' extends = [Entity])? '{' properties += Property* '}';
    Property   : 'property' name = ID ':' type = [Type] (many ?= '[]')?;
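The main semantic difference between the two grammars is the cross-references `[Entity]` and `[Type]`: the parser reads a name, and a later linking step replaces it with the referenced object. The following Python sketch illustrates such a two-pass linking step. It is a simplified stand-in and not Xtext's linker; the dataclasses and the `super_name` / `type_name` fields are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Type:
    name: str


@dataclass
class Property:
    name: str
    type_name: str               # name as written in the text
    many: bool = False
    type: Optional[Type] = None  # resolved cross-reference: type = [Type]


@dataclass
class Entity(Type):
    super_name: Optional[str] = None    # name as written after 'extends'
    properties: list = field(default_factory=list)
    extends: Optional["Entity"] = None  # resolved cross-reference: extends = [Entity]


def link(elements):
    """Resolve names to objects; raise LookupError for dangling references."""
    index = {t.name: t for t in elements}  # pass 1: collect all declared names
    for element in elements:               # pass 2: resolve references
        if not isinstance(element, Entity):
            continue
        if element.super_name is not None:
            target = index.get(element.super_name)
            if not isinstance(target, Entity):
                raise LookupError(f"cannot resolve entity {element.super_name!r}")
            element.extends = target
        for prop in element.properties:
            if prop.type_name not in index:
                raise LookupError(f"cannot resolve type {prop.type_name!r}")
            prop.type = index[prop.type_name]
```

Because all names are collected before any reference is resolved, forward references such as `attendees : Person[]` inside `Conference` work even though `Person` is declared later.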
Specifying Context-Sensitive Constraints for Textual DSLs I. (slide 27)
Examples
- Entity names must start with an uppercase character
- Entity names must be unique
- Property names must be unique within one entity
Answer
- Use the same techniques as for metamodels!
Xtext grammar

    grammar MyDsl with org.eclipse.xtext.common.Terminals
    generate mydsl "http://mydsl"

    Model      : elements += Type*;
    Type       : SimpleType | Entity;
    SimpleType : 'type' name = ID;
    Entity     : 'entity' name = ID ('extends' extends = [Entity])? '{' properties += Property* '}';
    Property   : 'property' name = ID ':' type = [Type] (many ?= '[]')?;
Specifying Context-Sensitive Constraints for Textual DSLs II. (slide 28)
Examples
1. Entity names must start with an uppercase character
2. Entity names must be unique
3. Property names must be unique within one entity
Solutions

    context mydsl::Entity WARNING "Name should start with a capital":
        name.toFirstUpper() == name;

    context mydsl::Entity ERROR "Name must be unique":
        ((Model)this.eContainer).elements.name.select(e | e == this.name).size == 1;

    context mydsl::Property ERROR "Name must be unique":
        ((Entity)this.eContainer).properties.name.select(p | p == this.name).size == 1;
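The same three constraints can be expressed over a plain object model. The following Python sketch mirrors them one by one; the `Entity` dataclass (with property names as plain strings) and the `validate` function are assumptions for illustration, not the API of the check language.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    properties: list = field(default_factory=list)  # property names only


def validate(entities):
    """Return a list of (severity, message) pairs for the three constraints."""
    issues = []
    entity_names = Counter(e.name for e in entities)
    for e in entities:
        # 1. name.toFirstUpper() == name
        if e.name[:1].upper() != e.name[:1]:
            issues.append(("WARNING", f"{e.name}: Name should start with a capital"))
        # 2. the name occurs exactly once among all model elements
        if entity_names[e.name] > 1:
            issues.append(("ERROR", f"{e.name}: Name must be unique"))
        # 3. each property name occurs exactly once within its entity
        for prop, count in Counter(e.properties).items():
            if count > 1:
                issues.append(("ERROR", f"{e.name}.{prop}: Name must be unique"))
    return issues
```

As in the constraint definitions above, a duplicated entity name is reported once per offending entity, because the check runs in the context of each entity.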
Specifying Context-Sensitive Constraints for Textual DSLs III. (slide 29)
Constraints can be evaluated at different points, depending on their cost:
- On every edit operation: cheap constraints
- On every save operation: cheap to expensive constraints
- On every generation operation: very expensive constraints
Specifying Context-Sensitive Constraints for Textual DSLs IV. (slide 30)
Code generation
- Projects are generated automatically
- A code-to-model injector is used in the workflow file

    <component class="org.eclipse.xtext.MweReader" uri="${modelfile}">
      <!-- this class will be generated by the xtext generator -->
      <register class="org.xtext.example.MyDslStandaloneSetup"/>
    </component>

- Xpand templates can be developed as usual

    «IMPORT mydsl»
    «DEFINE main FOR Model»
      «FOREACH this.elements.typeSelect(Entity) AS e-»
        «FILE e.name + ".class"-»
          class «e.name» ...
        «ENDFILE»
      «ENDFOREACH»
    «ENDDEFINE»
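The structure of the Xpand template (select all entities, open one file per entity, emit text into it) can be mimicked in Python as follows. This is only an analogy: the `generate` function, the tuple-based model, and the emitted class body are assumptions, since the template on the slide elides the actual class content.

```python
def generate(elements):
    """Return a mapping from file name to generated text, one file per entity.

    `elements` is a list of (kind, name) pairs standing in for the parsed model.
    """
    files = {}
    for kind, name in elements:
        if kind != "Entity":  # corresponds to typeSelect(Entity)
            continue
        # Corresponds to «FILE e.name + ".class"»; the body is a placeholder.
        files[name + ".class"] = f"class {name} {{\n}}\n"
    return files
```

For the example model, this yields one entry per entity (`Conference.class`, `Person.class`, `Speaker.class`) and nothing for the simple types.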
Xtext: Standard Framework for EMF? (slide 31)
- Tight integration with Eclipse and oAW
  - Powerful editors, constraint checks, code generation, model transformation, ...
- Allows switching between EMF models and text-based artefacts
  - Both are supported in Eclipse: textual and graphical modeling!
- Allows customizing the editor via many extension points
  - Pretty printing, code completion, ...
- Strong community
Strengths and Weaknesses of Xtext (+) (slide 32)
Compact: minimal effort for establishing small/simple DSLs
- EMF/Eclipse integration
- EBNF-like (easy to learn)
- Editor for free
[Figure labels: text comprehension, transformation, editor specification]
Strengths and Weaknesses of Xtext (-) (slide 33)
Mixing of concerns: text comprehension, the actual transformation, and the editor specification are implicit and tightly woven together in one specification, with the following consequences:
- Bidirectionality is hard to support
- It is impossible to reuse, e.g., the transformation part with a different text comprehension part (e.g., for XML, regular expressions, directory structures, handwritten parsers, ...)
- Complex specifications become difficult to maintain because every concern grows complex at once, i.e., both text comprehension and the transformation
Strengths and Weaknesses of Xtext (-) (slide 34)
Generality: not all languages can be expressed (easily) with an Xtext grammar (mixing in Java is not possible!)
- The target model is usually quite low-level. In general, a metamodel-first approach for an arbitrary model requires a subsequent M2M transformation, which makes the complete transformation chain highly complex because different languages are involved.
- Fixed interpretation of EBNF
- Using existing or preferred (e.g., bottom-up) parsers is not possible
- No parts of the transformation can be reused if the modeling standard (Ecore) is changed
Alternative Tree-Based Approach (slide 36)
Separate the transformation into two distinct parts:
1. Text-to-tree (text comprehension) with a parser (e.g., generated from an EBNF specification)
2. Tree-to-model (actual transformation or interpretation) with a model transformation language (e.g., SDM, TGG)
Step (1) can be achieved with just an EBNF specification and no extra effort.
Step (2) involves (i) adding extra typing information to the homogeneous tree structure from a parser, and (ii) deducing context-sensitive relationships to yield a (regular) graph structure.
Arbitrary metamodels can be targeted, with increasing complexity of step (2).
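Step (2) can be illustrated with a small Python sketch. The lecture uses model transformation languages (SDM, TGG) for this step; plain Python is swapped in here purely for illustration. The homogeneous `Node` class and the tree layout (`MODEL`, `ENTITY`, `EXTENDS` labels) are hypothetical and do not reproduce the actual tree metamodel.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Homogeneous tree node as produced by a parser: just a label and children."""
    name: str
    children: list = field(default_factory=list)


@dataclass
class Entity:
    """Typed element of the target model."""
    name: str
    parent: "Entity | None" = None


def tree_to_model(root):
    # (i) Add typing information: every ENTITY node becomes an Entity object.
    entity_nodes = [n for n in root.children if n.name == "ENTITY"]
    entities = {n.children[0].name: Entity(n.children[0].name) for n in entity_nodes}
    # (ii) Deduce context-sensitive relationships: turn the name below EXTENDS
    # into a real object reference, which makes the tree a graph.
    for node in entity_nodes:
        for child in node.children[1:]:
            if child.name == "EXTENDS":
                entities[node.children[0].name].parent = entities[child.children[0].name]
    return entities
```

The input tree needs no knowledge of the target metamodel, which is what allows the text-to-tree part to be reused with different tree-to-model transformations.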
Workflow with ANTLR + eMoflon (slide 37)
[Figure: text -> Lexer + Parser (specified with EBNF) -> tree -> Transformation (specified with SDM, TGG) -> model. The tree conforms to a simple tree metamodel (AST) [Moca Tree]; the model conforms to the target metamodel. Stages: Text-to-Tree, Tree-to-Model.]
Advantages of Textual DSLs (slide 38)
Textual languages have specific strengths compared to graphical languages:
- Scalability
- Pretty-printing
- Compact and expressive syntax
  - Productivity for experienced users
  - IDE support softens the learning curve
- Configuration management/versioning
  - Concurrent work on a model, especially with a version control system
  - Diff, merge, search, replace, ...
- Working without a sophisticated editor
State-of-the-Art DSL Development Techniques (slide 39)
- DSLs are very common: CSS, regular expressions, ant, SQL, HQL, Rails
- Two types of DSLs: internal vs. external
Internal DSLs
- Embedded languages in existing host languages
- Explicit internal DSLs: becoming mainstream through Ruby and Groovy
- Implicit internal DSLs: fluent interfaces simulate DSLs in Java and C#
External DSLs
- Have their own custom syntax
- Own parser to process them
- Own editor to build sentences
- Own compiler to an executable language, or own interpreter
- Many XML-based languages have ended up as external DSLs (not user-friendly)
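A fluent interface, as mentioned for implicit internal DSLs, simulates a DSL through method chaining in the host language. The slide names Java and C#; the sketch below shows the same idea in Python for the Entity DSL. The `entity` function and `EntityBuilder` class are hypothetical names invented for this example.

```python
class EntityBuilder:
    """Builder whose methods return self so that calls can be chained."""

    def __init__(self, name):
        self._name = name
        self._parent = None
        self._properties = []

    def extends(self, parent):
        self._parent = parent
        return self

    def property(self, name, type_name, many=False):
        self._properties.append((name, type_name, many))
        return self

    def build(self):
        return {
            "name": self._name,
            "extends": self._parent,
            "properties": list(self._properties),
        }


def entity(name):
    """Entry point of the internal DSL."""
    return EntityBuilder(name)
```

A model element is then written as an ordinary host-language expression, e.g. `entity("Speaker").extends("Person").property("topic", "String").build()`, so no separate parser or editor is needed.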
References (slide 40)
Xtext related
- Project site: http://www.eclipse.org/xtext
- Xtext webinar: http://live.eclipse.org/node/705
Text-based language meta-frameworks
- Monticore: www.monticore.org
- MGrammar: http://msdn.microsoft.com/en-us/library/dd857654(vs.85).aspx
- TCS: www.eclipse.org/gmt/tcs
- HUTN: http://www.omg.org/spec/hutn/
Domain-specific (modeling) languages
- Fowler, M.: Domain Specific Languages. 2009.
- Mernik, M., Heering, J., and Sloane, A. M.: When and how to develop domain-specific languages. ACM Computing Surveys 37(4), pp. 316-344, 2005.
- Kelly, S., and Tolvanen, J.P.: Domain-Specific Modeling: Enabling Full Code Generation. Wiley-IEEE Computer Society, 2008.