A Multi-layered Domain-specific Language for Stencil Computations




Christian Schmitt, Frank Hannig, Jürgen Teich
Hardware/Software Co-Design, University of Erlangen-Nuremberg
Workshop ExaStencils 2014, Dresden, Germany; March 31, 2014

Challenges for Software Development in Computational Science and Engineering

Current situation
- Hardware: modern HPC clusters are massively parallel
  - Parallel intra-core, intra-node, and inter-node
  - Increasing heterogeneity
- Applications: become more complex with increasing computation power
  - More complex (physical) models
  - Code development in interdisciplinary teams
- Algorithm: multigrid is a general idea
  - Components and parameters depend on the type of problem, grid, ...

Software development has to address all these issues!

Challenge: The 3 Ps

- Performance
  - Portable: high performance on different target hardware
  - Competitive: performance comparable to hand-written code
- Portability
  - Support different target architectures from the same algorithm description
  - Support different target languages and technologies from the same algorithm description
- Productivity
  - Algorithm description at a high level
  - Hide low-level details from the user

Domain-Specific Language (DSL)

Definitions
- Arie van Deursen, Paul Klint, and Joost Visser, "Domain-Specific Languages: An Annotated Bibliography": "A domain-specific language (DSL) is a programming language or executable specification language that offers, through appropriate notations and abstractions, expressive power focused on, and usually restricted to, a particular problem domain."
- Martin Fowler, "Domain-Specific Languages": "Domain-specific language: a computer programming language of limited expressiveness focused on a particular domain."

Advantages of DSLs

- Productivity
  - Shield average programmers from the difficulties of parallel programming
  - Priority is given to the development of algorithms and applications, i.e., the focus is not on details of a low-level implementation
- Performance
  - Exploitation of generic parallel execution patterns in a domain at a high abstraction level
  - Reduced (restricted) expressiveness
  - Full extraction of available parallelism
  - Make use of knowledge (domain, architecture) for static and dynamic optimizations
- Portability and scalability
  - DSL and run-time system should be easily extendable in order to adapt to new architectures
  - Applications (DSL codes) should remain the same
  - Allows hardware engineers to introduce innovations without worrying about portability

Goals of a DSL

- Domain-specific: provide only expressions relevant to the topic
- Orthogonal: one single way of specifying something
- Expressive and compact: describe relevant constructs with few statements
- Abstract: work on a high-level point of view
- Adaptable: support complex things
- Adoptable: employ terms and concepts of the domain
- Regular: all terms should follow the same syntax and ideas
- Well-defined: non-ambiguous and easy to understand

Talk about what should be computed, not how (declarative vs. imperative).

Two Approaches to Creating DSLs

Internal / embedded DSLs
- Utilize a general-purpose programming language (host language)
- Extension or restriction of the host language (or both at the same time)
- Extensions possible in the form of libraries (e.g., data types, objects, methods), annotations, macros, etc.
- Same syntax as the host language and usually the same compiler or interpreter

External DSLs
- Completely newly defined programming language
- More flexible and expressive than internal DSLs
- Syntax and semantics defined freely, but often related to existing languages
- Higher design effort, but supporting tools exist
- Potential to create a powerful semantic model as intermediate representation (IR)
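To make the internal (embedded) approach concrete, the following is a minimal Scala sketch of a tiny expression DSL built purely from host-language features (case classes and operator methods). All names here (Expr, Const, FieldRef, Add, Mul, eval) are hypothetical and chosen for illustration; they are not taken from ExaStencils, which the slides describe as an external DSL.

```scala
object InternalDslSketch {
  // Hypothetical expression tree; none of these names come from ExaStencils.
  sealed trait Expr {
    def +(that: Expr): Expr = Add(this, that)
    def *(that: Expr): Expr = Mul(this, that)
  }
  final case class Const(value: Double) extends Expr
  final case class FieldRef(name: String) extends Expr
  final case class Add(lhs: Expr, rhs: Expr) extends Expr
  final case class Mul(lhs: Expr, rhs: Expr) extends Expr

  // Evaluate an expression given values for the referenced fields.
  def eval(e: Expr, env: Map[String, Double]): Double = e match {
    case Const(v)    => v
    case FieldRef(n) => env(n)
    case Add(a, b)   => eval(a, env) + eval(b, env)
    case Mul(a, b)   => eval(a, env) * eval(b, env)
  }

  // Host-language syntax (operators and their precedence) is reused as-is.
  val u: Expr = FieldRef("u")
  val f: Expr = FieldRef("f")
  val update: Expr = Const(0.5) * u + f
}
```

The expression `Const(0.5) * u + f` is parsed by the Scala compiler itself, which is the main convenience of an internal DSL. An external DSL would need its own parser but could choose its syntax freely.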

Outline

- Motivation & Introduction to DSLs
- Language Design Guidelines
- The ExaStencils DSL

Language Design Guidelines

It's all about simplicity!

Randall Munroe. xkcd: Manuals. 2014. URL: http://xkcd.com/1343/

Language Design Guidelines

[Figure: design guidelines grouped by category]
- Categories: language purpose, language realization, language content, concrete syntax, abstract syntax
- Guidelines shown: identify users, identify uses, stay consistent, appropriate representation, be descriptive, enable modularity, align abstract & concrete syntax, re-use existing languages, use domain concepts, focus on essentials, avoid redundancy

Based on Gabor Karsai et al. Design Guidelines for Domain Specific Languages. In: Proceedings of the 9th OOPSLA Workshop on Domain-Specific Modeling. 2009, pp. 7-13

The ExaStencils DSL

Goals of the ExaStencils DSL

Three different types of users with individual expectations:

- Natural scientists
  - Care about the effectiveness of solving the problem
  - Little knowledge of underlying methods
  - No requirement to override compiler-chosen decisions
- Mathematicians
  - Care about applicability and convergence of mathematical models
  - Little interest in implementation specifics
  - Want to extend or override compiler-chosen decisions with custom multigrid components
- Computer scientists
  - Care about performance and the software engineering approach
  - Little interest in the mathematical problem
  - Have to understand the compiler and its decision process

ExaStencils DSL

Multi-layered structure
- Top-down approach: from abstract to concrete
- Very few mandatory specifications on one layer, leaving room for SPL-based decisions on lower layers
- Decisions may be taken by the user via specification in the DSL
- Mandatory specifications include
  - Shape and size of the computational domain
  - Continuous equation

External DSL
- Better reflection of the extensive ExaStencils approach
- Enables greater flexibility of the different layers
- Eases tailoring of DSL layers to users

ExaStencils: Multi-layered DSL Structure

Different layers of the DSL are tailored towards different users and knowledge, ordered from abstract to concrete:

1. Continuous Domain & Continuous Model
2. Discrete Domain & Discrete Model
3. Algorithmic Components & Parameters
4. Complete Program Specification

A Hardware Description accompanies the layers. The slide highlights, in turn, the layers relevant to natural scientists, mathematicians, and computer scientists.

ExaStencils: Multi-layered DSL Structure
Continuous Domain & Continuous Model (Layer 1)

Specification of
- Size and structure of the computational domain
- Variables
- Functions and operators (pre-defined functions and operators are also available)
- Mathematical problem

    Domain d = UnitCube

    Function f = 0
    Unknown solution = 1
    Operator Lapl = Laplacian

    PDE pde { Lapl(solution) = f }
    PDEBC bc { solution = 0 }

    Accuracy = 7

ExaStencils: Multi-layered DSL Structure
Discrete Domain & Discrete Model (Layer 2)

- Discretization of
  - Computational domain into fragments (e.g., triangles)
  - Variables to fields
    - Specification of data type
    - Selection of discretized location (cell based or node based)
- Transformation of energy functional to PDE or weak form

    Fragment f1 = UnitCube

    Discrete_Domain d { ... }

    Field<Double, 1>@nodes f
    Stencil <Double, 1, 1, FD, 2>@nodes Lapl

    // ...
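As an illustration of what a node-based discretized Laplacian amounts to once code is generated, here is a minimal Scala sketch that applies the standard second-order 5-point finite-difference stencil to a 2D field. It assumes that `FD, 2` in the listing denotes second-order finite differences, which the slide does not spell out, and it uses two dimensions for brevity. The names `laplacian`, `u`, and `h` are hypothetical.

```scala
object StencilSketch {
  // Apply the standard second-order 5-point finite-difference Laplacian
  // to the interior nodes of a 2D node-based field with mesh width h.
  // Boundary entries of the result are left at 0.
  def laplacian(u: Array[Array[Double]], h: Double): Array[Array[Double]] = {
    val n = u.length
    val m = u(0).length
    val out = Array.ofDim[Double](n, m)
    for (i <- 1 until n - 1; j <- 1 until m - 1) {
      out(i)(j) =
        (u(i - 1)(j) + u(i + 1)(j) + u(i)(j - 1) + u(i)(j + 1) - 4.0 * u(i)(j)) / (h * h)
    }
    out
  }

  // Sample u(x, y) = x^2 + y^2 on a 5 x 5 node grid over the unit square.
  val h = 0.25
  val u: Array[Array[Double]] =
    Array.tabulate(5, 5)((i, j) => (i * h) * (i * h) + (j * h) * (j * h))
  val lap: Array[Array[Double]] = laplacian(u, h)
}
```

For the quadratic sample field, the continuous Laplacian is 4 everywhere, and the 5-point stencil reproduces that value at interior nodes because it is exact for quadratic functions.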

ExaStencils: Multi-layered DSL Structure
Algorithmic Components & Parameters (Layer 3)

- Multigrid cycle type
- Multigrid components (e.g., selection of smoother)
- Definition of operations on sets (parts of the computational domain)
- Operations in matrix notation

    Iteration smoother { u = Mu + Nf }
    Set s1 = [0, 0] - [END, END], [+=1, +=1]

    u(s1) = A(s1, s1) * u(s1)

    Set s2 = ( [0,0] + [1,1] ), [+=2, +=2]
    Set s3 = [0:2, 0:2], [+=2, +=2]
    // ...
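The smoother form u = Mu + Nf can be illustrated with a Jacobi sweep, chosen here only because it is the simplest smoother of that shape; the slide does not say which smoother is meant. The sketch below assumes the 1D model problem -u'' = f with fixed boundary values, for which M averages the two neighbours and N = (h^2 / 2) I. The names `jacobiSweep` and `smooth` are hypothetical.

```scala
object SmootherSketch {
  // One Jacobi sweep for -u'' = f on a 1D grid with mesh width h and
  // fixed (Dirichlet) boundary values u(0) and u(n - 1).
  // In the form u = M u + N f: M averages the two neighbours, N = (h^2 / 2) I.
  def jacobiSweep(u: Vector[Double], f: Vector[Double], h: Double): Vector[Double] =
    u.indices.toVector.map { i =>
      if (i == 0 || i == u.length - 1) u(i)
      else 0.5 * (u(i - 1) + u(i + 1)) + 0.5 * h * h * f(i)
    }

  // Repeat the sweep a fixed number of times.
  def smooth(u: Vector[Double], f: Vector[Double], h: Double, sweeps: Int): Vector[Double] =
    (0 until sweeps).foldLeft(u)((cur, _) => jacobiSweep(cur, f, h))
}
```

A quick sanity check is that the discrete solution is a fixed point of the sweep: with f = 2 and h = 0.25, the samples of u(x) = x(1 - x) are returned unchanged.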

ExaStencils: Multi-layered DSL Structure
Complete Program Specification (Layer 4)

- Complete multigrid V-cycle, or custom cycle types
- Operations depending on the multigrid level
- Loops over the computational domain
- Communication and data exchange
- Custom output data formats

    def V@(1 to 6) () : Unit {
      repeat up 3
      { GaussSeidel @(current) () }
      Residual@(current) ()
      Restrict@(current) ()
      // ...
    def Application() : Unit {
      var res0 : Real = sqrt ( L2Residual @6 () )
      // ...
      repeat up 10 {
        V @6 ()
      } // ...
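The level-dependent structure of a V-cycle can be sketched as a recursion over levels. The listing above elides everything after the restriction, so the coarse-grid solve, the prolongation with correction, and the post-smoothing in the Scala sketch below are the usual textbook steps and are assumptions, not taken from the slide. Only the order of operations is modelled; the numerical kernels are replaced by trace entries, and the name `vCycle` is hypothetical.

```scala
object VCycleSketch {
  // Record the order of operations of a textbook V-cycle from `level`
  // down to `coarsest`. Only the recursion structure is modelled; the
  // numerical kernels are replaced by trace entries.
  def vCycle(level: Int, coarsest: Int): List[String] =
    if (level == coarsest) List(s"solve@$level")
    else
      List(s"smooth@$level", s"residual@$level", s"restrict@$level") ++
        vCycle(level - 1, coarsest) ++
        List(s"prolongAndCorrect@$level", s"smooth@$level")
}
```

Calling `VCycleSketch.vCycle(6, 1)` mirrors the level range `1 to 6` of the listing: each level above the coarsest contributes five operations around the recursive call.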

ExaStencils: Multi-layered DSL Structure
Target Hardware Description

- Availability of compute units (e.g., CPUs, GPUs, FPGAs) and capabilities
- Memory and cache architecture
- Cluster nodes (e.g., number of cores, types of available accelerators, available software)
- Complete cluster (e.g., number of nodes, interconnect topology and characteristics, I/O storage specifications)

    Hardware {
      MPI {
        name "my cluster"
        id HYBRID_CLUSTER
        nodes 32
        components {
          XEON_CPU[2]
          GTX_680_GPU[2]
        }
        // ...

ExaStencils Concept

[Figure: workflow from DSL program to generated code]
- Roles: end user, domain expert, mathematician, software specialist, hardware expert
- Steps: DSL program -> discretization and algorithm selection -> software component selection via SPL -> polyhedral optimization -> code generation -> Exascale C++
- Tuning towards target hardware

ExaStencils: Multi-layered DSL Structure

[Figure: transitions between layers]
- Continuous Domain & Continuous Model
- discretization -> Discrete Domain & Discrete Model
- parametrization -> Algorithmic Components & Parameters
- specification -> Complete Program Specification
- optimization -> Intermediate Representation (IR)
- generation -> Exascale C++ Code
- Inputs: Domain Knowledge, Hardware Knowledge

ExaStencils Framework

Implementation details
- Implemented in Scala
- Specialized data structures for each DSL layer
- Domain knowledge modeled as a compiler-wide accessible module
- A central instance keeps track of program state changes: StateManager
- Functionality separated into namespaces, e.g.,
  - exastencils.core: log functionality, StateManager, compiler settings
  - exastencils.core.collectors
  - exastencils.datastructures: annotations, program state duplication, trait Strategy, trait Transformation
  - exastencils.datastructures.{l1, l2, l3, l4, ir}
  - exastencils.parsers
  - exastencils.prettyprinting

ExaStencils Framework

Transformations
- Are grouped together in strategies
- Are atomic: either applied completely or not at all
- Are applied to the program state in depth-first order
- May be applied to a part of the program state
- Carry an identifier

Strategies
- Are applied by the StateManager
- Carry an identifier
- A standard strategy for sequential execution of transformations is provided
- Custom strategies are possible

ExaStencils Framework

Transition between layers
- Transition from one layer to another via transformations
- Step-wise from one layer to the next
- Domain knowledge needed for each transition (e.g., which discretization or which smoother to choose)
- Transitions are atomic: no mixing of nodes of different layers

Intermediate Representation (IR) layer
- Layer in which most optimizations take place
- Progress is made by applying many strategies, each with a single and narrow purpose
- Polyhedral model to be used for partitioning of the computational domain
- Can be pretty-printed
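The step-wise, atomic transitions can be sketched in Scala by giving each layer its own program type, so that a transition always consumes a whole program of one layer and produces a whole program of the next. This is a heavily simplified illustration under that assumption. The types (`L1Program` to `L4Program`, `Knowledge`) and functions are hypothetical and do not reflect the actual ExaStencils data structures, and only the first two transitions consult domain knowledge here.

```scala
object LayerPipelineSketch {
  // Hypothetical, heavily simplified program representations, one type per layer.
  final case class L1Program(pde: String)
  final case class L2Program(pde: String, discretization: String)
  final case class L3Program(pde: String, discretization: String, smoother: String)
  final case class L4Program(statements: List[String])

  // Domain knowledge consulted during the transitions (field names are assumptions).
  final case class Knowledge(discretization: String, smoother: String)

  // Each transition consumes a whole program of one layer and returns a whole
  // program of the next layer, so nodes of different layers never mix.
  def l1ToL2(p: L1Program, k: Knowledge): L2Program =
    L2Program(p.pde, k.discretization)

  def l2ToL3(p: L2Program, k: Knowledge): L3Program =
    L3Program(p.pde, p.discretization, k.smoother)

  def l3ToL4(p: L3Program): L4Program =
    L4Program(List(
      s"discretize ${p.pde} with ${p.discretization}",
      s"smooth with ${p.smoother}"))

  // Step-wise lowering from the most abstract layer to Layer 4.
  def lower(p: L1Program, k: Knowledge): L4Program =
    l3ToL4(l2ToL3(l1ToL2(p, k), k))
}
```

Because every intermediate value has exactly one layer type, a partially lowered program cannot be expressed, which is one simple way to read the "no mixing of nodes" requirement.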

ExaStencils Framework: Example transformations (IR)

    var s = Strategy("example standard strategy")

    // replace constant 1 under a certain node with 3
    s += Transformation("t1",
      { case x : Constant if(x.value == 1)
          => Constant(3)
      }, someprogramstatenode)

    // rename all variables to j
    s += Transformation("t2",
      { case x : Variable
          => Variable("j", x.type)
      })

    // duplicate all methods
    s += Transformation("t3",
      { case x : FunctionStatement
          => List(x, FunctionStatement(
               x.returntype, x.name + "_", x.parameters, x.body))
      })
    s.apply // execute transformations sequentially
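The listing above depends on framework classes that the slides do not show. As a self-contained approximation, the following Scala sketch implements a toy version of the same idea: transformations are named partial functions, applied depth-first to a small tree, and a strategy runs them one after another. The node types and the `rewrite` helper are hypothetical stand-ins, and neither the StateManager nor the atomicity guarantees of the real framework are modelled.

```scala
object TransformationSketch {
  // Toy program state; the real ExaStencils node types are not shown in the slides.
  sealed trait Node
  final case class Constant(value: Int) extends Node
  final case class Variable(name: String) extends Node
  final case class Add(lhs: Node, rhs: Node) extends Node

  // A transformation: an identifier plus a partial rewrite rule.
  final case class Transformation(name: String, rule: PartialFunction[Node, Node])

  // Rewrite depth-first: children first, then the node itself.
  def rewrite(node: Node, rule: PartialFunction[Node, Node]): Node = {
    val rebuilt: Node = node match {
      case Add(l, r) => Add(rewrite(l, rule), rewrite(r, rule))
      case leaf      => leaf
    }
    rule.applyOrElse(rebuilt, (n: Node) => n)
  }

  // A strategy: an identifier plus transformations applied one after another.
  final case class Strategy(name: String, transformations: List[Transformation] = Nil) {
    def +(t: Transformation): Strategy = copy(transformations = transformations :+ t)
    def apply(root: Node): Node =
      transformations.foldLeft(root)((state, t) => rewrite(state, t.rule))
  }

  val strategy: Strategy =
    Strategy("example") +
      Transformation("t1", { case Constant(1) => Constant(3) }) +
      Transformation("t2", { case Variable(_) => Variable("j") })

  val before: Node = Add(Constant(1), Add(Variable("i"), Constant(2)))
  val after: Node = strategy(before)
}
```

Nodes that a rule does not match are passed through unchanged, which is why `Constant(2)` survives both transformations in the example tree.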

Summary and Outlook

Discussed in this talk
- Definitions of DSLs
- Language guidelines
- Multi-layered DSL for ExaStencils
- Language examples
- Implementation details

Future work
- Finalization of language specifications on the different layers
- Transformations of more abstract layers to the lowest layer (Layer 4, Complete Program Specification)
- Domain knowledge as a base for automated decision making and optimization

Thanks for listening. Questions?