The D-Box coordinational language and the run-time system




Support for distributed development in the Clean functional language

Hernyák Zoltán
http://aries.ektf.hu/~hz
hz@aries.ektf.hu

Main results of the PhD dissertation
2009

Advisor: Dr. Horváth Zoltán, professor
Eötvös Loránd University, Faculty of Informatics
H-1117 Budapest, Pázmány Péter sétány 1/C.

PhD School of Eötvös Loránd University, Faculty of Informatics
PhD Programme: The foundations of and methodologies in informatics
Leader of the PhD School and of the PhD programme: Dr. Demetrovics János (Full Member of the Hungarian Academy of Sciences)

Introduction

The present thesis deals with a coordination language that supports the distributed evaluation of Clean programs, and with its run-time system. The primitives of the higher-level D-Clean coordination language, designed for the Clean functional language, are translated into lower-level D-Box coordination language definitions, which are built on the concepts of channel, box and protocol.

The primary coordination element is the box, which contains a computation task (a function or an expression) formulated in Clean. The parameters required by this expression are carried from other boxes by typed channels, which are capable of minimal buffering and support the add and remove operations known from the queue data structure. The channels are read by a protocol expression that hides the details of channel handling from the Clean expression. The protocol expression can perform basic transformations on the data arriving through the channels, e.g. construct a list from them. The expression processes the data received from the protocol and computes the result. The results are passed to an output protocol, which transmits them to the outgoing channels.

The input and output channels of the boxes are connected to each other. The nodes of the computation graph are the boxes, and the edges are the channels. The topic of the dissertation is the syntax and semantics of the coordination language that describes this graph.

Aims

1. To create a coordination language that supports distributed functional programming. It should support the higher-level D-Clean coordination language and be suitable for developing distributed solutions in a functional programming language. To work out the syntactic and static semantic rules of the coordination language, on the basis of which D-Box language descriptions can be processed and checked.

2. To give the specification of the operation of the coordination language, which can serve as the basis for a code-generating tool. To apply external pattern files during code generation, so that patterns can also be written for other platforms and middleware. The code-generating tool should parameterize the patterns with types and produce Clean source code. To work out the macros used by the code-generating tool, with the help of which the patterns can be written. To create concrete pattern files, with which code can be generated for a particular operating system and middleware.

3. To create the run-time system that complements the services of the middleware, and to give an informal syntactic and semantic description of its API functions. These functions can be referred to in the pattern files. The system should be suitable for preparing the execution of projects generated from D-Box language descriptions, and for supporting them while they run.

4. To test the resulting environment by running concrete distributed computational tasks and measuring efficiency and performance.

Antecedents of topic choice

In the first phase of my research I worked with the MPI and PVM libraries under Linux, using message-passing communication. However, the complexity and weak typing of the C and C++ programming languages do not make a favorable environment for writing distributed programs, not to mention the cumbersome debugging and tracing. Later, building on the possibilities offered by the .NET Framework, I prepared an implementation along similar lines for an object-oriented platform, in C#. The implementation was written in pure C#, and communication was provided by low-level socket handling; in essence it was a class collection equipped with communication-supporting methods. I wrote a simple run-time system for it, which later became the basis of the D-Box run-time system discussed in the present thesis.

The Clean functional programming language is, owing to many of its characteristics, an attractive programming platform.
Unfortunately, Clean does not currently support distributed execution. It seemed necessary to develop a higher-level process-description and coordination system. To serve this aim, Zsók Viktória, a researcher at ELTE, designed the D-Clean language, with which communication can be described by means similar to functional language expressions. I joined the research at that point. The design of the D-Box language was a joint effort. In this research I prepared the D-Clean translator, the syntax and semantics of the D-Box language, and the D-Box code generator and run-time system; I also implemented the vast majority of the D-Box pattern files. I also prepared an earlier implementation of the D-Box protocol templates, which had to be reconsidered because of the SplitF protocol introduced later. The present protocol implementations in Clean were prepared by Diviánszky Péter, a researcher at ELTE.

In our solution, the functions written by the user do not have to contain any information related to communication or to the run-time system. The generated code hides this completely from the Clean functions. This makes it possible to develop and test the functions in a traditional environment and then transfer them to the distributed environment without any modification.

Antecedents of the research topic

Among functional programming languages, Concurrent Haskell, which has a parallel evaluation system, was studied.

JoCaml extends the Objective Caml language with the join calculus and provides dedicated support for parallel and distributed development. It has the concepts of channel and abstract location. Communication among processes takes place through channels.

Erlang makes it possible to create concurrent, real-time, distributed and fault-tolerant systems. The language has built-in facilities for distribution and message passing; it achieves distribution without shared memory, by sending messages.

Hume is a strongly typed programming language with a functional foundation; its primary aim is to check bounds on running time and resource use. It applies an asynchronous communication model whose basic element is the box. Boxes have unique identifiers, and their incoming and outgoing data are typed. Boxes can be connected; the connections are described separately from the box definitions, and the types of the connections are checked in the process. Boxes can also be connected through devices (stream, port, etc.).
A box can also be connected to itself, forming a one-element loop. The connections among boxes are called wires. Initial values can be placed on the wires, which is useful when creating loops.

For the Eden functional programming language, Jost Berthold created an implementation language called EdI, in which he defined channel-creating and data-sending primitives. Because of the lazy evaluation of Eden, he defined evaluation strategies that can force the computation (and sending) of a value on the sending side.

With the ObjectIO library developed for the Clean functional programming language, interactive programs containing menus and dialog windows can be developed. Unique types make it possible to handle resources in a purely functional way. A value of a unique type cannot be duplicated.

The port of ObjectIO to Linux is not complete, so it cannot be relied on in heterogeneous systems.

Concurrent Clean was a purely functional, strongly typed language that also supported parallel and distributed evaluation. Unfortunately this extension of the language could not keep pace with the language versions, and its development has stopped. In the early phase of Clean there was also a language version with transputer support. Annotations made it possible to mark the parts of an expression that could be evaluated in parallel. The parallelization strategies were based on these annotations, which also made it possible to set an evaluation order.

Applied methods

The syntax of the D-Box coordination language is given both as an EBNF description and as lex + yacc definitions. The rules essential to checking static correctness were described formally. The operation of the D-Box protocols was specified. The operation of the run-time system was given in the form of a natural operational semantics.

The D-Box lexical and semantic analyzer that was built is not generated from the BNF description; it is a program written in C. C++, C and Clean code is read from external pattern files and parameterized on the basis of the information in the D-Box definitions. This is done with macros built into the pattern files. The Start expression of each box is not produced from patterns: because this code depends strongly on the protocol and channels used, it is generated directly by the translator.

The syntactic correctness of the structure built from the lexical elements is checked by a syntactic analyzer that is not purely LALR(1). The attached static semantic analyzer runs as a separate pass, as does code generation.
We do not perform code optimization: it is not important for the channel code, and the code of the boxes is generated in Clean, where optimization is done by the Clean compiler and its lazy evaluation system. The middleware services of the run-time system are extended by special services essential to running the project. ICE functions are called from the Clean code through interfaces. The source of these interfaces is also in the pattern files.

Proposition 1. The syntax and static semantics of D-Box language

I have created a coordination language in which a computation graph can be defined whose computing nodes run applications written in the Clean functional programming language. The coordination language also supports special language elements, such as restorable values of the *World type. Values of such types cannot be carried through channels; a replacement value can be generated on the host side.

The syntax was given in EBNF form and in a form that can be processed by lex and yacc. The static semantic check examined the channels of each box against the applied input protocol, and the output protocol against the output channels. It was checked whether the protocol can produce the input parameters of the applied expression, and whether it can handle the output values. It was also checked whether each input and output channel is used only once, and whether the dynamic sub-graph start-ups can be fulfilled at run time. Finally, it was checked whether the communication between the boxes is closed with respect to the sub-graph.

Proposition 2. Specification of D-Box language primitives

I have introduced the syntactic and static semantic rules of the coordination language, on the basis of which I have created a working syntax-checking program. A template collection based on the Windows operating system and the ICE middleware was also created and attached to the thesis as a DVD supplement. The names of the template files and the directory structure were put into an XML configuration file, which makes the code-generation phase easier to parameterize.

The specification of the coordination language discusses the operation of the protocols, the serialization of more complex structures (lists, lists of lists), and the steps of writing to a channel. We have also specified the reading of channels, the de-serialization of a symbol sequence arriving on a channel, and the assembly of the protocol results. In this way the operation of the input and output protocols was defined precisely.
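The serialization of nested lists into a symbol sequence, and its reconstruction on the reading side, can be illustrated with a minimal Python sketch. The framing symbols LIST_START and LIST_END and the function names are assumptions made for this illustration only; the actual D-Box wire format is defined in the thesis and is not reproduced here.

```python
from typing import Iterable, Iterator, List, Union

Value = Union[int, List["Value"]]

# Hypothetical framing symbols; the real D-Box encoding is not specified here.
LIST_START = "["
LIST_END = "]"


def serialize(value: Value) -> Iterator[object]:
    """Flatten an int or an arbitrarily nested list into a symbol sequence."""
    if isinstance(value, list):
        yield LIST_START
        for item in value:
            yield from serialize(item)
        yield LIST_END
    else:
        yield value


def deserialize(symbols: Iterable[object]) -> Value:
    """Rebuild one value from the symbol sequence read from a channel."""
    stream = iter(symbols)

    def read(symbol: object) -> Value:
        if symbol == LIST_START:
            items: List[Value] = []
            # Nested calls consume from the same iterator, so each list
            # stops at its own matching end symbol.
            for nxt in stream:
                if nxt == LIST_END:
                    return items
                items.append(read(nxt))
            raise ValueError("symbol sequence ended inside a list")
        if symbol == LIST_END:
            raise ValueError("unexpected end-of-list symbol")
        return symbol  # a plain element

    return read(next(stream))


data = [[1, 2], [], [3, [4]]]
wire = list(serialize(data))      # what would be written to the channel
restored = deserialize(wire)      # what the input protocol would rebuild
```

Because both functions work on iterators, the reading side can start rebuilding a list before the whole sequence has arrived, which matches the element-by-element channel handling described above.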
The operation of the protocols relies on the lazy evaluation of Clean rather than on parallel execution. Processing starts with the output protocol, which triggers the evaluation of the expression. The expression takes the incoming data from the input protocol lazily, which in turn causes the implicit reading of the input channels. The generated data are put on the channels by the output protocol.

The D-Box translator takes the definitions describing the computation graph either from a file or directly from the D-Clean translator. The latter is the result of integrating the two translators. After the D-Box definitions have been processed and their static semantics checked, source code is generated on the basis of the pattern files. To support code generation, macros were developed; their description is included in the appendix of the present thesis. The Clean Start function of each generated computation node (box) is created by the translator without patterns, since practically every line of it depends on the type and number of the channels used and on the protocol. We use the C++ linker, as we could not apply the Clean linker to the task. C++ object files are also linked with the code of the box.

Proposition 3. Formal semantics and implementation of the run-time system

The thesis describes the states of the run-time system and its operational semantics. During execution, starting from an initial state, the boxes forming the spine of the project are launched and the channel start commands are processed. In the meantime new boxes can be started dynamically and new channel start commands can enter the system. In the final state of the project every box has been started and every channel start command has been fulfilled.

The run-time system places the generated binary code into a code library service, from which the project can be started. The system first instantiates the boxes belonging to the initial sub-graph. Once started, the boxes can request the creation of channels and the instantiation of boxes belonging to further sub-graphs. A name service ensures that running components can find each other. The run-time system contains a scheduler, which decides which component is placed on which computer.

Proposition 4. Verification of the applicability of the D-Box system on a problem class

Translation, code generation and execution were checked by solving a real-life, computation-intensive problem. Measurements and charts of the distributed execution of the application show that the generated code is efficient for this problem class. The running times of the distributed case, measured on 4, 8 and 16 nodes with varying cutting points and operation complexity, approached the expected maximal speed-up.
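The demand-driven operation of a box described under Proposition 2 (the output protocol drives the expression, which pulls from the input protocol, which in turn reads the channels) can be approximated with Python generators. This is only a sketch: Python generators merely imitate the lazy evaluation of Clean, and the names read_channel, input_protocol and run_box, as well as the pairing behaviour of the input protocol, are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Iterator, List, Tuple

log: List[str] = []  # records the order of channel reads and writes


def read_channel(name: str, channel: Deque[int]) -> Iterator[int]:
    """Input side: hand out buffered values only when the consumer asks."""
    while channel:
        value = channel.popleft()
        log.append(f"read {name}: {value}")
        yield value


def input_protocol(a: Iterator[int], b: Iterator[int]) -> Iterator[Tuple[int, int]]:
    """Hypothetical input protocol: pair up the values of two channels."""
    return zip(a, b)


def run_box(expr: Callable[[Tuple[int, int]], int],
            inputs: Iterator[Tuple[int, int]],
            out_channel: Deque[int]) -> None:
    """Output protocol: drives evaluation by demanding one result at a time."""
    for result in map(expr, inputs):  # map and zip are lazy in Python 3
        log.append(f"write out: {result}")
        out_channel.append(result)


ch_a, ch_b, ch_out = deque([1, 2]), deque([10, 20]), deque()
run_box(lambda pair: pair[0] + pair[1],
        input_protocol(read_channel("a", ch_a), read_channel("b", ch_b)),
        ch_out)
```

After the run, the log shows each pair of reads immediately followed by the corresponding write, so no input is consumed before the output side demands a result. The user-supplied expression (here a lambda) contains no channel handling at all, mirroring the way the generated code hides communication from the Clean functions.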


List of publications

Refereed publications

1. Horváth Zoltán, Hernyák Zoltán, Zsók Viktória: Coordination Language for Distributed Clean, Acta Cybernetica (ISSN: 0324-721X), Vol. 17 (2), Institute of Informatics, University of Szeged, Szeged, Hungary, 2005, pp. 247-271. Selected publication of CSCS PhD Conference in Computer Science.

2. Horváth Zoltán, Hernyák Zoltán, Kozsik Tamás, Tejfel Máté, Ulbert Attila: A Data Intensive Application on a Cluster - Parallel Elementwise Processing, in: P. Kacsuk, D. Kranzlmüller, Zs. Nemeth, J. Volkert (Eds.): Distributed and Parallel Systems - Cluster and Grid Computing, Proc. 4th Austrian-Hungarian Workshop on Distributed and Parallel Systems, Kluwer Academic Publishers, The Kluwer International Series in Engineering and Computer Science, Vol. 706, pp. 46-53, Linz, Austria, September 29-October 2, 2002.

3. Zsók Viktória, Hernyák Zoltán, Horváth Zoltán: Designing Distributed Computational Skeletons in D-Clean and D-Box, in: Lecture Notes in Computer Science, Horváth Zoltán (ed.): Central European Functional Programming School (The First Central European Summer School, CEFP 2005, Budapest, Hungary, July 4-15, 2005), Revised Selected Lectures. ISSN 0302-9743, vol. 4164, 2006, pp. 229-265.

4. Zsók Viktória, Hernyák Zoltán, Horváth Zoltán: Distributed Pattern Design in D-Clean, Central European Functional Programming School, CEFP 2005, ELTE, Budapest, Hungary, July 4-15, 2005, Lecture Notes, 33 pages.

5. Zsók Viktória, Hernyák Zoltán, Horváth Zoltán: Improving the Distributed Elementwise Processing Implementation in D-Clean, in: Horváth Z., Kozma L., Zsók V. (eds): Proceedings of the 10th Symposium on Programming Languages and Software Tools (ISBN: 978-963-463-925-1), SPLST 2007, Dobogókő, Hungary, June 14-16, 2007, Eötvös University Press, 2007, pp. 256-264.

6. Zsók Viktória, Hernyák Zoltán, Horváth Zoltán: Distributed Pattern Design in D-Clean, in: Vene V., Meriste M. (eds): Proceedings of the Ninth Symposium on Programming Languages and Software Tools, ISBN: 9949-11-113-7, SPLST 2005, Tartu, Estonia, 13-14 August, 2005, Tartu University Press, 2005, pp. 220-234.

Publications in refereed proceedings

7. Zsók Viktória, Hernyák Zoltán, Horváth Zoltán: Distributed Computation on Cluster using D-Clean and D-Box. Extended abstract in: Davis, K., Quintino, T., Striegnitz, J. (eds): 5th Workshop on Parallel/High Performance Object-Oriented Scientific Computing, POOSC 06 at 20th European Conference on Object-Oriented Programming, ECOOP 2006, Nantes, France, 3rd July, 2006, 3 pages. Summary: Object-Oriented Technology, ECOOP 2006 Workshop Reader, ECOOP 2006 Workshops, Nantes, France, July 3-7, 2006, Final Reports, LNCS 4379, Springer Verlag, 2007, pp. 141-145.

8. Horváth Zoltán, Hernyák Zoltán, Zsók Viktória: Implementing Distributed Skeletons using D-Clean and D-Box, in: Butterfield, A. (ed): Proceedings of the 17th International Workshop on Implementation and Application of Functional Languages, IFL 2005, Dublin, Ireland, September 19-21, 2005, pp. 1-16.

9. Hernyák Zoltán, Horváth Zoltán, Zsók Viktória: Clean-CORBA Interface Supporting Pipeline Skeleton, in: Csőke Lajos (ed.): Proceedings of 6th International Conference on Applied Informatics, Eger, Hungary, January 27-31, 2004. Eger, Hungary, B.V.B. Press, Vol. I, pp. 191-200.

Publications in international conference proceedings

10. Zsók Viktória, Horváth Zoltán, Hernyák Zoltán: Distributed Elementwise Processing in D-Clean, in: Nilsson, H. (ed): Proceedings of the Seventh Symposium on Trends in Functional Programming, TFP 2006, Nottingham, UK, 19-21 April, 2006, The University of Nottingham, pp. 378-386.

11. Hernyák Zoltán, Horváth Zoltán, Zsók Viktória: Design of Language Elements for Dynamic Distributed Computation of Clean Expressions on Clusters, in: Loidl, H-W. (ed): Proceedings of Fifth Symposium on Trends in Functional Programming, TFP 2004, Munich, Germany, November 25-26, 2004, Ludwig-Maximilians University, pp. 257-270.

12. Hernyák Zoltán: PEDPI as a Message Passing Interface with OO support, in: Striegnitz, Jörg; Davis, Kei (Eds.) (2003) Proceedings of the Workshop on Parallel/High-Performance Object-Oriented Scientific Computing (POOSC 03), Interner Bericht FZJ-ZAM-IB-2003-09, Juli 2003, pp. 93-100.