The scalability impact of a component-based software engineering framework on a growing SAMR toolkit: a case study



Similar documents
Software Architecture Issues in Scientific Component Development

A first evaluation of dynamic configuration of load-balancers for AMR simulations of flows

A Case Study - Scaling Legacy Code on Next Generation Platforms

Component Based Software Engineering: A Broad Based Model is Needed

Definition of a Software Component and Its Elements

Design with Reuse. Building software from reusable components. Ian Sommerville 2000 Software Engineering, 6th edition. Chapter 14 Slide 1

Patterns in. Lecture 2 GoF Design Patterns Creational. Sharif University of Technology. Department of Computer Engineering

Systems Integration: Co C mp m onent- t bas a e s d s o s ftw ft a w r a e r e ngin i eeri r n i g

The Benefits of Modular Programming

Mesh Generation and Load Balancing

Simulation Platform Overview

Developing the Architectural Framework for SOA Adoption

Basic Trends of Modern Software Development

Load Balancing Strategies for Parallel SAMR Algorithms

Parallel I/O on JUQUEEN

Software Development around a Millisecond

Software Engineering. Software Reuse. Based on Software Engineering, 7 th Edition by Ian Sommerville

Software Engineering. Software Engineering. Component-Based. Based on Software Engineering, 7 th Edition by Ian Sommerville

Effective Java Programming. efficient software development

Sourcery Overview & Virtual Machine Installation

Design of Scalable, Parallel-Computing Software Development Tool

1.1 The Nature of Software... Object-Oriented Software Engineering Practical Software Development using UML and Java. The Nature of Software...

Information integration platform for CIMS. Chan, FTS; Zhang, J; Lau, HCW; Ning, A

ME6130 An introduction to CFD 1-1

A Management Tool for Component-Based Real-Time Supervision and Control Systems

COBOL2EJB : A Tool Generating Wrapping Component for CICS COBOL System

Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer

Developing SOA solutions using IBM SOA Foundation

Business Modeling with UML

IDC Reengineering Phase 2 & 3 US Industry Standard Cost Estimate Summary

TESSY Automated dynamic module/unit and. CTE Classification Tree Editor. integration testing of embedded applications. for test case specifications

Express Introductory Training in ANSYS Fluent Lecture 1 Introduction to the CFD Methodology

Numerical Calculation of Laminar Flame Propagation with Parallelism Assignment ZERO, CS 267, UC Berkeley, Spring 2015

Tool Support for Inspecting the Code Quality of HPC Applications

Component Based Development in Software Engineering

Fibre Channel Overview of the Technology. Early History and Fibre Channel Standards Development

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

Uintah Framework. Justin Luitjens, Qingyu Meng, John Schmidt, Martin Berzins, Todd Harman, Chuch Wight, Steven Parker, et al

Ergon Workflow Tool White Paper

The Design and Implement of Ultra-scale Data Parallel. In-situ Visualization System

Excerpts from Chapter 4, Architectural Modeling -- UML for Mere Mortals by Eric J. Naiburg and Robert A. Maksimchuk

Migrating Legacy Software Systems to CORBA based Distributed Environments through an Automatic Wrapper Generation Technique

David Pilling Director of Applications and Development

CBM SOMA - SCA. Techniques and Standards to Increase Business and IT Flexibility. Jouko Poutanen Senior IT Architect, IBM Software Group

Service Virtualization: Managing Change in a Service-Oriented Architecture

Arcane/ArcGeoSim, a software framework for geosciences simulation

AP Computer Science AB Syllabus 1

D5.6 Prototype demonstration of performance monitoring tools on a system with multiple ARM boards Version 1.0

So#ware Tools and Techniques for HPC, Clouds, and Server- Class SoCs Ron Brightwell

November 3-4, The BioMA platform and applications. European Project n Workshop November 3 rd Marcello Donatelli (CREA)

SOACertifiedProfessional.Braindumps.S90-03A.v by.JANET.100q. Exam Code: S90-03A. Exam Name: SOA Design & Architecture

Modern Software Development Tools on OpenVMS

Modular Safety Cases

Increasing Development Knowledge with EPFC

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC Denver

UNIFACE Component-based. Development Methodology UNIFACE V Revision 0 Dec 2000 UMET

A New Unstructured Variable-Resolution Finite Element Ice Sheet Stress-Velocity Solver within the MPAS/Trilinos FELIX Dycore of PISCEES

In-Database Analytics

APPROACHES TO SOFTWARE TESTING PROGRAM VERIFICATION AND VALIDATION

Part I Courses Syllabus

SDR Architecture. Introduction. Figure 1.1 SDR Forum High Level Functional Model. Contributed by Lee Pucker, Spectrum Signal Processing

Multicore Parallel Computing with OpenMP

OpenFOAM Workshop. Yağmur Gülkanat Res.Assist.

Decomposition into Parts. Software Engineering, Lecture 4. Data and Function Cohesion. Allocation of Functions and Data. Component Interfaces

Bachelor of Games and Virtual Worlds (Programming) Subject and Course Summaries

Testing of Digital System-on- Chip (SoC)

HPC & Visualization. Visualization and High-Performance Computing

Building Test-Sites with Simware

ParFUM: A Parallel Framework for Unstructured Meshes. Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008

Satisfying business needs while maintaining the

2 (18) - SOFTWARE ARCHITECTURE Service Oriented Architecture - Sven Arne Andreasson - Computer Science and Engineering.

Getting Things Done: Practical Web/e-Commerce Application Stress Testing

Component-based Development Process and Component Lifecycle Ivica Crnkovic 1, Stig Larsson 2, Michel Chaudron 3

Web Service Implementation Methodology

Fully Automatic Hex Dominant Mesher. Paul Gilfrin Sharc Ltd

Introduction to Automated Testing

COMPARISON OF OBJECT-ORIENTED AND PROCEDURE-BASED COMPUTER LANGUAGES: CASE STUDY OF C++ PROGRAMMING

Service Virtualization:

HPC Deployment of OpenFOAM in an Industrial Setting

Agent Languages. Overview. Requirements. Java. Tcl/Tk. Telescript. Evaluation. Artificial Intelligence Intelligent Agents

Automation can dramatically increase product quality, leading to lower field service, product support and

System Software Integration: An Expansive View. Overview

Transcription:

The scalability impact of a component-based software engineering framework on a growing SAMR toolkit: a case study Benjamin A. Allan and Jaideep Ray We examine a growing toolkit for parallel SAMR-based simulations and the impact of componentbased software engineering on the scalability of the development process as well as the scalability of the run-time performance. The CBSE approach enables a wide range of contributions to the toolkit in a short time. We also examine the question of publications: Does writing good software hurt productivity? 1. Introduction In this case we set out to build a reacting flow modeling toolkit [] based on structured adaptive mesh refinement (SAMR) following the Common Component Architecture (CCA) [1]. This is a work in progress. Thus we are performing a test of the CCA concepts as well as generating a simulation code. Most of the costs and issues involved in building scientific research codes are rarely discussed in the science literature that presents the results of the simulations. As the complexities of studied phenomena are exploding, however, we must start examining the underlying problems in building codes complex enough to simulate these phenomena. 1.1. The research software problems Creating a CFD toolkit as part of a research project presents many practical issues: The limited time available requires the use of externally controlled legacy libraries, of mixed language programming (since rewriting all the legacy code in a more favored language is not an option), and of a team of developers of mixed abilities many of whom may be only transiently involved in the project. The lead developer (usually also the software architect and a hero programmer) must be able to rigorously enforce subsystem boundaries and to control boundary changes. The team is usually required to continue publishing while coding, if they wish to obtain funding for their next projects. In a high-performance computing (HPC) code, coping with shifting platforms (either hardware or software) is required. For better scientific validation, swapping out developed package elements for alternative code (possibly developed by competing research groups) is often required. Over the course of any project, requirements change and various package elements will have to be recoded. Less frequently, but usually more expensively, an interface between elements may need to be extended or redefined. Sandia National Laboratories, MS9, Livermore, CA 9-3 baallan,jairay@ca.sandia.gov 1

B. Allan & J. Ray 1.. The CCA approach to HPC CBSE Our development process and software products follow the Common Component Architecture (CCA) [] methods. The CCA defines, among other things, a set of generic rules for composing a single-program-multiple data (SPMD) parallel application from many software components. Each component is a black-box object that provides public interfaces (called ports) and uses ports provided by other components. In object-oriented terms, a port is merely the collection of methods into an interface class. The CCA leaves the definition of SAMR-specific ports to us. A component is never directly linked against any other. Rather, at run-time each component instance informs the framework of which ports it provides and which it wishes to use. At the user s direction, the framework exchanges the ports among components. A component sees other components only through the ports it uses. Implementation details are always fully hidden. All needed components are combined into a single SPMD code.. Software development scalability In this section we present summary information about the software produced to date. The task flow of the project, broken down by products and developers, shows that the CCA approach enables diverse contributors to produce all the required products as a team. The raw data, in the form of a Gantt chart, is too large to present here, but appears in [3]. Over the four year course of the project eight developers have participated, with the average head-count at any moment being four..1. How the CCA model manifests as a SAMR framework The component design of the CFRFS project [] allows encapsulation and reuse of large legacy libraries written in C and FORTRAN. Of the 1 components produced in the project so far, 13 require some aspect of parallel computing, while the rest handle tasks local to a single compute node. As with most research projects, the lead developer is critical: in the work-flow chart we see that 3 of the components were created by developer A, 13 by developer B, by developer C, and the remaining components by others. Port interfaces get created to handle new functionality; the present count is 31. Four of the eight developers contributed port definitions. In defining ports, the lead developer often plays a consulting role, but the details of a port encapsulating new functionality are usually dictated by an expert on the underlying legacy libraries rather than by the lead developer. As we see in figure, the typical component source code is less than lines of C++ (excluding any wrapped legacy code); the exceptions being the GrACE [7] library wrapping component at almost lines and the adapter for Chombo [] inter-operation at almost 7 lines. The average component is therefore reasonably sized to be maintainable. The best measure of the legacy codes encapsulated by the components may be obtained by examining the size of the resulting component libraries in figure 1. The Chombo, HDF, and thermochemical properties libraries are the three giants at the end of the histogram; most components compile to under kilobytes. An often proposed metric of quality for an object-oriented code is the number of lines per function. Figure 3 shows that the distribution of lines per port function varies widely across components, being particularly high in those components which are complicated application drivers or wrappers which hide complexity in an underlying legacy library. A question often asked by people considering adopting the component approach is how

SAMR toolkit scalability through CBSE 3 1 1 3 1 3 7 1 17 > Binary size in kb 3 No. of lines per component Figure 1. Histogram of component binary library sizes. Figure. Histogram of component source code sizes. complex should my ports be?, by which they mean how many functions are in one port. The correct answer is it depends on what the port does, but in figure we see that most ports are small with fewer than functions. The largest port, with 39 functions is the AMRPort which encapsulates the bulk of the GrACE functionality. CCA component technology is often described as providing plug-and-play functionality such that alternative implementations can be plugged in for testing and optimization. We test this claim in the CFRFS context by examining how many different components implement each kind of port, in figure. In this combustion work, the most often implemented port turns out to be the diffusion functionality, with components providing alternate models. In CFD modeling, there is often a central mesh concept that is reused in nearly all code modules. This is apparent in our case from figure, where the most frequently used port is the AMRPort with 9 clients. The next most frequently used port is the properties port which allows a driver component to tune any component that exports the properties port... Productivity We have seen the impact of the CCA on the CFRFS software development, but what of the project intellectual productivity? One of the principal goals of the project is to produce a reusable software toolkit, of course, but what of other metrics of career success? Software developers in the course of the work have produced: six scientifically interesting in-house test applications, 11 conference papers, a book chapter, and 3 journal papers all related directly to this work. A SAMR framework interoperability standard has also been drafted and tested with the Chombo framework. Rather than distracting the development team from the intellectual issues the software is designed to address, the component structure of the SAMR framework enables faster testing of new ideas because any needed changes are usually isolated to one or two components.

B. Allan & J. Ray 3 1 1 1 1 1 No. of lines per function 3 3 No. of functions per port Figure 3. Histogram of function size. Figure. Histogram of port size in number of functions. Specific port 3 Specific port No. of implementations of a given port No. of using components 3 3 3 3 3 Figure. Number of components implementing each port type, sorted by increasing frequency. Figure. Number of components using each port type, sorted by increasing frequency. 3. Conclusions In this abstract we have characterized the software structure and processes obtained from the application of the CCA high-performance component methodology to the structured AMR modeling of reacting flows. We have seen that the component methodology enables efficient development of both software and intellectual contributions.. Acknowledgments This work was supported by the United States Department of Energy, in part by the Office of Advanced Scientific Computing Research: Mathematical, Information and Computational Sciences through the Center for Component Technology for Terascale Simulation Software and

SAMR toolkit scalability through CBSE in part by the Office of Basic Energy Sciences: SciDAC Computational Chemistry Program. This is an account of work performed under the auspices of Sandia Corporation and its Contract No. DE-AC-9AL with the United States Department of Energy. REFERENCES 1. CCA Forum homepage. http://www.cca-forum.org/,.. P. Colella et al. Chombo Infrastructure for Adaptive Mesh Refinement. http: //seesar.lbl.gov/anag/chombo. 3. PCFD Gantt chart of CFRFS software development processes. http:// cca-forum.org/ baallan/pcfd-snl/workflow,.. Sophia Lefantzi and Jaideep Ray. A component-based scientific toolkit for reacting flows. In Proceedings of the Second MIT Conference on Computational Fluid and Solid Mechanics, volume, pages 11 1, Boston, Mass., 3. Elsevier Science.. Lois Curfman McInnes, Benjamin A. Allan, Robert Armstrong, Steven J. Benson, David E. Bernholdt, Tamara L. Dahlgren, Lori Freitag Diachin, Manojkumar Krishnan, James A. Kohl, J. Walter Larson, Sophia Lefantzi, Jarek Nieplocha, Boyana Norris, Steven G. Parker, Jaideep Ray, and Shujia Zhou. Parallel PDE-based simulations using the Common Component Architecture. In Are Magnus Bruaset, Petter Bjørstad, and Aslak Tveito, editors, Numerical Solution of PDEs on Parallel Computers. Springer-Verlag,. invited chapter, submitted.. Habib N. Najm et al. CFRFS homepage. http://cfrfs.ca.sandia.gov/, 3. 7. M. Parashar et al. GrACE homepage. http://www.caip.rutgers.edu/ parashar/tassl/projects/grace/,.