Enhancing the performance of scientific workflow execution in e-science environments by harnessing the standards based Parameter Sweep Model

23 July 2013
Shahbaz Memon, Sonja Holl, Bernd Schuller, Andrew Grimshaw

Outline
- Introduction
- Taverna Workflow System
- Motivation: Advanced Workflow Optimization
- JSDL-Parameter Sweep
- Parameter Sweep Implementation
- Conclusions

Introduction
- Scientific workflows are used extensively to define and generate computer simulations.
- The same application often needs to be rerun with different parameters, e.g. when verifying and validating simulation models.
- The parameter sweep concept can represent an abstract job with iterative parameters.
- This talk presents an improved workflow optimization use case built on the parameter sweep concept, with grid middleware used for job submission and management.

Taverna Workflow System
- Open source, written in Java
- Graphical interface: Workbench
- Execution: Taverna Workflow Engine
- Data-flow driven
- Activities represent Web services, Java classes, and local scripts
- A plugin was developed in 2010

Integrated Architecture
[Architecture diagram. Labels, in order: Taverna (Service Panel, Workflow Diagram, Credential Manager, Workflow Engine); Client; Taverna Plugin (Service Activity UI, Service Activity Components); Gateway; Server Tier (Service Orchestrator, Grid Resource Information Service, Workflow Engine, Tracing Service); Target System Tier (core services and Target System Interface on JUDGE (Torque) and on JUROPA (MOAB/Torque)).]

Grid Middleware
- Client and server software
- Seamless and secure access to HPC resources
- Based on open standards: X.509, SAML, XACML, WS-*
- Compliant with Service Oriented Architecture (SOA) and the Web Services Resource Framework (WS-RF)
- Supports Grid standards (JSDL, OGSA-BES)
[Diagram: Clients; Gateway; Server Tier (Service Orchestrator, Workflow Engine, Grid Resource Information Service, Tracing Service); Target System Tier (core services and Target System Interface at Site 1, JUDGE (Torque/Maui), and at Site 2, JUROPA (Torque/Moab)).]

Use Case: Advanced Workflow Optimization
- The choice of parameters has a tremendous impact on the scientific experiments performed via workflows.
- Choose the best parameters instead of the default ones: default parameters lead to weak scientific results.
- Proof of concept: a Taverna plugin has been implemented to optimize workflow parameters by applying Genetic Algorithms.
- Sub-workflows are considered for the optimization.
- Several instances are executed in parallel, each instance with one parameter sample.
- The resulting values are taken as fitness.
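The optimization loop described on this slide can be sketched as follows. This is a minimal illustration, not the plugin's implementation: `run_subworkflow` is a hypothetical stand-in for one sub-workflow execution, and the selection, crossover, and mutation choices are assumptions made only to keep the example runnable.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def run_subworkflow(sample):
    """Hypothetical sub-workflow run: one parameter sample in, one fitness value out."""
    x, y = sample
    # Toy objective (assumption): fitness peaks at 0 when x == 3 and y == -1.
    return -((x - 3.0) ** 2 + (y + 1.0) ** 2)

def evaluate(population, workers=4):
    """Run several instances in parallel, each with one parameter sample."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_subworkflow, population))

def optimize(generations=30, size=20, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(size)]
    best = None
    for _ in range(generations):
        fitness = evaluate(population)
        ranked = sorted(zip(fitness, population), reverse=True)
        if best is None or ranked[0][0] > best[0]:
            best = ranked[0]
        # Keep the fitter half as parents (simple truncation selection).
        parents = [sample for _, sample in ranked[: size // 2]]
        children = []
        while len(children) < size - len(parents):
            a, b = rng.sample(parents, 2)
            # Crossover: first gene from a, second from b; then Gaussian mutation.
            children.append((a[0] + rng.gauss(0, 0.5), b[1] + rng.gauss(0, 0.5)))
        population = parents + children
    return best

best_fitness, best_params = optimize()
print(best_fitness, best_params)
```

Each generation issues one batch of independent runs, which is the pattern the parameter sweep submission later in the talk targets.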

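Job Generation and Submission
[Diagram, steps a) to e): in Taverna, parameter samples a1 ... an and b1 ... bn each produce a separate JSDL instance (App, Env, Data); the JSDL instances are sent to the server (/X, Service Orchestrator, XUUDB...) and on to the target system.]
EGI Community Forum 2013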

Limitations
- The client manually submits N job requests for the same application, differing only in the application parameters.
- This adds overhead, because the client is responsible for managing each sweep manually:
- N remote call-outs for job status
- Data stage-in for each job
- Network resources are wasted.
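To make the overhead concrete, here is a small counting sketch. The per-job poll count and the assumption that a sweep needs exactly one submission, one stage-in, and status polling of a single parent job are illustrative assumptions, not measurements from the talk.

```python
def remote_calls(n_jobs, polls_per_job, sweep):
    """Count client-to-server calls for one batch of n_jobs parameter samples."""
    if sweep:
        # Assumption: one submission, one stage-in, and polling of one parent job.
        return {"submit": 1, "stage_in": 1, "status": polls_per_job}
    # Conventional case: every job is submitted, staged, and polled separately.
    return {
        "submit": n_jobs,
        "stage_in": n_jobs,
        "status": n_jobs * polls_per_job,
    }

conventional = remote_calls(n_jobs=20, polls_per_job=5, sweep=False)
parametric = remote_calls(n_jobs=20, polls_per_job=5, sweep=True)
print(sum(conventional.values()), sum(parametric.values()))  # prints: 140 7
```

Under these assumptions the number of client calls no longer grows with N once the sweep is resolved on the server side.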

Parametric Job Generation and Submission
[Diagram, steps a) to e): in Taverna, parameter samples a1 ... an and b1 ... bn pass through a Sweep Generator, which produces a single JSDL-PS instance (App, Env, Data); the server (/X, Service Orchestrator, XUUDB...) turns it into job instances on the target system.]
XSEDE13

JSDL Parameter Sweep Extensions
http://forge.gridforum.org/sf/projects/jsdl-wg
1. An extension of the Job Submission Description Language (JSDL) specification [GFD.136].
2. A common requirement is to select a job and submit it 10, 50, or 300 times, each time making some modification to the original (master) JSDL, e.g. arguments, parameters, output directory, or input file.
3. The JSDL + PS extensions let you group the master JSDL with the required modifications (the JSDL fields that require sweeping):
- This saves writing multiple separate JSDL documents.
- The swept value can be any value within the JSDL document itself.
- It can also be any value within a named file that the JSDL references (e.g. an input file).
- A sweep actually yields multiple separate jobs (rather than solely parameter sweeps).
Slide courtesy: Geoff Williams et al.

JSDL Sweep Overview
1. Nest <Sweep> elements within a JSDL document.
2. The <Assignment> identifies which set of <Parameters> should be swept (iterated) using the given sweep <Function>.
3. <Parameter> and <Function> are abstract (different implementations can be defined as required).
4. Parameters: DocumentNode, FileSweep
5. Functions: Values, LoopInteger, LoopDouble
[Diagram: JSDL + PS extension; select and give new values for the <App> element of the JSDL.]
Slide courtesy: Geoff Williams et al.
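How nested sweeps and functions turn into a list of concrete jobs can be sketched in a few lines. The dictionary layout and the helper names (`values`, `loop_integer`, `loop_double`, `expand`) are hypothetical; the sketch also assumes that loop ends are inclusive, that assignments inside one sweep advance in lockstep, and that nested sweeps multiply out. Check these readings against the specification before relying on them.

```python
def values(*items):
    """Explicit list of values (models the Values function)."""
    return list(items)

def loop_integer(start, end, step=1):
    """Integer loop; the inclusive end is an assumption of this sketch."""
    return list(range(start, end + 1, step))

def loop_double(start, end, step):
    """Floating-point loop with an inclusive end (assumption)."""
    count = int((end - start) / step + 1e-9) + 1
    return [round(start + i * step, 10) for i in range(count)]

def expand(sweep):
    """Return one {target: value} dict per generated job."""
    assignments = sweep["assignments"]
    if len({len(vals) for _, vals in assignments}) != 1:
        raise ValueError("assignments in one sweep must yield equally many values")
    targets = [target for target, _ in assignments]
    # Assignments of the same sweep are iterated together (lockstep).
    own = [dict(zip(targets, combo)) for combo in zip(*[v for _, v in assignments])]
    children = sweep.get("children", [])
    if not children:
        return own
    jobs = []
    for base in own:
        for child in children:
            # Each nested sweep is fully iterated for every outer value.
            for nested in expand(child):
                jobs.append({**base, **nested})
    return jobs

sweep = {
    "assignments": [("Argument[1]", values("a", "b"))],
    "children": [{"assignments": [("Argument[2]", loop_integer(1, 3))]}],
}
jobs = expand(sweep)
print(len(jobs))  # prints: 6
```

Two outer values combined with three inner values give six separate jobs, which matches the slide's point that a sweep yields multiple independent jobs.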

JSDL-PS Example
[Example listing not preserved in this transcript; surviving fragment: PRM 157 186]
XSEDE13
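The slide's own listing did not survive transcription, so the following is an independent illustration rather than a reconstruction. It builds a small JSDL document with one sweep over an argument, using Python's standard `xml.etree.ElementTree`. The namespace URIs, the element layout, and the XPath in `Match` are written from memory of the JSDL and parameter sweep schemas and should be treated as assumptions to verify; namespace bindings for the XPath prefix are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace URIs below are assumptions; verify them against the published schemas.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
SWEEP = "http://schemas.ogf.org/jsdl/2009/03/sweep"
FUNC = "http://schemas.ogf.org/jsdl/2009/03/sweep/functions"

for prefix, uri in [("jsdl", JSDL), ("jsdl-posix", POSIX),
                    ("sweep", SWEEP), ("sweepfunc", FUNC)]:
    ET.register_namespace(prefix, uri)

def q(namespace, tag):
    """Build a namespace-qualified tag name in ElementTree notation."""
    return f"{{{namespace}}}{tag}"

def build_sweep_job(executable, argument, sweep_values):
    """Master job description plus one sweep over its first argument."""
    job = ET.Element(q(JSDL, "JobDefinition"))
    description = ET.SubElement(job, q(JSDL, "JobDescription"))
    application = ET.SubElement(description, q(JSDL, "Application"))
    posix = ET.SubElement(application, q(POSIX, "POSIXApplication"))
    ET.SubElement(posix, q(POSIX, "Executable")).text = executable
    ET.SubElement(posix, q(POSIX, "Argument")).text = argument
    # The sweep names the node to modify and the values to assign to it.
    sweep = ET.SubElement(job, q(SWEEP, "Sweep"))
    assignment = ET.SubElement(sweep, q(SWEEP, "Assignment"))
    node = ET.SubElement(assignment, q(SWEEP, "DocumentNode"))
    ET.SubElement(node, q(SWEEP, "Match")).text = "//jsdl-posix:Argument[1]"
    value_list = ET.SubElement(assignment, q(FUNC, "Values"))
    for value in sweep_values:
        ET.SubElement(value_list, q(FUNC, "Value")).text = str(value)
    return job

document = build_sweep_job("/bin/echo", "placeholder", [1, 2, 3])
xml_text = ET.tostring(document, encoding="unicode")
print(xml_text)
```

A single document of this shape replaces three near-identical job descriptions that differ only in the swept argument.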

Parameter Sweep Implementation
- The job execution service consumes the JSDL-PS request and passes it to the XNJS.
- The XNJS parses the JSDL and forwards the request to the Target System Interface (TSI).
- The TSI translates the incoming XNJS message into batch-system-specific commands.
[Diagram: Clients; Gateway; Server Tier (Service Orchestrator, Workflow Engine, Grid Resource Information Service, Tracing Service); Target System Tier (core services, XNJS, and Target System Interface at Site 1, JUDGE (Torque/Maui), and at Site 2, JUROPA (Torque/Moab)).]

XNJS Execution Management System
- Separated from the core (Web) services tier
- Manages the life cycle of atomic jobs (stage-in, execute, stage-out)
- Performs JSDL parsing
- Validates against resource and job requirements
- Manages user access to jobs: submit, start, stop
- Supports file spaces (Job Working Directory, HOME, ROOT)
[Diagram: core services; XNJS; Target System Interface; local RMS (e.g. Torque, LL, LSF, etc.)]

XNJS Sweep Job Handling
[Flow diagram: a job submission (JSDL-PS) arrives from the core services at the XNJS JSDLProcessor. Decision "Contains Sweep": if yes, the SweepProcessor uses the sweep library to resolve the JSDL-PS instances and submits the generated JSDLs; if no, simple JSDL processing leads to TSI submission. The Target System Interface passes the job script to the local RMS (e.g. Torque, LL, LSF, etc.) via qsub.]
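The branching in this flow can be sketched as below. The function names, the dictionary-based job format, and `fake_tsi` are hypothetical and exist only for illustration; they are not XNJS classes or APIs.

```python
import copy

def contains_sweep(job):
    """Decision step: does the incoming description carry a sweep?"""
    return bool(job.get("sweep"))

def resolve_instances(job):
    """Expand a sweep job into simple jobs, one per swept value."""
    sweep = job["sweep"]
    instances = []
    for value in sweep["values"]:
        instance = copy.deepcopy(job)
        del instance["sweep"]  # the generated job is a plain description
        instance["arguments"][sweep["argument_index"]] = str(value)
        instances.append(instance)
    return instances

def handle_submission(job, submit_to_tsi):
    """Route sweep jobs through expansion, plain jobs straight to submission."""
    if contains_sweep(job):
        return [submit_to_tsi(simple) for simple in resolve_instances(job)]
    return [submit_to_tsi(job)]

submitted = []

def fake_tsi(job):
    """Stand-in for the target system submission; returns a job id."""
    submitted.append(job["arguments"])
    return len(submitted)

job = {
    "executable": "/bin/echo",
    "arguments": ["-n", "PLACEHOLDER"],
    "sweep": {"argument_index": 1, "values": [1, 2, 3]},
}
ids = handle_submission(job, fake_tsi)
print(ids, submitted)
```

The client sends one description, and the expansion into three submissions happens entirely on the server side.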

XNJS Job Working Directory Structure
[Diagram: parent job with its working directory (WD); child jobs 1 ... N with their own working directories (Job1-WD, Job2-WD); XNJS; LRMS.]
- The parent job working directory is used for file stage-in.
- Files are staged in once, to the parent job WD.
- Soft links to each staged-in file are created in the child job working directories.
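The stage-once, link-many layout can be sketched with the standard library. Placing child directories inside the parent directory and the `child-N` naming are assumptions of this sketch; the sketch also assumes a POSIX file system where creating symbolic links needs no special privileges.

```python
import os
import tempfile

def prepare_child_dirs(parent_wd, n_children):
    """Create child working directories holding soft links to staged-in files."""
    staged = [
        name for name in os.listdir(parent_wd)
        if os.path.isfile(os.path.join(parent_wd, name))
    ]
    child_dirs = []
    for index in range(1, n_children + 1):
        child = os.path.join(parent_wd, f"child-{index}")  # hypothetical naming
        os.makedirs(child, exist_ok=True)
        for name in staged:
            # Link instead of copy: the data exists once, in the parent WD.
            os.symlink(os.path.join(parent_wd, name), os.path.join(child, name))
        child_dirs.append(child)
    return child_dirs

with tempfile.TemporaryDirectory() as parent:
    with open(os.path.join(parent, "input.dat"), "w") as handle:
        handle.write("shared input\n")
    children = prepare_child_dirs(parent, 3)
    links_ok = all(os.path.islink(os.path.join(c, "input.dat")) for c in children)
    with open(os.path.join(children[0], "input.dat")) as handle:
        content = handle.read()

print(len(children), links_ok)
```

Because every child sees the same file through a link, a large input is transferred and stored once regardless of the number of child jobs.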

Client Performance Impact
[Charts: Conventional Submission vs. Sweep Submission]

Conventional vs Sweep

                          File upload       File upload       Average CPU load    Max. CPU load
                          (small example)   (large example)   on the client (%)   on the client (%)
Conventional Submission   400 kB            700 MB            12.89               360
Sweep Generator           20 kB             35 MB             7.79                244
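The figures on this slide can be turned into relative savings with a short calculation. The assignment of the first four values to conventional submission and the last four to the sweep generator follows the order in which they appear on the slide and is an assumption of this sketch.

```python
# Values as listed on the slide (row assignment is an assumption).
conventional = {"upload_small_kb": 400, "upload_large_mb": 700,
                "avg_cpu_pct": 12.89, "max_cpu_pct": 360}
sweep = {"upload_small_kb": 20, "upload_large_mb": 35,
         "avg_cpu_pct": 7.79, "max_cpu_pct": 244}

def factor(before, after):
    """How many times smaller the sweep value is."""
    return before / after

def reduction_pct(before, after):
    """Relative reduction in percent."""
    return (before - after) / before * 100

small_factor = factor(conventional["upload_small_kb"], sweep["upload_small_kb"])
large_factor = factor(conventional["upload_large_mb"], sweep["upload_large_mb"])
avg_cpu_saving = reduction_pct(conventional["avg_cpu_pct"], sweep["avg_cpu_pct"])
max_cpu_saving = reduction_pct(conventional["max_cpu_pct"], sweep["max_cpu_pct"])

print(f"upload: {small_factor:.0f}x and {large_factor:.0f}x smaller")
print(f"average CPU load: {avg_cpu_saving:.1f}% lower")
print(f"maximum CPU load: {max_cpu_saving:.1f}% lower")
```

Both upload sizes shrink by the same factor of 20, which is consistent with the input data being uploaded once instead of once per job.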

Conclusions
- A parametric job run is optimized by submitting one remote job in lieu of hundreds or thousands.
- The JSDL-PS implementation will benefit a wide range of applications that rely on workflow optimization.
- The Taverna plugin has been extended to use the server-side parameter sweep capabilities.

Future Directions
- More intuitive file stage-in
- Combining standard output and standard error
- Meaningful job status reports

Questions?