A Workflow Service for Biomedical Applications


Emanuela Merelli (Università di Camerino, Italy), Paolo Romano (National Cancer Research Institute, Italy), Lorenzo Scortichini (Università di Camerino, Italy)
2004 Bioinformatics Italian Society Meeting, Padova, 27 March 2004

Outline
- The Workflow in the BioScience Domain
- The Workflow-based Activity Coordination: Workflow Service
- The supporting technology: agent-based middleware supporting the Workflow Service
- Future Activities and Conclusions

Workflows in the BioScience Domain
Definition: "The computerised facilitation or automation of a business process, in whole or part." (Workflow Management Coalition, Reference Model)
Goals:
- to design and implement a data analysis process (standardized protocols; S. Hoon et al. 03)
- to simulate a high-level biological process (Peleg et al. 03, Amici et al. 04)
Advantages for data analysis. A workflow makes it possible:
- to reproduce the analysis
- to reuse intermediate results
- to create a transparent analysis environment
- to support good practice
- to free the bioscientist from repetitive interaction with the web
- to verify structural, functional and dynamic process requirements

Hypothetical Scenario for the O2I project
The Oncology over Internet (O2I) project aims to develop a framework to support searching, retrieving and filtering information from the Internet for oncology research and clinics.
A possible scenario involving the use of biological resources:
- Biological resources (micro-organisms, cell lines, ...) are essential for implementing a good, reproducible experiment.
- High-quality biological resources are available at some specialized centers (Biological Resources Centers: ATCC, DSMZ, ...), and the related catalogues are available on-line.
- Molecular Biology (MB) databases, e.g. sequence databases, often include information (strain numbers, accession numbers) on the original resources.
- Researchers accessing MB databases often need extended information about the resources before they can request materials.

A simple workflow example
Use context: to verify a mutation experiment by reproducing it.
Goal: retrieve abstracts from a literature database in order to identify the best cell line for reproducing a human TP53 mutation experiment linked to a particular tumour-habits-sex combination.
Activities: use Bioinformatics services available on the Internet to achieve the desired result.
1. 1st activity: retrieve all mutations (IDs) observed in the 7th exon in men who are ex-smokers and drinkers, by searching the p53 mutations database (SRS implementation at IST, Genova).
2. 2nd activity: retrieve all mutations (IDs) observed by using the B9 cell line as original resource, by searching the p53 mutations database (SRS implementation at IST, Genova).
3. 3rd activity: retrieve all abstracts of the bibliographic references correlated with a specific mutation ID, by searching Medline.
Achievement: to integrate on-line Bioinformatics data into a single result, freeing the bioscientist from the need to interact personally with remote sites.

An example of workflow at user level
[Figure: workflow diagram. Input parameters: linea_cellulare = B9 (cell line), intron_exon = 7-exon, sex = M, fumo = ex-fumatore (smoking: ex-smoker), alcool = bevitore (alcohol: drinker).
1st activity, B1: find all mutations observed in the 7th exon in men who are ex-smokers and drinkers; output Info_Mutazioni.
2nd activity, B4: retrieve all mutations (IDs) observed by using the B9 cell line; output Nome_Mutazioni.
3rd activity, merge: combine Info_Mutazioni and Nome_Mutazioni into Intersezione_Mutazioni (the intersection of the two mutation sets).
4th activity, B5: under the guard [Intersezione_Mutazioni > 0], retrieve all abstracts of the bibliographic references correlated with a specific mutation ID; output Abstract. Under the guard [Intersezione_Mutazioni = 0], no abstracts are retrieved.]
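The control flow of the user-level workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functions `b1`, `b4` and `b5` are hypothetical local stand-ins for the remote services (the p53 mutations database and Medline), the records in `MUTATIONS` and `ABSTRACTS` are invented placeholders rather than real database content, and the merge step is assumed to be a set intersection, as the name Intersezione_Mutazioni suggests.

```python
from typing import Dict, List, Set

# Invented placeholder records; they stand in for the remote databases.
MUTATIONS = [
    {"id": "mut-001", "intron_exon": "7-exon", "sex": "M",
     "fumo": "ex-fumatore", "alcool": "bevitore", "linea_cellulare": "B9"},
    {"id": "mut-002", "intron_exon": "7-exon", "sex": "M",
     "fumo": "ex-fumatore", "alcool": "bevitore", "linea_cellulare": "other"},
    {"id": "mut-003", "intron_exon": "5-exon", "sex": "F",
     "fumo": "fumatore", "alcool": "bevitore", "linea_cellulare": "B9"},
]
ABSTRACTS = {"mut-001": ["placeholder abstract for mut-001"]}


def b1(intron_exon: str, sex: str, fumo: str, alcool: str) -> Set[str]:
    """Use case B1 (hypothetical stub): mutations by intron/exon, sex, habits."""
    return {m["id"] for m in MUTATIONS
            if (m["intron_exon"], m["sex"], m["fumo"], m["alcool"])
            == (intron_exon, sex, fumo, alcool)}


def b4(linea_cellulare: str) -> Set[str]:
    """Use case B4 (hypothetical stub): mutations observed with a cell line."""
    return {m["id"] for m in MUTATIONS
            if m["linea_cellulare"] == linea_cellulare}


def b5(mutation_id: str) -> List[str]:
    """Use case B5 (hypothetical stub): abstracts for one mutation ID."""
    return ABSTRACTS.get(mutation_id, [])


def run_workflow(params: Dict[str, str]) -> Dict[str, List[str]]:
    info_mutazioni = b1(params["intron_exon"], params["sex"],
                        params["fumo"], params["alcool"])
    nome_mutazioni = b4(params["linea_cellulare"])
    # Merge step, assumed here to be a set intersection.
    intersezione_mutazioni = info_mutazioni & nome_mutazioni
    if not intersezione_mutazioni:  # guard [Intersezione_Mutazioni = 0]
        return {}
    # Guard [Intersezione_Mutazioni > 0]: fetch abstracts per mutation ID.
    return {mid: b5(mid) for mid in sorted(intersezione_mutazioni)}
```

Calling `run_workflow` with the parameters shown in the figure (linea_cellulare = B9, intron_exon = 7-exon, sex = M, fumo = ex-fumatore, alcool = bevitore) exercises both activities, the merge and the guarded abstract retrieval.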

Use Cases
UML definition: a use case is a set of scenarios tied together by a common user goal, where a scenario is a sequence of steps describing an interaction between a user and a system.
Some examples:
- find all possible mutations that involve a given protein X
- find all possible cell lines related to a given tumour Y
- select all possible abstracts referring to a given protein Z
- select all possible abstracts referring to cell line W
A workflow is a composition of use cases.

Use cases and Application Domains
Use cases in Cell Line domain:
- A1: Find information about the cell line named x
- A2: Find all cell lines derived from a specific tumour or pathology
- A3: Find all cell lines producing a specific protein
- A4: Given a specific cell line, find all related bibliographic references
- A5: Given a specific cell line, find all information about produced proteins
Use cases in Mutation domain:
- B1: Find all mutations observed in a specific intron/exon in subjects with specific sex and life habits (e.g. smokers/drinkers)
- B2: Find all mutations in subjects affected by a given pathology
- B3: Find all subjects affected by a tumoural pathology and with a given protein mutation
- B4: Find all mutations observed by using a given cell line
- B5: Given a specific mutation, find all abstracts of the correlated bibliographic references
Use cases in Bibliographic database domain:
- C1: Select all abstracts of bibliographic references whose text includes a given term
- C2: ..
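One way to picture the catalogue is as a small registry of use cases grouped by application domain, from which a user first picks a domain and then a use case. The sketch below is an assumption made for illustration: the `UseCase` class and the `by_domain` helper are hypothetical names, not part of the tool described in these slides, and only a few of the listed use cases are included.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class UseCase:
    code: str         # identifier used in the slides, e.g. "B1"
    domain: str       # application domain the use case belongs to
    description: str  # goal of the use case, as stated in the slides


# A subset of the use cases listed above.
CATALOGUE = [
    UseCase("A1", "Cell Line", "Find information about the cell line named x"),
    UseCase("B1", "Mutation",
            "Find all mutations observed in a specific intron/exon in "
            "subjects with specific sex and life habits"),
    UseCase("B4", "Mutation",
            "Find all mutations observed by using a given cell line"),
    UseCase("B5", "Mutation",
            "Given a specific mutation, find all abstracts of the "
            "correlated bibliographic references"),
    UseCase("C1", "Bibliographic database",
            "Select all abstracts of bibliographic references whose text "
            "includes a given term"),
]


def by_domain(catalogue: Iterable[UseCase]) -> Dict[str, List[str]]:
    """Group use case codes by application domain, keeping list order."""
    index: Dict[str, List[str]] = {}
    for use_case in catalogue:
        index.setdefault(use_case.domain, []).append(use_case.code)
    return index
```

Grouping by domain first mirrors the two-step selection (domain, then use case) that the tool's interface offers.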

Tools suitcase
Wf Management:
- Wf Editor: modelling and definition
- Wf Checker: analyzing consistency of the model
- Wf Compiler: translating model to executable code
User interface: Web-based GUI, Console
System Management:
- Middleware platform deployment tool
- Account management tool
- System's diagnosis tool
- System maintenance tool
- Traceability tool

User accounts management
This page is offered by the BioAgent platform site and gives access to the system services.
- If you already have a login and password, use Submit.
- Use the New User button to register.

The offered services
- Remove a user
- Change user information
- Add a new use case
- View the available workflows
- Get agent data
- Exit

Use Cases management
The Get Use Case button allows the user to configure a new use case:
1. Choose the application domain
2. Choose one of the available use cases

Use Cases configuration
B1: Find all mutations observed in a specific intron/exon in subjects with specific sex and life habits (e.g. smokers/drinkers)
- Input parameters
- Choose the coordination operator

The Workflow and Use cases
Global textual view of the workflow. Available actions:
- Remove the use case
- Change configuration of a use case
- Submit a single use case
- Submit the workflow (Submit Workflow button)

Data management
Get Agents Data allows the user to manage the data resulting from a workflow submission. Once the data of interest have been selected, they can be removed or viewed in XML or HTML format.

Data view
- XML data format
- XHTML data format

Future extensions to the tool
- Design and implementation of a knowledge database
- Automatic generation, from a workflow, of a multiagent system that behaves as the execution engine of a WfMS
- Development of an ontological service to support the user during use case creation

The agent-based supporting technology

From Data to Knowledge and vice versa (Merelli et al. 02)
[Figure: path between data and knowledge. Recoverable labels: Web; meta-data; ontologies (human concepts) + workflow; MAS; XML + RDF elements; information + coordination; services; data format; data access + code.]

System's software architecture (Corradini et al. 03)

The O2I system's architecture
[Figure: three-layer architecture. Layers: User Layer, System Layer, Run-Time Layer. Recoverable component labels: User Application; Workflow; Long-transaction; Short-transaction; Workflow Mng; Retrieval Service; High Level Integration Module; Low Level Integration Module; Web Services; Knowledge Base; Temporary Data Repository; remote place where tools are available.]

BioAgent System Architecture for O2I
[Figure: three-layer architecture. Layers: User Layer, System Layer, Run-Time Layer. Recoverable component labels: User Application; Workflow; Long-transaction; Workflow Mng; Workflow Executor Agents; Retrieval Service; Wrapper Agent; Kw db and Use Cases repository; Knw Mng Agent; Service; Temporary Data Repository; ADb; ULAD; ALAID; DoS. Remote data formats: EMBL, FASTA, ASN.1, GenBank, RDB, HTML, XML, TXT.]

Software Architecture (Corradini et al. 04)
[Figure: three-layer architecture.
User Layer: User Application; Workflow; Workflow Management; user-level workflow (B1, B6, B4).
System Layer: Application Agents; Service Agents; Management; agent-level workflow made of activities (A1, A2, B1, B2, C1) assigned to a set of workflow executors (WE A, WE B, WE C) and wrapper agents (WA).
Run-Time Layer: Core; resources (data, tools and services): Tool A, Tool B, Tool C.]

Workflow Management Service
- The workflow management service is a prototype that allows the definition of complex queries (use cases) by using workflow as a coordination model.
- The application uses a database to let the user configure use cases and manage the resulting data.
- The application provides a graphical interface.
- The application is a plug-in for the BioAgent platform.
[Figure: user-level workflow. Input parameters: linea_cellulare = B9, intron_exon = 7-intron, sex = M, fumo = ex-fumatore, alcool = bevitore. Activities: B1, B4, B6, B5. Data items: Info_Mutazioni, Nome_Mutazioni, Intersezione_Mutazioni, Abstract. Guards: [Intersezione_Mutazioni > 0] and [Intersezione_Mutazioni = 0].]

An example of workflow at agent level
[Figure: the user-level workflow refined into three agents, each visiting a list of places (Place: lista[indice]).
Agent A: inputs intron/exon = 7-intron, sex = M, fumo = ex-fumatore, alcool = bevitore, lista = {"www.a...", "www.b..."}; activities Cerca mutazione (search mutation) and Controlla lista (check list); output Info_Mutazioni[].
Agent B: inputs linea_cellulare = B9, lista = {"www.a...", "www.b..."}; activities Cerca mutazione and Controlla lista; output Nomi_Mutazioni[].
Agent C: inputs Intersezione_Mutazioni (from B6), lista = {"www.f...", "www.g..."}; activities Cerca Abstract (search abstract), Controlla Mutazioni (check mutations) and Controlla lista; output abstract[].
Guards: [controlla_risultato = no fine lista] (not end of list) leads to Move, i.e. the agent moves to the next place; [controlla_risultato = fine lista] (end of list) ends the loop. For Agent C, [mutazioni = altre mutazioni] (more mutations) continues and [mutazioni = mutazioni finite] (no more mutations) stops.]
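The loop that each agent follows in the diagram (search at the current place, check the list, move to the next place until the end of the list) can be sketched as follows. This is a simplified, single-process illustration and not the BioAgent implementation: `run_agent` and the `cerca` callback are hypothetical names, agent migration is reduced to indexing into the list, and it is assumed that results from every visited place are accumulated.

```python
from typing import Callable, List, Sequence


def run_agent(lista: Sequence[str],
              cerca: Callable[[str], List[str]]) -> List[str]:
    """Visit every place in `lista` and accumulate what `cerca` finds there.

    `cerca` stands in for the search activity (e.g. Cerca mutazione)
    executed at the place the agent currently occupies.
    """
    risultati: List[str] = []
    indice = 0
    # controlla_risultato = "no fine lista" while places remain to visit.
    while indice < len(lista):
        place = lista[indice]            # Place: lista[indice]
        risultati.extend(cerca(place))   # search at the current place
        indice += 1                      # Move: go to the next place
    # controlla_risultato = "fine lista": the loop ends, results are returned.
    return risultati
```

In the real system the move step would migrate the agent to a remote place; here a plain callback keeps the sketch runnable on a single machine.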

Compiler stage: executable application agents
[Figure: the Compiler takes the workflow activities as input and produces executable application agents. Recoverable labels: DoS, ALAID, Activity, Compiler.]

Wrapper-Agent: general scenario (E. Bartocci et al. 03)
[Figure: three wrapper agents (WA), each receiving XML from an AIXO adaptor. The adaptors take, respectively, a QueryString for an HTML Web page, a ProgramOption for flat files from a command-line program, and a SELECT ... FROM ... WHERE ... query for an RDBMS.]

Wrapper-based System: retrieval of articles about the P53 protein
Pipeline: Access (TEXT) -> Translation driven by an XML grammar (XML) -> Filter and Map with XSLT (XML)
Source text entry:
    ID   P53_HUMAN   STANDARD; PRT; 393 AA.
    AC   P04637; Q16848; Q9UBI2;
    DT   13-AUG-1987 (Rel. 05, Created)
Resulting XML:
    <entry>
      <ID name="p53_human" type="standard" molecule="prt" lenght="393"/>
      <AC value="P04637"/>
      <AC value="Q16848"/>
      <AC value="Q9UBI2"/>
      <DT day="13" month="AUG" year="1987" rel="05"/>
    </entry>
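The text-to-XML translation step on this slide can be illustrated with a short Python sketch. It is not the AIXO wrapper: the function name `entry_to_xml` and the regular expressions are assumptions made for this example, only the three line types shown on the slide (ID, AC, DT) are handled, and the attribute name `lenght` is kept exactly as it is spelled on the slide.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """ID   P53_HUMAN   STANDARD; PRT; 393 AA.
AC   P04637; Q16848; Q9UBI2;
DT   13-AUG-1987 (Rel. 05, Created)"""


def entry_to_xml(text: str) -> ET.Element:
    """Translate ID, AC and DT lines of a flat-file entry into an <entry>."""
    entry = ET.Element("entry")
    for line in text.splitlines():
        code, _, rest = line.partition(" ")
        rest = rest.strip()
        if code == "ID":
            m = re.match(r"(\S+)\s+(\w+);\s+(\w+);\s+(\d+) AA\.", rest)
            if m:
                # "lenght" mirrors the attribute spelling used on the slide.
                ET.SubElement(entry, "ID", name=m[1].lower(),
                              type=m[2].lower(), molecule=m[3].lower(),
                              lenght=m[4])
        elif code == "AC":
            # One <AC> element per accession number.
            for acc in (a.strip() for a in rest.split(";")):
                if acc:
                    ET.SubElement(entry, "AC", value=acc)
        elif code == "DT":
            m = re.match(r"(\d{2})-([A-Z]{3})-(\d{4}) \(Rel\. (\d+),", rest)
            if m:
                ET.SubElement(entry, "DT", day=m[1], month=m[2],
                              year=m[3], rel=m[4])
    return entry


if __name__ == "__main__":
    print(ET.tostring(entry_to_xml(SAMPLE), encoding="unicode"))
```

A grammar-driven translator, as named on the slide, would generalize this by reading the line formats from a declarative description instead of hard-coding them.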

BioAgent x O2I (Merelli et al. 02)
http://www.bioagent.net

Future Activities and Conclusions
We are:
- implementing an ontology-based wrapper agent
- developing the first prototype of the compiler, to allow the automatic generation of user agents
- enriching the set of tools supporting workflow specification
We plan to evaluate our approach of using workflow as a coordination model for modelling biological processes.
We conclude that developing an integrated platform for real applications, such as those in the Bioinformatics domain, is a very difficult task, owing to both the heterogeneity of data formats and the wide variety of tools, which evolve continuously.

Our references for this work
- E. Merelli, R. Culmone & L. Mariani 02. Bioagent: an agent based platform for bioinformatics
- E. Bartocci, L. Mariani & 03. AIXO: Any Input XML Output, a generalized wrapper, ICEIS03
- F. Corradini, L. Mariani & 03. PEGAA: A Programming Environment for Global Activity-based Applications, WOA03
- R. Amici, F. Corradini & 04. A Process Algebra View of Coordination Models with a Case Study in Computational Systems Biology
- F. Corradini, L. Mariani & 04. An agent-based approach to tool integration