Why use workflow? In recent years, an explosive amount of biological information has been obtained and deposited in various databases.


1 Workflow technology It is a generic mechanism for integrating diverse types of available resources (databases, servers, software applications and different services). It facilitates knowledge exchange between traditionally divergent fields such as molecular biology, clinical research, computational science, physics, chemistry and statistics.

2 Workflow technology Researchers can easily incorporate and access diverse, distributed tools and data to develop their own research protocols for scientific analysis.

3 Applications Applications of workflow technology have been reported in areas such as drug discovery, genomics, large-scale gene expression analysis, proteomics, and systems biology.

4 Why use workflow? In recent years, an explosive amount of biological information has been obtained and deposited in various databases. Workflow systems could become crucial in enabling scientists to deal with this data explosion. A workflow system is highly flexible and can accommodate changes or updates whenever new or modified data and the corresponding analytical tools become available.

5 How workflow works A workflow environment allows life science researchers to perform the integration themselves without any programming. A workflow system allows complex in silico experiments to be constructed in the form of workflows and data pipelines.

6 Data Pipelines Data pipelining is a relatively simple concept. Any computational component, or node, has data inputs and data outputs. Data pipelining views these nodes as connected by pipes through which data flows. In a workflow-controlled data pipeline, data is transformed as it flows: raw data is analyzed to become information, and the collected information gives rise to knowledge.
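
The pipe metaphor above can be sketched in a few lines of Python. This is only an illustrative sketch, not taken from any workflow product: the helper `pipeline` and the three node functions (`parse_fasta`, `gc_content`, `highest_gc`) are hypothetical names chosen to mirror the raw data, information, knowledge progression.

```python
from functools import reduce


def pipeline(*nodes):
    """Connect nodes so that the output of each one becomes the input of the next."""
    def run(data):
        return reduce(lambda value, node: node(value), nodes, data)
    return run


def parse_fasta(text):
    """Raw data: turn FASTA text into a {name: sequence} mapping."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def gc_content(records):
    """Information: fraction of G and C bases per non-empty sequence."""
    return {
        name: (seq.count("G") + seq.count("C")) / len(seq)
        for name, seq in records.items()
        if seq
    }


def highest_gc(stats):
    """Knowledge: name of the sequence with the highest GC content."""
    return max(stats, key=stats.get)


analyse = pipeline(parse_fasta, gc_content, highest_gc)
result = analyse(">a\nATAT\n>b\nGGCC\nAT\n")
print(result)  # b
```

Each node only knows about its own input and output, so nodes can be reordered or replaced without touching the others.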

7 Workflow description A workflow system (Hollingsworth, 1995) is a holistic unit that defines, manages, and executes workflow processes with the aid of software. The order of execution is defined by a computer representation of the workflow process logic.
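
One common way to derive an execution order from a machine-readable description of the process logic is a topological sort of the step dependencies. The sketch below assumes Python 3.9 or later (for the standard-library `graphlib` module); the step names in `process_logic` are hypothetical and the slides do not say which ordering algorithm any particular workflow system uses.

```python
from graphlib import TopologicalSorter

# Hypothetical process logic: each step maps to the set of steps it depends on.
process_logic = {
    "fetch_sequences": set(),
    "align": {"fetch_sequences"},
    "build_tree": {"align"},
    "annotate": {"fetch_sequences"},
    "report": {"build_tree", "annotate"},
}


def execution_order(logic):
    """Return one valid order in which every step runs after its dependencies."""
    return list(TopologicalSorter(logic).static_order())


order = execution_order(process_logic)
print(order)
```

Several orders can be valid for the same logic (for example, `annotate` may run before or after `align`), which is what leaves a workflow engine free to schedule independent steps in parallel.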

8 Workflow description Internally, a workflow system uses a workflow language or meta-language for process specification (Michael and Jörg, 1999) to define the workflow process logic, which is executed by a workflow execution engine or workflow controller.

9 Workflow description (2) The workflow process logic is generally represented visually in a Graphical User Interface (GUI), where different types of nodes (data transformation points) or software components can be connected through edges, or pipes, that define the workflow process.

10 Workflow component The anatomy of a workflow node or component is basically defined by three parameters: input metadata; transformation rules, algorithms or user parameters; and output metadata.

11 Anatomy of a workflow node The component properties are best described by the input metadata, the output metadata and the user-defined parameters or transformation rules. The input ports can be constrained to accept only data of a specific type, such as that provided by another component.
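
The three parts of a node can be modelled roughly as follows. This is a minimal sketch under stated assumptions: the class `WorkflowNode`, its field names and the example `filter_by_length` transformation are all hypothetical, and Python types stand in for the richer metadata a real system would carry.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class WorkflowNode:
    name: str
    input_types: Dict[str, type]    # input metadata: port name -> accepted type
    output_types: Dict[str, type]   # output metadata: port name -> produced type
    transform: Callable[..., Dict[str, Any]]  # transformation rule or algorithm
    parameters: Dict[str, Any] = field(default_factory=dict)  # user parameters

    def run(self, **inputs):
        # Input ports only accept data of the declared type.
        for port, expected in self.input_types.items():
            if port not in inputs:
                raise ValueError(f"{self.name}: missing input '{port}'")
            if not isinstance(inputs[port], expected):
                raise TypeError(f"{self.name}: input '{port}' must be {expected.__name__}")
        outputs = self.transform(**inputs, **self.parameters)
        # Outputs are checked against the declared output metadata.
        for port, expected in self.output_types.items():
            if not isinstance(outputs.get(port), expected):
                raise TypeError(f"{self.name}: output '{port}' must be {expected.__name__}")
        return outputs


def filter_by_length(sequences, min_length):
    """Keep only sequences with at least min_length residues."""
    return {"sequences": [s for s in sequences if len(s) >= min_length]}


length_filter = WorkflowNode(
    name="length_filter",
    input_types={"sequences": list},
    output_types={"sequences": list},
    transform=filter_by_length,
    parameters={"min_length": 4},
)

kept = length_filter.run(sequences=["ATG", "ATGCC"])
print(kept)  # {'sequences': ['ATGCC']}
```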

12 Node linking Nodes can be plugged together only if the output of the preceding node (or set of nodes) satisfies the mandatory input requirements of the following node. Thus, the essential description of a node comprises only its inputs and outputs, described fully in terms of data types and their semantics.
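
The linking rule can be expressed as a simple compatibility check. The sketch below is an assumption-laden illustration: `NodeSpec`, `can_link` and the string labels used as semantic data types (such as "protein_sequence") are hypothetical, and matching ports by name is just one possible convention.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeSpec:
    name: str
    inputs: Dict[str, str]    # mandatory input port -> semantic data type
    outputs: Dict[str, str]   # output port -> semantic data type


def can_link(upstream: List[NodeSpec], downstream: NodeSpec) -> bool:
    """True if the combined upstream outputs cover every mandatory downstream input."""
    available = {}
    for node in upstream:
        available.update(node.outputs)
    return all(
        available.get(port) == data_type
        for port, data_type in downstream.inputs.items()
    )


fetch_protein = NodeSpec("fetch_protein", {}, {"query": "protein_sequence"})
fetch_dna = NodeSpec("fetch_dna", {}, {"query": "dna_sequence"})
protein_search = NodeSpec(
    "protein_search", {"query": "protein_sequence"}, {"hits": "alignment_hits"}
)

print(can_link([fetch_protein], protein_search))  # True
print(can_link([fetch_dna], protein_search))      # False
```

A graphical editor could run such a check whenever the user draws an edge, refusing connections whose data types do not match.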

13 A general workflow-based framework for the life sciences domain

14 A general workflow-based framework for the life sciences domain A general workflow framework has three layers: the clients layer, the component and enactment layer, and the database layer.

15 Clients layer The client has a Graphical User Interface for creating workflows, along with a process definition service. The user can create workflows from any combination of the tools, services or databases available in the workflow system by dragging, dropping and linking graphical icons. Process definition services use meta-languages for workflow and process modeling.

16 Component and enactment layer This layer has component services and enactment services, and it can interact with the database layer. Component services provide different software components, tools and services (grid and web services). An enactment service consists of one or more workflow engines that manage and execute workflow instances. The workflow engine or workflow controller is responsible for part or all of the runtime control environment within an enactment service.

17 Database layer The database layer consists of a diverse range of data sources (remote as well as local).
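
The division of labour between the three layers can be illustrated with a toy example. Everything here is a hypothetical stand-in: `DATABASE` plays the database layer, `COMPONENTS` the component services, `definition` the process definition a client GUI might produce, and `enact` a very simple sequential workflow engine.

```python
# Database layer: a stand-in for local or remote data sources.
DATABASE = {"P1": "MKTAYIAKQR", "P2": "MKV"}

# Component services: each component takes the incoming data plus its parameters.
COMPONENTS = {
    "load": lambda data, ids: [DATABASE[i] for i in ids],
    "lengths": lambda data: [len(seq) for seq in data],
    "longest": lambda data: max(data),
}

# Clients layer: the process definition, as a GUI might save it.
definition = [
    {"component": "load", "params": {"ids": ["P1", "P2"]}},
    {"component": "lengths", "params": {}},
    {"component": "longest", "params": {}},
]


def enact(definition, data=None):
    """Enactment service: run each step in order, passing data along."""
    for step in definition:
        component = COMPONENTS[step["component"]]
        data = component(data, **step["params"])
    return data


longest = enact(definition)
print(longest)  # 10
```

Because the definition only names components and parameters, the client never needs to know how a component is implemented or where the data is stored.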

18 Commercial Workflow Systems SciTegic's Pipeline Pilot: chemistry (compound library acquisition, combinatorial library design, molecular property calculators, filters, and manipulators), ADME/Tox, Decision Trees, Modeling, R Statistics, Reporting, Sequence Analysis, BioMining, Text Analytics and the Integration Collection (flexible mechanisms to link external applications and databases).

19 Commercial Workflow Systems InforSense KDE environment and its specialized extensions. BioSense: high-performance bioinformatics solutions ranging from sequence analysis to microarray informatics and remote database annotation. ChemSense: a wide range of chemoinformatics solutions, from the analysis and visualization of chemical libraries to the development of combinatorial chemistry libraries, including a wide range of QSAR, ADME-Tox prediction, molecular modeling and evaluation methods.

20 Commercial Workflow Systems InforSense KDE and SciTegic's Pipeline Pilot are state-of-the-art workflow systems that support faster and more efficient research in the life sciences domain. Because of the high costs involved, they are still not accessible to academics, who are major contributors to scientific research. Other commercial workflow systems worth mentioning are INCOGEN VIBE, BioLog's BioLib and Science Factory's übertool.

21 Public Domain Workflow Systems There are dozens of open-source workflow systems in the life sciences domain, many of which were initially developed as specific and often small-scale projects.

22 Public Domain Workflow Systems Open-source workflow systems are a major advantage for academia, not just because they are free of license charges, but also because they are based on community models of development in which people from diverse backgrounds actively contribute to the application.

23 Public Domain Workflow Systems KNIME (University of Konstanz): it is based on the Eclipse platform and provides an excellent data mining platform for drug discovery informatics, bioinformatics and chemistry research. Taverna: it addresses problems beyond the capabilities of the present system to improve many areas, including the data-flow-centric model, scalability and data streaming.

24 Public Domain Workflow Systems myGrid: it is developing semantically enabled grid middleware to support bioinformatics and drug discovery applications, and is regarded as the most powerful workflow system in the public domain. Application tools built on myGrid cover fields such as Genome and Proteome Annotation (e-protein), Integrative Systems Biology (myib), Integration of Biological Data (PlaNet, EMBRACE), Proteomics (ISPIDER), Medical and Health Informatics (PsyGrid, MIAS-Grid), Chemoinformatics (CDK-Taverna, Chimatica) and text mining (Epitheliome).

25 Taverna The Taverna project aims to provide a language and software tools that facilitate easy use of workflow and distributed compute technology within the e-Science community. Taverna is freely available for download under the terms of the GNU Lesser General Public License (LGPL). Taverna is written in Java 5 and runs on a selection of platforms.

26 Taverna Taverna allows a biologist or bioinformatician with a limited computing background and limited technical resources and support to construct highly complex analyses over public and private data and computational resources, all from a standard PC, UNIX box or Apple computer.

27 Taverna The screenshot below shows the workbench in action running an example workflow. 27

28 Workflow example 28

29 Specialized Software for Integration with Life Sciences Workflow Systems In the last few years, a large number of tools and software packages addressing different computational problems in the life sciences have been developed. Incorporating third-party or new tools into existing frameworks requires a flexible, modular and customizable workflow framework. A plug-in based architecture (like Eclipse) is a better option for an ideal workflow system. The workflow execution engine or controller communicates with the integrated component plug-ins using a workflow language or XML.
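
A plug-in architecture driven by an XML description might look like the following sketch. The XML vocabulary (`workflow`, `step`, the `plugin` attribute), the `plugin` decorator and both example plug-ins are hypothetical; they do not reproduce the workflow language of Taverna, KNIME or any other system named in these slides.

```python
import xml.etree.ElementTree as ET

PLUGINS = {}


def plugin(name):
    """Register a function as a plug-in under the given name."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("reverse_complement")
def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


@plugin("truncate")
def truncate(seq, length="10"):
    # XML attributes arrive as strings, so the plug-in converts them itself.
    return seq[: int(length)]


WORKFLOW_XML = """
<workflow>
  <step plugin="reverse_complement"/>
  <step plugin="truncate" length="3"/>
</workflow>
"""


def run_workflow(xml_text, data):
    """Controller: read the XML and call each named plug-in in document order."""
    root = ET.fromstring(xml_text)
    for step in root.findall("step"):
        attrs = dict(step.attrib)
        func = PLUGINS[attrs.pop("plugin")]
        data = func(data, **attrs)
    return data


output = run_workflow(WORKFLOW_XML, "AAACCG")
print(output)  # CGG
```

New tools can be added by registering another function; the controller and the XML format stay unchanged, which is the flexibility the slide argues for.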

31 Desired Traits and Future Trends Workflow systems can be data-intensive, computation-intensive, analysis-intensive, visualization-intensive (e.g., visualization pipeline systems such as AVS, OpenDX, and SCIRun), process-intensive, or a combination of these traits. Workflow systems share many desired traits (Ludäscher et al., 2005), such as seamless access to resources and services, service composition and reuse, scalability, detached execution, reliability and fault tolerance, semantic binding, process provenance and data provenance.

32 Desired Traits and Future Trends Present workflow systems in the life sciences need to integrate several resources, such as semantic web technology (Shadbolt et al., 2006), grid services and web services. Web and grid services provide access to distributed resources, while workflow techniques enable the integration of these resources to perform in silico experiments.

33 Conclusion We described a workflow-based framework to speed up life sciences research. Large-scale data analysis needs flexible, workflow-based integration of software tools and applications from diverse domains, which can provide in silico experimental design through visual programming and execution on grids.

34 Conclusion Workflow systems in the life sciences are still evolving, and the final goal is a distributed and ubiquitous environment that can integrate web services, the semantic web, grids, domain-specific tools and ontologies.


More information

AUTOMATIC WORKFLOW PROCESS AND DATA ANNOTATION: DESIGN AND EXECUTION ENVIRONMENT AMRITA BASU. (Under the Direction of Krzysztof J.

AUTOMATIC WORKFLOW PROCESS AND DATA ANNOTATION: DESIGN AND EXECUTION ENVIRONMENT AMRITA BASU. (Under the Direction of Krzysztof J. AUTOMATIC WORKFLOW PROCESS AND DATA ANNOTATION: DESIGN AND EXECUTION ENVIRONMENT by AMRITA BASU (Under the Direction of Krzysztof J. Kochut) ABSTRACT This thesis presents an ontology-based approach to

More information

Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN):

Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN): Enabling Collaboration Using the Biomedical Informatics Research Network (BIRN): Karl Helmer Ph.D. Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital June 4, 2010 BIRN

More information

Peer to Peer Search Engine and Collaboration Platform Based on JXTA Protocol

Peer to Peer Search Engine and Collaboration Platform Based on JXTA Protocol Peer to Peer Search Engine and Collaboration Platform Based on JXTA Protocol Andraž Jere, Marko Meža, Boštjan Marušič, Štefan Dobravec, Tomaž Finkšt, Jurij F. Tasič Faculty of Electrical Engineering Tržaška

More information

SOFTWARE TESTING TRAINING COURSES CONTENTS

SOFTWARE TESTING TRAINING COURSES CONTENTS SOFTWARE TESTING TRAINING COURSES CONTENTS 1 Unit I Description Objectves Duration Contents Software Testing Fundamentals and Best Practices This training course will give basic understanding on software

More information

Enterprise Application Integration: Integration and Utilization of SAS Products

Enterprise Application Integration: Integration and Utilization of SAS Products Paper 237-25 Enterprise Application Integration: Integration and Utilization of Products Matthew K. Hettinger, Mathet Consulting, Schaumburg, IL ABSTRACT The enterprise today requires many applications

More information

Client/Server Computing Distributed Processing, Client/Server, and Clusters

Client/Server Computing Distributed Processing, Client/Server, and Clusters Client/Server Computing Distributed Processing, Client/Server, and Clusters Chapter 13 Client machines are generally single-user PCs or workstations that provide a highly userfriendly interface to the

More information

Design of a Scientic Workow for the Analysis of Microarray experiments with Taverna and R

Design of a Scientic Workow for the Analysis of Microarray experiments with Taverna and R Design of a Scientic Workow for the Analysis of Microarray experiments with Taverna and R Marcus Ertelt Proposal for a diploma thesis December 2006 - May 2007 referees: Prof. Dr. Ulf Leser, PD Dr. Wolfgang

More information

Scientific versus Business Workflows

Scientific versus Business Workflows 2 Scientific versus Business Workflows Roger Barga and Dennis Gannon The formal concept of a workflow has existed in the business world for a long time. An entire industry of tools and technology devoted

More information

Dr Alexander Henzing

Dr Alexander Henzing Horizon 2020 Health, Demographic Change & Wellbeing EU funding, research and collaboration opportunities for 2016/17 Innovate UK funding opportunities in omics, bridging health and life sciences Dr Alexander

More information

QsarDB first 100 DOIs for predictive models

QsarDB first 100 DOIs for predictive models QsarDB first 100 DOIs for predictive models Uko Maran Institute of chemistry, University of Tartu, Estonia LOD: Content Data Predictive (and descriptive) models? Goal Components Persistent digital identifiers

More information

2012 LABVANTAGE Solutions, Inc. All Rights Reserved.

2012 LABVANTAGE Solutions, Inc. All Rights Reserved. LABVANTAGE Architecture 2012 LABVANTAGE Solutions, Inc. All Rights Reserved. DOCUMENT PURPOSE AND SCOPE This document provides an overview of the LABVANTAGE hardware and software architecture. It is written

More information

Unified Batch & Stream Processing Platform

Unified Batch & Stream Processing Platform Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built

More information

Technical. Overview. ~ a ~ irods version 4.x

Technical. Overview. ~ a ~ irods version 4.x Technical Overview ~ a ~ irods version 4.x The integrated Ru e-oriented DATA System irods is open-source, data management software that lets users: access, manage, and share data across any type or number

More information

The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project

The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project Alastair Duncan STFC Pre Coffee talk STFC July 2014 SCAPE Scalable Preservation Environments The

More information

Bioinformatics Grid - Enabled Tools For Biologists.

Bioinformatics Grid - Enabled Tools For Biologists. Bioinformatics Grid - Enabled Tools For Biologists. What is Grid-Enabled Tools (GET)? As number of data from the genomics and proteomics experiment increases. Problems arise for the current sequence analysis

More information

THE CCLRC DATA PORTAL

THE CCLRC DATA PORTAL THE CCLRC DATA PORTAL Glen Drinkwater, Shoaib Sufi CCLRC Daresbury Laboratory, Daresbury, Warrington, Cheshire, WA4 4AD, UK. E-mail: g.j.drinkwater@dl.ac.uk, s.a.sufi@dl.ac.uk Abstract: The project aims

More information

Big Data Executive Survey

Big Data Executive Survey Big Data Executive Full Questionnaire Big Date Executive Full Questionnaire Appendix B Questionnaire Welcome The survey has been designed to provide a benchmark for enterprises seeking to understand the

More information

Accelerate > Converged Storage Infrastructure. DDN Case Study. ddn.com. 2013 DataDirect Networks. All Rights Reserved

Accelerate > Converged Storage Infrastructure. DDN Case Study. ddn.com. 2013 DataDirect Networks. All Rights Reserved DDN Case Study Accelerate > Converged Storage Infrastructure 2013 DataDirect Networks. All Rights Reserved The University of Florida s (ICBR) offers access to cutting-edge technologies designed to enable

More information

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India

3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India 3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2016) March 10-11, 2016 VIT University, Chennai, India Call for Papers Cloud computing has emerged as a de facto computing

More information

GenomeSpace Architecture

GenomeSpace Architecture GenomeSpace Architecture The primary services, or components, are shown in Figure 1, the high level GenomeSpace architecture. These include (1) an Authorization and Authentication service, (2) an analysis

More information

How To Develop An Open Play Context Framework For Android (For Android)

How To Develop An Open Play Context Framework For Android (For Android) Dynamix: An Open Plug-and-Play Context Framework for Android Darren Carlson and Andreas Schrader Ambient Computing Group / Institute of Telematics University of Lübeck, Germany www.ambient.uni-luebeck.de

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

Manjrasoft Market Oriented Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform Manjrasoft Market Oriented Cloud Computing Platform Aneka Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.

More information

Looking into the Future of Workflows: The Challenges Ahead

Looking into the Future of Workflows: The Challenges Ahead Looking into the Future of Workflows: The Challenges Ahead Ewa Deelman Contributors: Bruce Berriman, Thomas Fahringer, Dennis Gannon, Carole Goble, Andrew Jones, Miron Livny, Philip Maechling, Steven McGough,

More information

PROGRESS Portal Access Whitepaper

PROGRESS Portal Access Whitepaper PROGRESS Portal Access Whitepaper Maciej Bogdanski, Michał Kosiedowski, Cezary Mazurek, Marzena Rabiega, Malgorzata Wolniewicz Poznan Supercomputing and Networking Center April 15, 2004 1 Introduction

More information

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Intro to Data Management Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Why Data Management? Digital research, above all, creates files Lots of files Without a plan,

More information

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community

Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community Cloud BioLinux: Pre-configured and On-demand Bioinformatics Computing for the Genomics Community Ntinos Krampis Asst. Professor J. Craig Venter Institute kkrampis@jcvi.org http://www.jcvi.org/cms/about/bios/kkrampis/

More information

Objectives. Chapter 2: Operating-System Structures. Operating System Services (Cont.) Operating System Services. Operating System Services (Cont.

Objectives. Chapter 2: Operating-System Structures. Operating System Services (Cont.) Operating System Services. Operating System Services (Cont. Objectives To describe the services an operating system provides to users, processes, and other systems To discuss the various ways of structuring an operating system Chapter 2: Operating-System Structures

More information

Simon Miles King s College London Architecture Tutorial

Simon Miles King s College London Architecture Tutorial Common provenance questions across e-science experiments Simon Miles King s College London Outline Gathering Provenance Use Cases Our View of Provenance Sample Case Studies Generalised Questions about

More information

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences

Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences Euro-BioImaging European Research Infrastructure for Imaging Technologies in Biological and Biomedical Sciences WP11 Data Storage and Analysis Task 11.1 Coordination Deliverable 11.2 Community Needs of

More information

Obfuscated Biology -MSc Dissertation Proposal- Pasupula Phaninder University of Edinburgh S1031624@sms.ed.ac.uk March 31, 2011

Obfuscated Biology -MSc Dissertation Proposal- Pasupula Phaninder University of Edinburgh S1031624@sms.ed.ac.uk March 31, 2011 Obfuscated Biology -MSc Dissertation Proposal- Pasupula Phaninder University of Edinburgh S1031624@sms.ed.ac.uk March 31, 2011 1 Introduction In this project, I aim to introduce the technique of obfuscation

More information

Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee

Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee Technology in Pedagogy, No. 8, April 2012 Written by Kiruthika Ragupathi (kiruthika@nus.edu.sg) Computational thinking is an emerging

More information

Data Grids. Lidan Wang April 5, 2007

Data Grids. Lidan Wang April 5, 2007 Data Grids Lidan Wang April 5, 2007 Outline Data-intensive applications Challenges in data access, integration and management in Grid setting Grid services for these data-intensive application Architectural

More information

Software review. Pise: Software for building bioinformatics webs

Software review. Pise: Software for building bioinformatics webs Pise: Software for building bioinformatics webs Keywords: bioinformatics web, Perl, sequence analysis, interface builder Abstract Pise is interface construction software for bioinformatics applications

More information

Oracle Service Bus Examples and Tutorials

Oracle Service Bus Examples and Tutorials March 2011 Contents 1 Oracle Service Bus Examples... 2 2 Introduction to the Oracle Service Bus Tutorials... 5 3 Getting Started with the Oracle Service Bus Tutorials... 12 4 Tutorial 1. Routing a Loan

More information

Computational Science and Informatics (Data Science) Programs at GMU

Computational Science and Informatics (Data Science) Programs at GMU Computational Science and Informatics (Data Science) Programs at GMU Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ Outline Graduate Program

More information

Paradigm Changes Affecting the Practice of Scientific Communication in the Life Sciences

Paradigm Changes Affecting the Practice of Scientific Communication in the Life Sciences Paradigm Changes Affecting the Practice of Scientific Communication in the Life Sciences Prof. Dr. Martin Hofmann-Apitius Head of the Department of Bioinformatics Fraunhofer Institute for Algorithms and

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

The requirements of recording and using provenance in e-science experiments

The requirements of recording and using provenance in e-science experiments The requirements of recording and using provenance in e-science experiments Simon Miles, Paul Groth, Miguel Branco and Luc Moreau School of Electronics and Computer Science University of Southampton Southampton,

More information